A guide to machine learning mastery, with strategies and roadmaps
Recently, a few friends asked me via Twitter direct messages how to become a machine learning engineer. I am not a top expert like Andrej Karpathy, but as a mid-level machine learning engineer who worked hard to get here, and whose transition from amateur to professional happened not long ago, I have some experience to share.
If you’re thinking of taking this path, I hope to provide you with a practical roadmap and useful resources to help you transition or build a foundation for your career. I hope my advice will inspire you.
Before blindly following any roadmap, let’s answer a question: What is a machine learning engineer?
There has always been confusion about job titles in machine learning and related fields. For the purpose of this guide, we define machine learning engineers as professionals who work within an organization and apply machine learning to solve business problems. Their goal is to create or improve products, or to make the organization's operations more efficient. This is different from machine learning researchers, who focus primarily on developing new methods and conducting scientific research rather than on solving immediate business needs.
Although there is a significant overlap between the two roles, since I work as a machine learning engineer, I will mainly discuss this role here.
Essential Skills for Machine Learning Engineers
Machine learning engineering is an interdisciplinary profession that requires you to master skills from different fields, mainly software engineering, data science and mathematics, as well as knowledge in some specific application areas.
Software Engineering
Machine learning engineers must not only be able to "program", but also be excellent software engineers for the following reasons:
- First, the core of machine learning is discovering patterns in data, which requires machine learning engineers to process large amounts of data, often far more than a human could handle by hand.
- Second, the effectiveness of machine learning engineers is measured by business impact. They therefore need to be able to deploy models and integrate them into the overall product. If they cannot provide users with a practical, usable service, their work has failed.
- Finally, a deep understanding of how computers work and the ability to build custom tools can greatly improve development efficiency. Intuition and domain knowledge help when designing models, but building machine learning models is essentially a process of trial and error. Machine learning engineers need to try multiple educated guesses and explore which approach works best. The faster this process is and the more iterations you can run, the better the final results will be. Solid software engineering skills help you automate the work, speed up iterations, and make each experiment more efficient and effective.
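As a rough illustration of what automating the trial-and-error loop can look like, the sketch below sweeps over a handful of settings, scores each one on held-out data, and ranks the results. Everything in it is hypothetical and not taken from any particular project: the synthetic data, the `fit` and `evaluate` helpers, and the choice of an L2 penalty as the setting to tune are stand-ins for whatever model and configuration you are actually exploring.

```python
import random


def make_data(n, seed):
    """Synthetic 1-D regression data: y = 3x + noise (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def fit(xs, ys, l2):
    """Closed-form ridge regression for a single weight, without intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + l2)


def evaluate(w, xs, ys):
    """Mean squared error of the one-weight model on a dataset."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def run_sweep(l2_values):
    """Train one model per setting and return the results sorted best-first."""
    train_x, train_y = make_data(200, seed=0)
    val_x, val_y = make_data(100, seed=1)
    results = []
    for l2 in l2_values:
        w = fit(train_x, train_y, l2)
        results.append({"l2": l2, "weight": w, "val_mse": evaluate(w, val_x, val_y)})
    return sorted(results, key=lambda r: r["val_mse"])


results = run_sweep([0.0, 0.1, 1.0, 10.0, 100.0])
for row in results:
    print(f"l2={row['l2']:<6} weight={row['weight']:.3f} val_mse={row['val_mse']:.4f}")
```

The point is less the model than the shape of the loop: once every experiment is a function call that returns a comparable score, adding another guess costs one more entry in a list.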
By the way, I’m not the only one making this point. Let’s hear from a more established expert, Greg Brockman, co-founder of OpenAI:
With a few exceptions, the most influential people in the field of artificial intelligence tend to be experts who are proficient in both software and machine learning. Although most people may think the opposite, it is usually much faster to learn machine learning than to learn software engineering. Therefore, excellent software engineers often have great development potential in the field of artificial intelligence.
— Greg Brockman (@gdb)
August 19, 2023
Ultimately, machine learning is a branch of computer science, and software engineering is the key way to transform computer science theory into effective applications.
Data Science
The core of machine learning is discovering patterns in data, so machine learning engineers must be proficient in working with data. They need to handle messy, complex real-world data, know how to collect and understand it, extract useful features, and judge whether a model's output is plausible.
In practice, the hardest problems are often not the obvious technical failures (such as out-of-memory errors) but hidden logical errors. For example, training completes without complaint and the outputs look correct, yet the model is off in some subtle but important way. Experienced data scientists know that the most effective way to build an excellent model is to dig deep into the data itself.
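To make "hidden logical errors" a little more concrete, here is a minimal sketch of two checks that catch problems no stack trace will report: test examples that leaked into the training set, and a skewed label distribution. The toy dataset and the helper names `find_leaked_rows` and `label_shares` are made up for illustration; real checks depend on your data.

```python
from collections import Counter


def find_leaked_rows(train_rows, test_rows):
    """Return test rows that also appear verbatim in the training set."""
    seen = set(train_rows)
    return [row for row in test_rows if row in seen]


def label_shares(labels):
    """Return the fraction of examples per label, to spot class imbalance."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical toy dataset of (text, label) pairs.
train = [("great product", 1), ("terrible", 0), ("love it", 1), ("not good", 0)]
test = [("love it", 1), ("awful", 0)]

leaked = find_leaked_rows(train, test)
shares = label_shares([label for _, label in train])
print("leaked rows:", leaked)
print("label shares:", shares)
```

A model evaluated on this test set would look better than it is, because one of the two test rows was seen during training. Training would still finish without any error.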
In addition, machine learning engineers also need to have solid research capabilities. They must be able to quickly find academic literature related to current problems and be able to reproduce these research results and apply them to their own fields.
Mathematics and Statistics
It is not easy to measure the level of mathematical skills required by machine learning engineers. Although they may not frequently apply complex mathematical knowledge directly in their daily work, a solid mathematical foundation is essential for understanding data characteristics and algorithm principles. Therefore, mathematical ability is actually a necessary skill for machine learning engineers.
So, what specific mathematical knowledge is needed? Generally speaking, machine learning engineers need to master the basics of calculus, linear algebra, and probability theory. This knowledge helps them understand how optimization algorithms work, how to implement them, and what model outputs mean. When dealing with large-scale models or massive data sets, knowledge of numerical methods and optimization theory also comes in handy. In addition, statistics is indispensable for understanding the characteristics of your data.
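As a small example of where calculus meets implementation, the sketch below derives a gradient by hand, checks it against a finite-difference approximation, and then uses it for gradient descent. The loss function and all names are chosen purely for illustration; real models have many parameters, but the idea is the same.

```python
def loss(w):
    """A simple convex loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def analytic_grad(w):
    """Derivative obtained by hand: d/dw (w - 3)^2 = 2 (w - 3)."""
    return 2.0 * (w - 3.0)


def numeric_grad(f, w, eps=1e-6):
    """Central finite-difference approximation of the derivative."""
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)


def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly take a small step against the gradient."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w


w_start = 0.0
grad_gap = abs(analytic_grad(w_start) - numeric_grad(loss, w_start))
w_final = gradient_descent(analytic_grad, w_start)
print(f"gradient mismatch: {grad_gap:.2e}")
print(f"w after descent:   {w_final:.6f}")
```

Comparing a hand-derived gradient with a numerical one is a common sanity check: if the two disagree, the bug is in the math or in the code, and you find out before any training run.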
Application Areas
Although machine learning can be viewed as a general toolbox, machine learning engineers often achieve better results if they have a deep understanding of specific application areas. This includes two aspects: first, they need to fully understand the specific purpose of the project, the target user group, and the available data resources. Second, they also need to accumulate expertise in processing specific types of data and choosing appropriate models.
For example, when processing text data, they may focus on language models; when processing image data, convolutional neural networks (CNNs) may be the first choice; and when analyzing time series data, recurrent neural networks (RNNs) may be more applicable. Different data types and application scenarios often require different expertise and techniques.
Two major paths to becoming a machine learning engineer
Generally speaking, there are two main paths to becoming a machine learning engineer:
1. Data Science Path: First master math and data processing skills, then start applying machine learning, and finally learn the necessary software engineering skills.
2. Software Engineering Path: Become an excellent software engineer first, and then gradually learn mathematics, data processing, and machine learning skills.
For self-learners, I recommend the second path. The reason is that even while your data processing and machine learning skills are still in their infancy, you can already create value for a company. Many business problems are not complicated, and a simple but deployed model can produce practical benefits. In contrast, a very good model that only exists in a Jupyter notebook is at best an interesting toy. Of course, this does not mean that you can postpone learning mathematics indefinitely. Keep improving continuously so that you do not stagnate.
If you are studying a quantitative major (such as mathematics or statistics) in college, you may naturally follow the first path. In this case, I recommend that you spend some time learning software engineering during your studies or after graduation.
Ideally, if conditions allow, the best option is to take both paths at the same time: major in computer science with a focus on machine learning, while also learning industry-level collaborative development skills through internships. This approach will give you well-rounded skills and a solid foundation for your future career.
Practical learning methods
Below I recommend a series of structured courses to help you embark on the career path of a machine learning engineer. These recommendations are meant to give you an overview of the relevant skills rather than a strict curriculum. You can always adjust the plan according to your interests and needs, use your favorite resources, or learn skills directly through real projects. After all, you know your learning style best. What matters is mastering the core content of the roadmap, not rigidly following a specific learning method.
Since I promised you an actionable roadmap and concrete resources, let’s get started!
Learn the basics of programming
No matter which path you choose, the first step to becoming a machine learning engineer is to learn programming and computer basics. Considering that the machine learning and data science ecosystem is most mature in Python and there are a lot of learning resources, choosing Python as the language to start with is a good choice.
Here are some recommended learning resources:
1. CS50 course from Harvard University: This is an excellent introductory course in programming and software engineering that covers the basics of Python.
2. Programming Basics by the University of Helsinki: If you want to learn Python in more depth, this course is a good choice. If you have already completed CS50, you can skip the first few chapters.
3. Dead Simple Python: Although you don’t need a deep understanding of Python’s inner workings to use it for data science and machine learning, this knowledge will be very helpful later on. I suggest you keep this book by your bed and read a chapter every night before going to sleep.
Remember, programming skills are the cornerstone of your future career development, and it is well worth taking the time to lay a solid foundation.
Learn basic machine learning
Now that you have mastered the basics of programming, it's time to start learning machine learning. I recommend that you start with classical (shallow) machine learning algorithms. These algorithms are more intuitive than neural networks and help you develop data processing skills without too much added complexity.
Recommended resource: Andrew Ng’s Machine Learning Specialization. This course has been an important entry point into the field of artificial intelligence for many people. The content is comprehensive and easy to understand.
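To give a sense of how approachable classical algorithms can be, here is a minimal sketch of a k-nearest-neighbors classifier in plain Python. It is an illustrative example chosen for this guide, not material from any of the recommended courses, and the data points and labels are made up.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Two hypothetical clusters of 2-D points, labeled "a" and "b".
points = [(0.0, 0.0), (0.5, 1.0), (1.0, 0.5), (5.0, 5.0), (5.5, 4.5), (4.5, 5.5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (0.8, 0.8)))
print(knn_predict(points, labels, (5.0, 4.8)))
```

The whole algorithm fits in one function, which makes it easy to reason about how the data, the distance measure, and the choice of `k` affect the prediction.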
Deep Learning
After mastering the basics of machine learning, you can move on to deep learning. Deep learning is the mainstream technology in the current industry and is also a very powerful toolbox. Here are some recommended learning resources:
1. If you like Andrew Ng’s teaching style, you can continue with his Deep Learning Specialization.
2. For those who prefer university-style courses, I recommend Yann LeCun’s Deep Learning Lectures at New York University.
3. If you prefer a more practical approach, you can try fast.ai's course and accompanying book "Practical Deep Learning for Coders".
These resources also cover some of the necessary math. If you find that your math foundation is not strong enough, consider taking the Mathematics for Machine Learning and Data Science specialization offered by deeplearning.ai.
Build domain expertise
Once you have mastered the basics of deep learning, the next step is to choose a specific area to go deep in. If you are still unsure of your interests, you can try the courses provided by Hugging Face. Although these courses are not comprehensive, they give you the foundation, background, and vocabulary you need to read research papers and come up with project ideas.
Remember, whether it is software development, programming, or machine learning, theoretical knowledge is important, but engineering practice is more important. You learn and improve through hands-on work. You have probably completed many course exercises and small projects along the way. Now it is time to take on harder projects. Start freely exploring your areas of interest and grow from a novice into an expert by building a personal portfolio.
As Andrej Karpathy puts it:
How to become an expert in a field:
1. Take on specific projects step by step and complete them in depth, adopting a "learn as you go" approach (i.e. don't learn broadly from the bottom up)
2. Teach or summarize everything you learned in your own words
3. Compare yourself only to your past self, not to others
— Andrej Karpathy (@karpathy)
November 7, 2020
Generally speaking, one or two impressive, well-architected, and innovative large projects are more valuable than many basic ones. Not only will you learn more, but your resume will also stand out. To be a strong job candidate, it is important to make these projects concrete and visible. You can share what you learned by writing a blog or tweeting. But the best way to show your strength is to build a front-end interface for your project so that others can try your work firsthand.
Study Software Engineering
Next, let's talk about software engineering. The Fullstackopen course is a great starting point for learning web development and distributed systems. While it doesn't cover machine learning, it covers many tools and practices that are very valuable to machine learning engineers, such as architectural design for distributed systems, database management, and containerization. This knowledge is invaluable for deploying your models and providing interfaces to users. The course uses JavaScript because that's the main language for web development. While it may seem difficult at first, you've come a long way and it's worth biting the bullet and adding another language to your toolkit now.
Learn MLOps
Additionally, there are software engineering and development practices (MLOps) that are specific to machine learning. To learn how to manage and design machine learning products throughout their lifecycle, fullstackdeeplearning is a great resource to help you get a comprehensive understanding of these practices. Pick the ones that will make your life as a machine learning engineer easier and apply them to your projects. The effort will be well worth it.
That’s it. If you follow this guide, I believe you can become a competitive entry-level machine learning engineer. By studying the above material, you will have the necessary theoretical knowledge, and your project experience will make you an expert in some key areas.
However, skills alone are not enough to get a job. You also need to demonstrate and communicate them. You can do this through internships, strong recommendations or referrals, and a portfolio. Quincy Larson, the founder of FreeCodeCamp, has written a great book about his journey to becoming a software engineer. Although the role he targeted is slightly different from a machine learning engineer, his experience is very instructive for your own path.
A word of caution: this roadmap looks simple, but following it is not easy. Learning machine learning and software engineering is hard, but it is not impossible. Others have done it, and if you set your mind to it, you can do it too.
Depending on your starting point, the time required to study is roughly as follows:
Learning from scratch
If you commit to this roadmap full time, I estimate it will take about 18 months to learn from scratch.
If you are at a stage in life where you can go to college and can afford it, I think it is the easiest path. College provides you with a community, mentorship, courses, internship support, and eases the worry of your parents or other concerned people about your future.
If you are switching careers from an unrelated industry, make sure to leverage your prior experience. Even if you want to leave your current industry, your domain knowledge is an advantage to you. Once you have a relevant position, you can learn on the job and it will be much easier to switch jobs.
Career transition for developers
If you're already a developer, you'll become valuable very quickly. Spend about six months outside of work learning shallow and deep machine learning and the math you lack. Your prior software engineering experience is highly valued by employers. You may not even have to take a step back in your career. Once you switch roles, you'll keep learning on the job.
Moving into machine learning as a data scientist
If you’re a data scientist, you may feel like your career has hit a ceiling because you lack software engineering skills. At least that was true for me. For those in data science, moving into machine learning is a natural career progression. If you invest extra time in learning, you can accelerate your career. Look for machine learning projects in your current role, or pick the relevant material from the resources above, spend a few months learning, and then build a portfolio to apply for new positions.
Summary
You can become a competitive entry-level candidate by following this roadmap:
1. Learn computer science fundamentals and Python programming through CS50 and a dedicated Python resource.
2. Learn classical (shallow) machine learning to build a foundation and develop intuition for working with data.
3. Learn deep learning through dedicated courses, such as Yann LeCun’s NYU lectures, fast.ai, or deeplearning.ai’s Deep Learning Specialization.
4. Learn MLOps from fullstackdeeplearning.
   - If necessary, learn software engineering through fullstackopen, including the basics of web development, distributed systems, DevOps, and relational databases.
5. Find a niche you want to work in and develop expertise by building a portfolio. You can use the Hugging Face courses as a starting point, then build interesting projects and paper implementations in the direction that interests you.
Good luck!
Original article: https://www.maxmynter.com/pages/blog/become-mle
This article is translated by CSDN.