It’s time to become an ML engineer
AI has recently crossed a utility threshold, where cutting-edge models such as GPT-3, Codex, and DALL-E 2 are actually useful and can perform tasks computers cannot do any other way. The act of producing these models is an exploration of a new frontier, with the discovery of unknown capabilities, scientific progress, and incredible product applications as the rewards. And perhaps most exciting for me personally, because the field is fundamentally about creating and studying software systems, great engineers are able to contribute at the same level as great researchers to future progress.
“A self-learning AI system.” by DALL-E 2.
I first got into software engineering because I wanted to build large-scale systems that could have a direct impact on people’s lives. I attended a math research summer program shortly after I started programming, and my favorite result of the summer was a scheduling app I built for people to book time with the professor. Specifying every detail of how a program should work is hard, and I’d always dreamed of one day putting my effort into hypothetical AI systems that could figure out the details for me. But after taking one look at the state of the art in AI in 2008, I knew it wasn’t going to work any time soon and instead started building infrastructure and product for web startups.
DALL-E 2’s rendition of “The two great pillars of the house of artificial intelligence” (which according to my co-founder Ilya Sutskever are great engineering, and great science using this engineering)
It’s now almost 15 years later, and the vision of systems which can learn their own solutions to problems is becoming incrementally more real. And perhaps most exciting is the underlying mechanism by which it’s advancing — at OpenAI, and the field generally, precision execution on large-scale models is a force multiplier on AI progress, and we need more people with strong software skills who can deliver these systems. This is because we are building AI models out of unprecedented amounts of compute; these models in turn have unprecedented capabilities, we can discover new phenomena and explore the limits of what these models can and cannot do, and then we use all these learnings to build the next model.
“Harnessing the most compute in the known universe” by DALL-E 2
Harnessing this compute requires deep software skills and the right kind of machine learning knowledge. We need to coordinate lots of computers, build software frameworks that allow for hyperoptimization in some cases and flexibility in others, serve these models to customers really fast (which is what I worked on in 2020), and make it possible for a small team to manage a massive system (which is what I work on now). Engineers with no ML background can contribute from the day they join, and the more ML they pick up the more impact they have. The OpenAI environment makes it relatively easy to absorb the ML skills, and indeed, many of OpenAI’s best engineers transferred from other fields.
All that being said, AI is not for every software engineer. I’ve seen about a 50-50 success rate of engineers entering this field. The most important determiner is a specific flavor of technical humility. Many dearly-held intuitions from other domains will not apply to ML. The engineers who make the leap successfully are happy to be wrong (since it means they learned something), aren’t afraid not to know something, and don’t push solutions that others resist until they’ve gathered enough intuition to know for sure that it matches the domain.
“A beaver who has humbly recently become a machine learning engineer” by DALL-E 2
I believe that AI research is today by far the most impactful place for engineers who want to build useful systems to be working, and I expect this statement to become only more true as progress continues. If you’d like to work on creating the next generation of AI models, email me (email@example.com) with any evidence of exceptional accomplishment in software engineering.