
The father of reinforcement learning starts an AGI venture! Teaming up with legendary programmer Carmack, and not relying on large models

Last updated: 2023-10-11 18:24

Mengchen, reporting from Aofeisi
QbitAI | WeChat official account QbitAI

Legendary programmer John Carmack has joined forces with Richard Sutton, the father of reinforcement learning, to go all in on AGI.

Their goal: to demonstrate to the public, by 2030, that general artificial intelligence is feasible.

And unlike mainstream approaches, their effort does not rely on the large-model paradigm; it pursues real-time online learning instead.

The pair made the announcement at a special event held by the Alberta Machine Intelligence Institute (Amii) at the University of Alberta, where Sutton teaches.

Sutton will join Carmack's AI startup Keen Technologies while keeping his faculty position at the University of Alberta.

Both men acknowledged at the event that Keen Technologies' team is tiny compared with companies employing hundreds or thousands of people.

It is still in its infancy, and the company's entire technical team was on site at the event: just four people standing there.

Its funding totals US$20 million, which hardly compares with OpenAI and Anthropic, which routinely raise billions.

But they believe the final source code of AGI will be on a scale one person could write, perhaps only tens of thousands of lines.

Moreover, the AI field is at a special moment when leverage is at its greatest, so small teams also have a chance to make big contributions.

Legendary programmer and father of reinforcement learning

Carmack's legendary career is well known: he developed the world's first 3D games, moved on to building rockets, then joined Oculus and became a key figure in what later became Meta's VR effort.

His later turn toward AI also runs through OpenAI.

He once revealed in another interview that Sam Altman had invited him to join OpenAI, believing he could play an important role in systems optimization.

But Carmack felt that, at the time, he had no understanding of modern machine-learning-based AI, so he declined.

The invitation did, however, become his opening to start understanding AI.

He asked OpenAI's chief scientist Ilya Sutskever for a beginner's must-read list and taught himself from scratch, first building a basic understanding of traditional machine learning algorithms.

Later, when he had some free time and planned to dig into deep learning, he set himself a one-week programming challenge:

Print out a few of LeCun's classic papers and, with the network disconnected, reproduce them by hand, starting from the backpropagation formulas.

When the week was up, he emerged from the retreat with a convolutional neural network hand-written in C++, without the help of modern Python deep learning frameworks.

All one can say is: a true master at work.
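To give a flavor of what such a from-scratch exercise involves, here is a minimal sketch in the same spirit: backpropagation for a tiny 2-2-1 network learning XOR, hand-written in plain C++ with no frameworks. This is an illustration of the technique only, not Carmack's actual code; the network size, initial weights, and learning rate are arbitrary choices made for the example.

```cpp
#include <cmath>
#include <cstdio>

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

int main() {
    // XOR truth table as training data.
    const double X[4][2] = {{0,0},{0,1},{1,0},{1,1}};
    const double Y[4]    = {0, 1, 1, 0};

    // Small asymmetric initial weights (fixed for reproducibility).
    double w1[2][2] = {{0.5,-0.4},{0.3,0.8}};   // input -> hidden
    double b1[2]    = {0.1,-0.1};
    double w2[2]    = {0.7,-0.6};               // hidden -> output
    double b2       = 0.05;
    const double lr = 0.5;

    for (int epoch = 0; epoch < 20000; ++epoch) {
        for (int n = 0; n < 4; ++n) {
            // Forward pass.
            double h[2];
            for (int j = 0; j < 2; ++j)
                h[j] = sigmoid(w1[j][0]*X[n][0] + w1[j][1]*X[n][1] + b1[j]);
            double y = sigmoid(w2[0]*h[0] + w2[1]*h[1] + b2);

            // Backward pass: chain rule through each sigmoid.
            double dy = (y - Y[n]) * y * (1.0 - y);           // output-layer delta
            for (int j = 0; j < 2; ++j) {
                double dh = dy * w2[j] * h[j] * (1.0 - h[j]); // hidden-layer delta
                w2[j]    -= lr * dy * h[j];
                w1[j][0] -= lr * dh * X[n][0];
                w1[j][1] -= lr * dh * X[n][1];
                b1[j]    -= lr * dh;
            }
            b2 -= lr * dy;
        }
    }

    // Show the learned mapping.
    for (int n = 0; n < 4; ++n) {
        double h[2];
        for (int j = 0; j < 2; ++j)
            h[j] = sigmoid(w1[j][0]*X[n][0] + w1[j][1]*X[n][1] + b1[j]);
        double y = sigmoid(w2[0]*h[0] + w2[1]*h[1] + b2);
        std::printf("%.0f XOR %.0f -> %.3f\n", X[n][0], X[n][1], y);
    }
    return 0;
}
```

A convolutional network adds weight-sharing and sliding windows on top of exactly this machinery, which is what made it a reasonable one-week target.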

At the time, his day job was still VR research at Oculus, a subsidiary of Facebook (later renamed Meta), where he led the team behind products such as the Oculus Go and Quest.

During this period, however, friction and disagreement grew between him and the company's management; he felt the company was internally inefficient, and said so publicly.

In 2019 he stepped down as Oculus CTO to become a "consulting CTO", and began shifting more of his energy toward AI.

In August 2022, he announced that his new AI startup Keen Technologies had raised US$20 million, with investors including Sequoia Capital and former GitHub CEO Nat Friedman.

He later revealed that he could easily have put up a mere US$20 million himself.

But taking other people's money gives him a sense of urgency and a stronger determination to get things done.

At the end of 2022 he formally left Meta, describing VR as a chapter of his life that had closed, and turned fully to AI.

Beyond this obvious through-line, Carmack has another, almost fateful bond with AI.

His 3D games back then stimulated demand for graphics computing, and GPUs grew and flourished in the gaming market.

Today it is GPU computing power that underpins the AI explosion, and he still speaks of that contribution with pride.

The other protagonist of this story, Sutton, is a legend as well.

He is known as the father of reinforcement learning, having made foundational contributions to methods such as temporal-difference learning and policy gradients, and he co-authored the field's standard textbook.
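For readers unfamiliar with the method, below is a minimal sketch of tabular TD(0), the core temporal-difference idea: nudge a state-value estimate toward the reward plus the discounted value of the next state. The environment, the classic five-state random walk from Sutton and Barto's textbook, and all constants are illustrative choices, not anything presented at this event.

```cpp
#include <cstdio>
#include <random>

int main() {
    const int N = 5;                       // non-terminal states 0..4
    double V[N] = {0.5, 0.5, 0.5, 0.5, 0.5};
    const double alpha = 0.1, gamma = 1.0; // step size, discount
    std::mt19937 rng(42);
    std::bernoulli_distribution coin(0.5);

    for (int ep = 0; ep < 10000; ++ep) {
        int s = N / 2;                          // start in the middle
        while (true) {
            int s2 = s + (coin(rng) ? 1 : -1);  // random step left or right
            double r  = (s2 == N) ? 1.0 : 0.0;  // reward 1 only at the right exit
            bool terminal = (s2 < 0 || s2 >= N);
            double v2 = terminal ? 0.0 : V[s2]; // terminal states have value 0
            V[s] += alpha * (r + gamma * v2 - V[s]); // the TD(0) update
            if (terminal) break;
            s = s2;
        }
    }
    for (int i = 0; i < N; ++i)
        std::printf("V[%d] = %.3f (true %.3f)\n", i, V[i], (i + 1) / 6.0);
    return 0;
}
```

The printed estimates converge near the true exit probabilities 1/6 through 5/6, learned purely from sampled experience, with no model of the environment.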

In 2017 he joined DeepMind as a Distinguished Research Scientist and took part in the AlphaGo line of research; his student David Silver is one of AlphaGo's main leads.

Sutton also wrote the famous short essay The Bitter Lesson, arguing that trying to build human knowledge into AI does not pay off: the breakthroughs so far have come from leveraging computation, and continuing to exploit the scaling of compute is the right path.

Even before the two were in direct contact, Carmack had publicly expressed his interest in and approval of this essay.

But it was Sutton who initiated their first direct exchange.

A few months ago, after Carmack announced funding for his AGI venture, he received an email from Sutton.

Sutton wanted his view on whether an AGI research path should be purely academic, commercial, or non-profit.

In the email exchange that followed, the two discovered a surprising alignment in their AI research directions and beliefs, and gradually built a working relationship.

Specifically, the two reached four points of consensus:

  • Both believe that current AGI development is confined to a few very narrow directions, relying too heavily on big data and big compute while neglecting innovation.

  • Both believe that premature commercialization would hinder AGI's development.

  • Both believe the final AGI will not be overly complex: one person could grasp all of its principles, and could even write the main code alone.

  • Both believe that an AGI prototype emerging by 2030 is a feasible goal.

Not relying on large models, and small teams have a chance too

It is a very bold goal, and the audience thought so too.

Faced with the question "How can such a small team achieve such an ambitious goal?", Carmack argued that the data and compute required for AGI may be far smaller than commonly imagined.

Everything a human sees through the eyes in an entire year, captured as 30-frames-per-second video, could be stored on a thumb-sized USB flash drive.

A one-year-old child has only that much experience data, yet already shows obvious intelligence.

If the algorithm is right, AGI will not need the entire Internet's worth of data to learn.
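A rough back-of-the-envelope check of that thumb-drive claim is sketched below. The waking-hours figure and the video bitrate are our own assumptions for illustration, not numbers Carmack gave.

```cpp
#include <cstdio>

int main() {
    const double waking_hours_per_day = 16.0;  // assumption: sleep excluded
    const double seconds_per_year = waking_hours_per_day * 3600.0 * 365.0;
    const double fps = 30.0;                   // frame rate from the article
    const double bitrate_bps = 250e3;          // assumption: heavily compressed video

    double frames = seconds_per_year * fps;
    double bytes  = seconds_per_year * bitrate_bps / 8.0;

    std::printf("frames per year : %.2e\n", frames);        // ~6.3e8 frames
    std::printf("storage per year: %.2f TB\n", bytes / 1e12); // ~0.66 TB
    // ~0.66 TB fits on a large (1 TB) USB flash drive, consistent with the claim.
    return 0;
}
```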

He applies the same intuition to compute: the human brain's computing power is also limited, nowhere near the scale of a large compute cluster.

It is more than a single server node, perhaps more than a rack, but at most about one order of magnitude beyond that.

And as time passes, algorithms will grow more efficient and the compute required will keep falling.

If Carmack's work across 3D games, rockets, and VR, seemingly unrelated fields, shares anything, it is the optimization of large-scale real-time feedback systems.

That is exactly what Sam Altman was after when he invited him to join OpenAI.

The AGI architecture he envisions is modular and distributed, rather than one huge centralized model.

Learning should likewise be continuous and online, rather than today's pre-training, after which most parameters are never updated again.

"My bottom line is that if a system can't run at 30 Hz, updating roughly every 33 milliseconds during training, I won't use it."
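In practice, that constraint amounts to a fixed-budget loop: every training step must observe, learn, and act inside one 33 ms frame. The sketch below illustrates the idea; agent_step() is a hypothetical placeholder, not Keen's actual API.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

void agent_step() {
    // Placeholder: observe the environment, update weights online, emit an action.
}

int main() {
    using clock = std::chrono::steady_clock;
    const auto frame = std::chrono::milliseconds(33);   // ~30 Hz budget

    for (int step = 0; step < 300; ++step) {            // ~10 s demo run
        auto start = clock::now();
        agent_step();
        auto elapsed = clock::now() - start;
        if (elapsed > frame) {
            std::printf("step %d missed the 33 ms budget\n", step);
        } else {
            std::this_thread::sleep_for(frame - elapsed); // hold the 30 Hz cadence
        }
    }
    return 0;
}
```

The design choice matters: a learner that cannot fit its update inside the frame budget simply cannot participate in a real-time feedback loop, which rules out most batch-oriented training regimes.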

He added that, as a low-level systems programmer who can write raw CUDA code and manage network communication himself, he may be able to do work that others would never even consider.

He will not even be confined to existing deep learning frameworks, and will try more efficient network architectures and computation methods.

The overall goal is a virtual agent with intrinsic motivation and the ability to learn continuously in a simulated environment.

Not robots: his rocket-building experience taught him that the fewer physical objects you have to deal with, the better.

Compared with Carmack, a relative newcomer to AGI, Sutton has spent decades on the problem and has a far more concrete research plan.

Although little of it was discussed at the event, the main part has already been written up in an arXiv paper as the "Alberta Plan".

The Alberta Plan proposes a unified agent framework that emphasizes general experience over task-specific training sets, takes temporal uniformity seriously, prioritizes methods that scale with computation, and incorporates multi-agent interaction.

It also lays out a 12-step roadmap.

The first six steps focus on designing model-free continual-learning methods; the last six introduce environment models and planning.

The final step is called Intelligence Amplification: one agent uses the knowledge it has learned to amplify and enhance another agent's action, perception, and cognition according to some general principle.

Sutton sees this kind of amplification as an important part of realizing artificial intelligence's full potential.

Along the way, defining metrics for evaluating AI progress is both critical and difficult, and the team is exploring different approaches.

Carmack, meanwhile, has long been an open-source advocate, but on AGI he said he will maintain a degree of openness without disclosing every algorithmic detail.

As a small team, Carmack believes, they must keep a pioneering spirit and focus on the long term rather than short-term gains.

Commercialization will not be considered prematurely, and there will be no publicly releasable intermediate product along the way, nothing like ChatGPT.

As for what 2030 might deliver, Carmack's formulation is "AGI that can be demonstrated to the public", while Sutton's is that "an AI prototype can show signs of life".

2030 becomes a key milestone

This is not the first time 2030 and AGI have appeared in the same breath.

Top AI teams have repeatedly pointed to around 2030 as the key milestone for achieving AGI.

For example, when OpenAI announced its superalignment team, backed by 20% of the company's total compute, it stated that it believes superintelligence will arrive within this decade.

Even investors hold similar views: Masayoshi Son presented just such a slide at the SoftBank World conference.

Apart from OpenAI and Keen Technologies, organizations explicitly working toward AGI are few, but they exist.

Anthropic, OpenAI's biggest rival, just raised US$4 billion in financing; its CEO Dario Amodei said in a recent interview that within two to three years AI could behave like a well-educated human.

When Transformer authors Ashish Vaswani and Niki Parmar left Google, they founded Adept AI with the goal of building general intelligence.

However, the two abruptly left the company earlier this year, leaving only co-founder David Luan.

The two then founded Essential AI, whose vision is less about "gazing at the stars" and more about the pragmatic commercialization of large models.

In China, too, few players have explicitly declared AGI as their goal; the main ones are MiniMax and Moonshot AI (literally "Dark Side of the Moon"), newly founded by Yang Zhilin.

Reference links:
[1] https://www.amii.ca/latest-from-amii/john-carmack-and-rich-sutton-agi/
[2] https://www.youtube.com/watch?v=uTMtGT1RjlY
[3] https://arxiv.org/abs/2208.11173

- End -
