ChatGPT only counts as Level 1: Google proposes a complete AGI roadmap

Last updated: 2023-11-08
Fengse, from Ao Fei Temple
Qubit | Official account QbitAI

How should AGI develop and what will it ultimately look like?

Now, the industry's first such standard has been released:

An AGI grading framework, from Google DeepMind.

The framework holds that the development of AGI should follow 6 basic principles:

  • Focus on capabilities, not processes

  • Measure both skill level and generality

  • Focus on cognitive and metacognitive tasks

  • Focus on highest potential, not actual deployment

  • Focus on ecological validity

  • Focus on the whole path to AGI, not a single endpoint

Based on these principles, AGI development is divided into 6 stages, each with a corresponding depth (performance) indicator and breadth (generality) indicator.

What stage have current AI products reached? The framework answers that as well.

Take a look at the details.

6 basic principles

What is AGI?

Regarding this issue, many scientists and research institutions have given their own understandings.

For example, the Turing test proposed by Turing takes whether a machine can "think" (i.e., fool a human interlocutor) as the measuring stick; the proposer of the concept of strong artificial intelligence holds that AGI must be a system with consciousness; still others say that AGI must match or even surpass the human brain in complexity and speed...

Google believes that these definitions are not comprehensive.

Take the Turing test: some LLMs can already pass it, but can we really call those models AGI?

As for the brain-based view, the success of the Transformer architecture shows that strictly brain-like thinking processes are not necessary for AGI.

By analyzing the strengths and weaknesses of these definitions (9 in total; see the original paper for details), Google distilled 6 basic principles:

1. Focus on capabilities, not processes.

This helps us set aside requirements that are not strictly necessary for achieving AGI:

For example, AGI does not have to think or understand in a human-like way, nor must the system possess qualities such as subjective consciousness (mainly because such qualities cannot be measured by any agreed-upon method).

2. Focus on generality and skill level.

All current AGI definitions emphasize generality, which goes without saying. But Google stresses that performance is also a key component of AGI (that is, what level of human performance a system can reach). In the staging scheme that follows, classification is based mainly on these two indicators.

3. Focus on cognitive and metacognitive tasks.

The former is already broadly the consensus: AGI should be able to perform a wide range of non-physical tasks. Google adds, however, that the ability of AI systems to perform physical tasks is still worth strengthening, because it can support their cognitive capabilities and broaden their generality.

Furthermore, metacognitive abilities, such as learning new tasks or knowing when to ask humans for help, are key prerequisites for systems to move toward generality.

4. Focus on the highest potential, not the level actually deployed.

Demonstrating that a system can perform tasks to a given standard is enough to rate it at the corresponding level; we do not require the system to actually be deployed in the open world at that level.

That is because real deployment can face non-technical obstacles, such as legal and social considerations and potential ethical issues.

5. Focus on ecological validity.

By ecological validity, Google means benchmarking a system's progress on tasks that people genuinely value in real life. Such value includes not only economic value but also social and artistic value, rather than traditional AI metrics that are easy to automate and quantify but may not reflect real-world usefulness.

6. Focus on the whole path to AGI, rather than a single endpoint.

This is exactly why Google lays out the 6 development stages we will see next.

6 necessary stages

The 6 stages on the road to AGI are defined by a depth indicator (skill level, relative to humans) and a breadth indicator (generality).

Stage zero is "No AI". Calculator software, compilers, and the like belong here; on the generality side, this level corresponds to human-in-the-loop computing, where humans still do the actual work.

The first stage is "Emerging", where skill is equal to or somewhat better than that of an unskilled human.

Large models such as ChatGPT, Bard, and Llama 2 belong to this stage, and they already meet the generality required at this level.

The second stage is the "Competent" level, meaning performance at or above the 50th percentile of skilled adults.

Narrow systems such as the voice assistant Siri, and SOTA large models on a subset of tasks such as short essay writing and simple coding, belong to this stage.

However, they qualify only on the skill dimension; no AI product has yet reached Competent-level generality.

The third stage is "Expert", meaning performance at or above the 90th percentile of skilled adults.

Google places spelling and grammar checkers such as Grammarly and image generation models such as Imagen at this stage: they meet the bar on skill level, but only as narrow systems, not general ones.

The fourth stage is "Virtuoso", meaning performance at or above the 99th percentile of skilled adults.

Deep Blue, AlphaGo, and the like belong here. Again, no AI product has yet reached this level of generality.

The last stage is "Superhuman". AlphaFold and AlphaZero, which already surpass the best human experts on their respective tasks, can be placed at this stage on the skill dimension.

There is no doubt that general AI with superhuman intelligence has not yet been born.

From this we can see that, by Google's standard, many existing AI products have already reached various AGI stages, but only on the skill dimension; when it comes to generality, only models like ChatGPT currently qualify.

But they are still only at the lowest general stage, "Level 1 AGI" (Emerging).

Still, as Principle 2 says, evaluating AGI requires both indicators, skill level and generality, so this division is reasonable.
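To make the two-axis scheme concrete, here is a minimal Python sketch of the grid described above. It is our own illustration rather than code from the paper; the Performance enum, the RatedSystem class, and the example ratings simply restate the levels and systems mentioned in this article.

# A minimal sketch (illustration only, not code from the paper or article):
# encode the depth axis (performance level) and the breadth axis (narrow vs.
# general), then tag a few of the example systems mentioned above.
from dataclasses import dataclass
from enum import IntEnum

class Performance(IntEnum):
    NO_AI = 0        # e.g. calculator software, compilers
    EMERGING = 1     # equal to or somewhat better than an unskilled human
    COMPETENT = 2    # at least the 50th percentile of skilled adults
    EXPERT = 3       # at least the 90th percentile of skilled adults
    VIRTUOSO = 4     # at least the 99th percentile of skilled adults
    SUPERHUMAN = 5   # outperforms all humans

@dataclass
class RatedSystem:
    name: str
    performance: Performance
    general: bool    # True = rated across a wide range of tasks, False = narrow

# Example classifications taken from the article (illustrative only).
examples = [
    RatedSystem("ChatGPT", Performance.EMERGING, general=True),
    RatedSystem("Siri", Performance.COMPETENT, general=False),
    RatedSystem("Grammarly", Performance.EXPERT, general=False),
    RatedSystem("Imagen", Performance.EXPERT, general=False),
    RatedSystem("AlphaGo", Performance.VIRTUOSO, general=False),
    RatedSystem("AlphaFold", Performance.SUPERHUMAN, general=False),
]

for s in examples:
    scope = "general" if s.general else "narrow"
    print(f"{s.name}: Level {int(s.performance)} ({s.performance.name}), {scope}")

Running the sketch prints each system's depth level together with whether the rating applies narrowly or generally.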

It is worth mentioning that image generation models like DALL·E 2 can already be classified at "Level 3" (Expert), albeit as narrow AI.

The reason Google gives is that the images it generates are already better than what most people could produce (that is, better than 90% of humans).

This rating does not take into account the fact that most users cannot elicit the model's best output because of limited prompting skills.

That is because, following Principle 4, we only need to consider a system's potential.

In addition, for the final Superhuman stage, Google imagines that besides protein structure prediction, such systems might also be able to communicate with animals, decode brain signals, or make high-quality forecasts, tasks that are difficult or impossible for humans, so as to live up to the expectations the name sets.

Finally, Google also admits that much remains to be done on this level scheme:

For example, what standard task set should be used to measure the generality dimension? What fraction of those tasks must a system complete? Are any tasks mandatory?

It is unlikely that all of these issues will be clarified at once.

Do you agree with these principles and phases proposed by Google?

Original text:
https://arxiv.org/abs/2311.02462

-over-

"Qubit 2023 Artificial Intelligence Annual Selection" has begun!

This year, the Qubit 2023 Artificial Intelligence Annual Selection has established 5 categories of awards from the three dimensions of enterprises, people, and products/solutions! Welcome to scan the QR code to register

MEET 2024 conference has started! Click here to learn more .



