When large models predict, why does the next token have to be text?

Last updated: 2024-03-29

By Mingmin and Jin Lei, from Aofei Temple
QbitAI | WeChat official account: QbitAI

Too fast, too fast...

Large-model generation has reached a level that ordinary people can hardly follow!

Given a user's physical examination reports from the past five years, it can generate that user's reports for one, two, and three years into the future.

You see, this generation process is very similar to ChatGPT's: predicting the next word from the words so far.

It can look at the operating status of a unit's subcomponents over the past 7 days and generate hourly subcomponent reports for the next 3 days.

It can also take historical hydrological data plus meteorological data for the next 7 days and generate hourly precipitation analysis reports for day 1, day 2, and so on through day 7, including detailed precipitation amounts and distribution.

Nowadays, the content generated by large models is no longer just text, images, and video.

Analyzing the reports generated above involves a great deal of professional knowledge, and it is difficult for ordinary people to judge their soundness and correctness from what they know.

The most I can say is: "I don't understand it, but it sounds impressive!"

How to put it? "AI seems to be generating everything."

LLM + industry data: the wrong track?

A simple way to understand large models is as Predict the Next "X". ChatGPT is Predict the Next "Word".

But what the industry needs is often not predicting the next word.

For example, health management planning for patients with chronic diseases requires medically grounded data prediction based on a series of physiological indicators. To use an imperfect analogy, this is more like solving a problem with the methods of a math class.

Feeding a large amount of professional medical text into a large language model is more like reading that problem with the methods of a language class. The model may understand the relevant terms and indicators, but its predictions are likely to be inaccurate, because the problem itself lies outside the scope of "language" and cannot be solved with language-class methods.

If the modality of "X" changes from "Word" to "physical examination report", the model can predict the next physical examination report from historical report data. This is a large health management model.

Its logic is more like "you reap what you sow: plant melons and you get melons, plant beans and you get beans." That is, input "X" and output "X".

The "X" here may include different types of professional data such as hydrological data, health reports, equipment monitoring values, design deductions, etc.
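To make the "input X, output X" idea concrete, here is a minimal sketch in Python. It is not 4Paradigm's method: it assumes each report can be reduced to a fixed-length list of numeric indicators, and it swaps in a tiny per-indicator linear model (x[t+1] = a * x[t] + b) for the large model. The names `Report`, `fit_ar1`, and `predict_next_reports` are hypothetical, and the indicator values are made up for illustration. What it does share with the article's description is the generation loop: each predicted report is appended to the history and used to predict the one after it, the same way a language model feeds each generated word back in.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: one report = a fixed-length vector of indicators.
Report = List[float]


def fit_ar1(series: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of x[t+1] = a * x[t] + b for a single indicator."""
    if len(series) < 2:
        raise ValueError("need at least two historical values per indicator")
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Constant history: fall back to predicting the mean of the targets.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def predict_next_reports(history: Sequence[Report], steps: int) -> List[Report]:
    """Autoregressive rollout: predict the next report, append it, repeat."""
    n_features = len(history[0])
    # One (a, b) pair per indicator, fitted on that indicator's own history.
    params = [fit_ar1([r[j] for r in history]) for j in range(n_features)]
    context = [list(r) for r in history]
    predictions: List[Report] = []
    for _ in range(steps):
        last = context[-1]
        nxt = [a * last[j] + b for j, (a, b) in enumerate(params)]
        predictions.append(nxt)
        context.append(nxt)  # feed the output back in as the next input
    return predictions


# Five yearly reports with made-up values for [systolic BP, fasting glucose].
history = [[120.0, 5.0], [122.0, 5.1], [124.0, 5.2], [126.0, 5.3], [128.0, 5.4]]
forecast = predict_next_reports(history, steps=3)  # years 1, 2 and 3 ahead
```

The same loop would cover the hourly equipment case by treating each hour's readings as one "X" and rolling forward 72 steps. A real industry model would replace the linear fit with a learned model over much richer inputs.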

Given a concert hall's geometric model and room data, it can trace 5000 Hz sound rays from the sound source, generate a ray distribution map, and find the source position that gives the best listening experience.

How to predict "X"?

So how do you build these large industry models that can predict the next X?

With the newly released Prophet AIOS 5.0. Its core capability is building large industry base models from the X-modal data of various industry scenarios.

It addresses a limitation of current large industry models, which can only feed industry text into a large language model and generate the next word, and so lets large models reach a much wider range of fields.

Prophet is the core product of the AI company 4Paradigm. Prophet AIOS 1.0, first released in 2015, improved model accuracy through a high-dimensional, real-time, self-learning framework; in 2017, Prophet AIOS 2.0 lowered the barrier to model development with the automated modeling tool HyperCycle; Prophet AIOS 3.0, released in 2020, standardized AI data management and production deployment; and in 2022, Prophet AIOS 4.0 introduced North Star metrics to maximize the value of AI applications.

AIOS 5.0 offers a new approach to large industry models from the perspective of generative AI plus industry.

In what is widely recognized as the first year of large-model applications, the development and influence of large models in industry are bound to be several times greater than before. This broader trend is also shaping the next paradigm of AIGC.

One More Thing: is AIGC moving toward a new paradigm?

From images, text, and video to health and water conservancy... it is not hard to see that AIGC is now racing toward AI-generated everything.

Generally speaking, progress in any field seems to be driven by paradigms, and new paradigms do not replace old ones; they complement each other, making the field deeper and more comprehensive.

Just like the four paradigms in scientific research, namely experimental induction, theoretical deduction, computer simulation and data-intensive scientific discovery, they complement each other and jointly promote the progress of scientific research.

So if we look at AIGC with this logic, it seems that four similar paradigms have begun to emerge.

The first paradigm of AIGC takes text generation as its core and demonstrates AI's ability to understand and generate natural language through applications such as intelligent customer service and content continuation. The AIGC technology at this stage laid the foundation for subsequent development, enabling machines to effectively communicate and interact with humans.

The second paradigm of AIGC extends the application field to image generation.

Models such as generative adversarial networks (GANs) and variational autoencoders (VAEs) learn a mapping that generates realistic images from random noise, and the outputs can be used in artistic creation, image enhancement, and virtual scene generation. This paradigm further demonstrates the imagination of AI.

The third paradigm of AIGC focuses on video generation, with models such as Gen-2 and Sora.

Video generation reflects, to a certain extent, AI's understanding of the world. Ever since Sora appeared, people have debated whether it can understand the world and whether it is a world simulator, because if Sora is judged to understand the world, that would mean the door to AGI is officially open.

The fourth paradigm of AIGC focuses on industries, with the technology fully penetrating every sector.

The core task of this stage is to integrate AI technology deeply with industry knowledge. This year is seen as the first year of large-model applications being put into practice, and AIGC technology has begun to play an important role in key fields such as medical care, education, and finance.

How, specifically, can AIGC be brought into industry faster? Players of all kinds are still experimenting. Build on a large language model, or train large industry models directly? Each route has its own underlying logic, and it is too early to say which will be more successful.

But one thing is certain:

In the process of AI generating everything, those individuals and industries that can take the lead in utilizing AI technology will be able to enjoy the dividends brought by the technology earlier. They will have the opportunity to lead changes in the industry and shape the future social and economic landscape.

And only when AIGC enters the fourth paradigm will the flywheel from technological innovation to commercial entrepreneurship be complete, marking a new qualitative leap in productivity for generative AI.
