Time is running out for the “on-device large model”
Source: BrainJiTi
Author: Tibetan Fox
On-device large models, that is, large models that run entirely locally on the device (smartphones, IoT devices, embedded systems, and so on), have been very popular over the past year or two.
Specifically, device makers such as Apple, Honor, Xiaomi, OPPO, and vivo, along with AI companies such as SenseTime, have launched self-developed, purely on-device large models.
The appeal of the on-device large model lies in “achieving great results with small effort”.
In simple terms, compared with cloud-side large models: an on-device model is deployed locally, so its parameter scale is necessarily modest; inference never leaves the device, so there is no worry about private data leaking; no network transmission is required, so responses are faster; and since the model ships natively with the device, there are no cloud resources to rent, making it more economical to use...
It sounds as though an on-device large model should be indispensable standard equipment for AI devices. But the reality may surprise many people.
When we researched and used the large models of multiple device makers, we found that end-cloud collaboration and cloud-based large models are in fact the mainstream forms behind “on-device” AI features.
For example, the currently popular feature of “erasing background passers-by from a phone photo with one click” cannot be achieved by on-device model compute alone; it requires end-cloud collaboration to complete.
Another example is official document drafting, key-point summaries of long articles, PDF summaries, and the like, which on-device models either cannot complete or complete poorly: the on-device models of Honor and OPPO do not support PDF text summaries, and Xiaomi's MiLM offers limited support with unsatisfying output.
Ultimately, to handle more complex AIGC tasks, users still have to open the web pages or apps of cloud-based large models such as GPT-4, Wenxin Yiyan, Zhipu Qingyan, iFlytek Spark, and Kimi.
It is not hard to see that the on-device large model sounds wonderful but is somewhat underwhelming in actual use.
As cloud-based large models grow both "bigger" (moving toward unified multimodality) and "smaller" (through compression techniques), there really is not much time left for the "on-device large model".
The on-device large model is no panacea
But without the cloud-side large model, nothing works
At present, the gains and losses of the "on-device large model" simply do not balance out.
Let's talk about gains first: the cloud-side large model delivers more value to users than the on-device one.
The first priority of any AI device is to secure the user experience and user value before anything else. The fact that it can only run locally means the "on-device large model" is destined to stay small, which inevitably caps the model's capability; it cannot match a cloud-side large model.
So when users use the on-device model, they sacrifice part of the experience. Do they gain more in return? Not really.
Cloud-based large models are growing ever more powerful, widening the experience gap with on-device models that must "trade capability for size". Take the multimodal models OpenAI and Google have been racing to ship recently: GPT-4o and Gemini bring astonishing voice interaction and multimodal generation, processing images, video, audio, and complex logic, all of which must run in the cloud.
A senior practitioner in the domestic PC industry once told BrainJiTi that after large models emerged, hardware companies kept studying how to combine them with the PC. What makes a real AI PC? The conclusion: hardware equipped with GPT-4 (that is, the most advanced large model of the day) can genuinely be called "AI xx"; model capability is the core.
Therefore, to do edge-side AI well, the on-device large model is no panacea, but without the cloud-side large model nothing is possible.
Given that the cloud-side large model is a must, is the on-device large model also a must? That brings us to the losses.
Not using an on-device large model causes users little extra trouble.
The pursuit of "on-device large models" was previously driven mainly by two constraints: compute bottlenecks and security concerns. Large-model inference has real-time requirements, and cloud inference carries higher latency than local inference. Moreover, phones and PCs hold large amounts of private data, and sending it to the cloud for inference worries many people. Both of these "losses" are now being actively addressed.
For example, at the recent Google I/O conference, Google released Gemini 1.5 Flash, a lightweight model with fast responses and low cost. Google used "distillation" to transfer the core knowledge and skills of the larger Gemini model into a smaller, more efficient one. Gemini 1.5 Flash performs well across tasks such as summarization, chat applications, and image and video captioning, and can run on different platforms.
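To make the "distillation" idea concrete, here is a minimal sketch of its core mechanism: a small student model is trained to match the softened output distribution of a large teacher. The temperature value and all logits below are illustrative assumptions, not Google's actual Gemini training setup.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    A higher temperature exposes the teacher's "dark knowledge": the
    relative probabilities it assigns even to the non-top classes.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that copies the teacher's preferences incurs a lower loss.
teacher = [4.0, 1.0, 0.5]
good_student = [3.5, 1.2, 0.4]   # similar ranking to the teacher
bad_student = [0.5, 4.0, 1.0]    # disagrees with the teacher
print(distillation_loss(teacher, good_student) <
      distillation_loss(teacher, bad_student))  # True
```

In practice this loss is minimized over a large dataset, usually combined with the ordinary hard-label loss; the sketch only shows why matching soft targets pulls the small model toward the large one's behavior.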
In addition, local computing hardware can be optimized for AI tasks, which also improves the fluency of cloud inference services. The x86 and Arm camps are both actively improving how well edge computing units fit AI-specific workloads, and flagship and high-end phones can already run large models with sizable parameter counts in real time.
On data security, device makers and large-model companies have rolled out privacy-protection mechanisms, using approaches such as "data usable but not visible", desensitization, and federated learning to guard against leaks.
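As a toy illustration of the "desensitization" approach mentioned above: the device strips personal identifiers from a prompt locally, and only the masked text is sent to the cloud model. The two patterns below are simplified assumptions for illustration, not any vendor's actual privacy pipeline.

```python
import re

# Minimal PII patterns: mainland-China mobile numbers and email addresses.
PATTERNS = {
    "PHONE": re.compile(r"\b1[3-9]\d{9}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def desensitize(text):
    """Replace identifiable fields with placeholder tokens on-device."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

prompt = "Summarize the email from alice@example.com, phone 13812345678."
print(desensitize(prompt))
# Summarize the email from [EMAIL], phone [PHONE].
```

Real systems go much further (named-entity recognition, reversible tokenization so the device can restore the original values in the model's answer), but the division of labor is the same: masking happens locally, inference happens in the cloud.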
Take Apple, long known for its emphasis on privacy and security. It has developed its own on-device model, OpenELM, which can run on phones and laptops, yet for actually shipping AIGC capabilities it is reportedly also choosing to partner with large-model companies (rumored to be OpenAI abroad and Wenxin Yiyan in China).
In summary, the gains from using "cloud-side large models" keep growing, while the losses from skipping "on-device large models" keep shrinking, making the on-device approach ever less economical.
What happens next is not hard to predict: as more and more device makers pack cloud-based large models into their products, purely on-device large models will become increasingly awkward, entering a spiral of "not good to use, not liked, even worse to use".
The on-device large model:
Do device makers really need to build it?
You may ask: since the on-device large model is less usable than the cloud-side one, why are device makers still pouring so much effort into it?
The objective reality is that large models are necessary, but device makers are ill-suited to building cloud-side large models themselves, so on-device models plus end-cloud collaboration became the inevitable path.
The head of one domestic device company once put it bluntly: even if my R&D budget doubled, I could not build a general-purpose large model like ChatGPT or Sora; I would still choose to partner with Baidu, Tencent, Alibaba, and the like.
For example, Honor is orchestrating hundreds of models in its phones, connecting general-purpose large models such as Wenxin Yiyan alongside industry models such as AutoNavi Maps and Air Travel; Huawei has connected general-purpose models such as Wenxin Yiyan, iFlytek Spark, and Zhipu AI in its PCs and launched an AI meeting-minutes feature based on its self-developed Pangu large model...
Subjectively, device makers build on-device large models partly for the brand, to showcase self-developed large-model capability, and partly to "keep the soul in their own hands", much as banks, financial institutions, and carmakers want to keep their core data advantage to themselves and build industry models rather than hand everything to large-model vendors.
Device makers hope that partnering on cloud-based large models will strengthen the experience of their AI devices and make their products more attractive to consumers, while self-developed on-device models let them hold the edge position and protect their data moat. It is a large-model strategy fit for both offense and defense.
We predict that as the performance and capability of cloud-based large models grow nonlinearly, the gap between them and device makers' purely on-device models will widen, and the latter will no longer drive consumers' purchasing decisions.
In the near future, whether a device can integrate a high-quality ecosystem of cloud-based large models will become a key point of competition among AI terminal devices.
To sum up: device makers can build on-device large models, but do not have to; they must have cloud-side large models, and better ones than their rivals.
What needs deep collaboration is not just the large models
But also the two types of manufacturers
In a conversation with Huawei's terminal division, they noted: Huawei is the only device maker that fully self-develops both the cloud-side general-purpose large model and the on-device large model (the Pangu large model), which lays a solid foundation for AI hardware. For example, a complex AIGC task can be split into parts that run in parallel across cloud, device, and edge, balancing inference quality, speed, and data security.
To be clear, the above remains at the proof-of-concept stage; we have not yet experienced a deep cloud-to-device integration of the Pangu models on Huawei's devices. But the logic holds: efficient end-cloud collaboration could yield a large-model product with no weak spots and win over potential AI-hardware buyers, and that is inseparable from deep cooperation between device makers and general-purpose large-model vendors.
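The splitting logic of end-cloud collaboration can be sketched as a simple router that sends each request to the on-device model, the cloud model, or a hybrid path depending on task complexity, privacy sensitivity, and connectivity. The task names and routing rules below are invented for illustration; they are not Huawei's actual design.

```python
# Tasks the small on-device model is assumed to handle well on its own.
LOCAL_CAPABLE = {"text_correction", "short_reply", "photo_keyword_tag"}

def route(task, contains_private_data, network_available):
    """Decide where a task runs: 'device', 'cloud', or 'hybrid'."""
    if task in LOCAL_CAPABLE:
        return "device"                  # fast, free, fully private
    if contains_private_data:
        # Desensitize locally first, then let the cloud model do the
        # heavy lifting; fall back to device-only when offline.
        return "hybrid" if network_available else "device"
    return "cloud" if network_available else "device"

print(route("short_reply", False, True))   # device
print(route("pdf_summary", True, True))    # hybrid
print(route("pdf_summary", False, True))   # cloud
print(route("pdf_summary", True, False))   # device
```

The design choice this illustrates is exactly the article's point: the device-only branch is a degraded fallback, while the best experience for complex tasks always routes through the cloud.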
Device companies that own both the on-device model and the cloud-side general-purpose model do enjoy a natural advantage in tight integration. Other manufacturers, however, can compensate through an open ecosystem and assemble a more complete model ecosystem of their own.
This is a mutually beneficial, win-win arrangement for both sides:
General-purpose large-model vendors need the vast device ecosystems of terminal manufacturers as soil for deploying large models and recouping their huge base-model investment, while device-side data helps mitigate model hallucination and drive model evolution.
Device makers need general-purpose large models (especially cutting-edge cloud-based ones) to underpin the experience, giving users state-of-the-art AIGC applications while avoiding heavy R&D spending on base models and avoiding falling behind rivals on AI experience.
On this basis, device makers and cloud-side general-purpose large-model vendors urgently need to resolve several key issues:
Security: how to learn from device data while protecting privacy, how to clarify data rights and responsibilities, and how to establish a reasonable mechanism for distributing the commercial value that data generates.
Developer revenue sharing: whether AI applications on phones or AI applications built on cloud-side large models, all depend on developers. Linking the device developer ecosystem more closely with the large-model developer ecosystem would attract developers and accelerate the incubation of AI applications, so how the two camps jointly empower and share revenue with developers will be central to their cooperation, and their contest.
In the first half of this year we witnessed many breakthroughs in general-purpose large models. Not much time is left for on-device large models, and the window for device makers to build a large-model community ecosystem will not stay open long.
In the second half, we may well witness a "war of camps" fought by alliances of "device makers + large-model vendors".
This article is reprinted from "BrainJiTi". The content reflects the author's independent views and does not represent the position of Global IoT Observation. It is for communication and learning purposes only. For any questions, contact info@gsi24.com.