
Huawei Smart Search gets even better: smart image search goes live, letting you find "my" pictures in plain language

Last updated: 2023-07-03
Mingmin, Xiao Xiao | From Aofeisi
Qubit | Official account QbitAI

Search engines' signals of change are stronger than ever.

First, Google launched AI snapshots, so that search results are no longer just "ten blue links" but come with AI-generated summaries; soon after, Baidu embedded an "AI partner" into its search engine, letting users get answers through dialogue.

However, these changes cover only external search.

On mobile devices, the need for "internal search" is evolving in parallel.

Internal search, in contrast to searching external world knowledge, is a search technology that treats a user's personal information as a huge knowledge base.

Unlike Google or Baidu, this kind of search engine acts more like the user's "second brain," accurately retrieving the needed personal information from photos taken and files downloaded.

But either kind of search demands a high degree of intelligence, sometimes even calling on large models with hundreds of billions of parameters.

Take finding a photo on your phone as an example. In the past, the routine might be to flip through the album for ten minutes, hunting for the one we want among hundreds of memes and thousands of photos (and perhaps never finding it).

But on Huawei P60 series and Mate X3 phones running HarmonyOS 3.1, you now only need to describe in natural language the photo you are looking for, and the system efficiently identifies and returns the relevant images.

It understands not only overall semantics, such as typing "watching the sunrise on the mountaintop" in the gallery:

It also handles more detailed descriptions of time and place. For example, asking the voice assistant Xiaoyi to search for "photos of skiing in Changbai Mountain last year":

Most importantly, this search, with semantic understanding comparable to that of large models, runs directly on the device, without uploading data to the cloud for processing.

In other words, internal search still works even when the phone is in airplane mode.

So what does this new image search feature offer, and what has Huawei, the first to deploy it on the device side, actually done?

What does Huawei Smart Image Search look like?

Previously, there were two main ways to search for images on a phone.

The first is essentially file search with a fresh coat of paint: the user must accurately recall details such as the exact time and shooting location, and sometimes even the precise file name:

The second relies on the image recognition of classification AI, but this kind of image search can only narrow the scope through fixed scene keywords such as scenery, food, or portraits.

Clearly, both methods are still stuck at the "information matching" stage, and the set of supported tags is limited. Once they fail, you end up back at manually flipping through the album.

That is because we are used to describing picture content in natural language, and those descriptions are not limited to a single noun; they may involve verbs, scenes, pronouns, and more.

To search personal pictures "from memory," the AI model must not only understand natural language, but also extract fine-grained labels from it and map them to pictures (a toy sketch of this label extraction follows).
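
To make the label-extraction idea concrete, here is a deliberately simplified, hypothetical Python sketch that splits a query like the article's "photos of skiing in Changbai Mountain last year" into a time filter, a place filter, and a free-text semantic phrase. The rules and the KNOWN_PLACES gazetteer are invented for illustration; the real system presumably uses a learned model rather than patterns like these.

```python
from datetime import date

# Hypothetical, hand-rolled query decomposition, for illustration only;
# a production system would use a learned language model instead of rules.
KNOWN_PLACES = {"Changbai Mountain", "Inner Mongolia"}  # toy gazetteer

def parse_query(query: str) -> dict:
    """Split a natural-language photo query into coarse filter fields."""
    filters = {}
    # Time: map a relative expression like "last year" to a concrete year.
    if "last year" in query:
        filters["year"] = date.today().year - 1
        query = query.replace("last year", "")
    # Place: match against a small gazetteer of known locations.
    for place in KNOWN_PLACES:
        if place in query:
            filters["place"] = place
            query = query.replace(place, "")
    # Whatever remains is the free-text semantic part ("skiing", etc.).
    words = query.split()
    while words and words[-1] in {"in", "at", "of", "from"}:
        words.pop()
    filters["semantic"] = " ".join(words)
    return filters

print(parse_query("photos of skiing in Changbai Mountain last year"))
# e.g. {'year': 2022, 'place': 'Changbai Mountain', 'semantic': 'photos of skiing'}
```

The structured fields can then drive fast metadata filtering, while the leftover phrase goes to semantic matching against the images.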

Huawei's latest smart image search now does both of these well.

Beyond searching directly by noun, you can describe images with any short phrase. For example, pull down on the home screen and type "running" into smart search, and the system automatically finds the various running figures in your album and quickly surfaces them:

If the results are still too broad, you can add tags at any time, say turning "running" into "running puppy," and the image you want appears immediately:

Nor are you limited to one or two tags. You can refine the description as flexibly as you like, stacking compound tags for time, place, people, and semantics, for example "photos of all kinds of delicious food from my trip to Inner Mongolia with my girlfriend last winter."

After trying Huawei's smart image search, the two things that stand out most are its "AI comprehension" and its "response speed."

Compared with traditional file search or AI image classification, Huawei smart image search achieves two major leaps:

  • First, it interprets "human words." Traditional image AI classifies photos under summary words such as "time" and "place," while smart image search can not only handle such words individually but combine them in a single query, such as "tiger photographed at the zoo last year."

  • Second, it searches fast. Instead of flipping through albums for ten minutes or half an hour, you can now find the pictures you want with a single sentence, whether through the pull-down smart search, the gallery, or Xiaoyi voice, raising the efficiency of finding information at the system level.

It may sound like a small breakthrough in the search capability of phones and other mobile terminals, but before Huawei, no manufacturer had solved this problem on the device side.

What makes the technology so hard to implement?

What technical difficulties were overcome?

In fact, terminals with extremely limited computing resources can accommodate neither the semantic understanding of large models nor the response speed of search engines.

So in the past, the only way most search engines and large-model apps could "go mobile" was to offload model computation to the cloud to compensate for insufficient on-device resources.

But that inevitably means the data must be processed in the cloud.

Looking at the technical details, there are three major difficulties:

First, compressing a multi-modal large model while preserving accuracy. This cannot be achieved by simply applying pruning or quantization to shrink the model a few times over. After all, with on-device computing power so limited, the deployable model can often be only a small fraction of the size of the original large model (see the compression sketch after this list).

Second, the power consumed by search grows as data grows. For an on-device search engine facing constantly updated photos and files, the index would have to be rewritten again and again, inevitably incurring substantial extra computation.

Third, device-cloud collaboration around model updates. Although the AI model is ultimately deployed on the device, model iteration, updating, and training still happen in the cloud before being distributed to devices, which requires manufacturers to master both device-side and cloud-side technology.
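
As background on the first difficulty, here is a minimal sketch of post-training dynamic quantization in PyTorch, one common compression technique: linear-layer weights are stored as 8-bit integers instead of 32-bit floats, roughly a 4x reduction for those layers. Huawei has not disclosed its actual compression pipeline, so the toy model below is purely illustrative of the technique class.

```python
import torch
import torch.nn as nn

# A stand-in "semantic encoder". The real model would be a multi-modal
# text/image network, but any module with Linear layers quantizes the same way.
model = nn.Sequential(
    nn.Linear(512, 1024),
    nn.ReLU(),
    nn.Linear(1024, 256),
)

# Post-training dynamic quantization: weights are repacked as int8 and
# activations are quantized on the fly at inference time. This alone yields
# roughly a 4x size reduction for the quantized layers; the "dozens of times"
# reduction the article describes would additionally need pruning,
# distillation, and architecture changes.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def param_bytes(m: nn.Module) -> int:
    """Rough in-memory size of a model's floating-point parameters."""
    return sum(p.numel() * p.element_size() for p in m.parameters())

print(f"fp32 model: {param_bytes(model)} bytes")
# The quantized model stores packed int8 weights internally, so comparing
# saved checkpoints (torch.save of each state_dict) is the honest measurement.
print(quantized[0])  # e.g. DynamicQuantizedLinear(in_features=512, ...)
```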

So for internal search, which is extremely sensitive to data privacy, deploying these two kinds of technology on the device side is very difficult. The earlier "compromise" was, at best, to put a small model such as an image classifier on the device to provide rudimentary smart image search.

So how does Huawei resolve these difficulties while preserving, as far as possible, the language understanding of large models and the response speed of a search engine?

Put simply, Huawei developed its own technology on both fronts: the AI model and the search engine.

On the model side, Huawei built a lightweight multi-modal semantic understanding model specifically for the device, shrinking the large model dozens of times over without losing accuracy.

First, a multi-modal semantic representation model converts inputs from different modalities into semantic vectors; a multi-modal semantic alignment algorithm then aligns the semantic information of text and images, and Huawei's large internal store of high-quality data improves recall.

Then, relying on lightweight deployment technology, high-precision retrieval runs on the device itself, keeping the data local for stronger privacy and security (a minimal sketch of this vector-matching idea follows).
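
The article does not publish Huawei's model, but the description matches the general pattern of dual-encoder text-image retrieval, as in open models like CLIP. Below is a minimal, hypothetical sketch of the retrieval mechanics: photos are embedded once into vectors stored on the device, and at query time the text is embedded into the same space and matched by cosine similarity. The hard-coded vectors stand in for real encoder outputs.

```python
import numpy as np

# Toy stand-ins for the on-device encoders. A real dual-encoder system
# (e.g. CLIP-style) learns networks so that matching text and images land
# close together in one shared vector space; here we hard-code tiny vectors
# just to show the retrieval step.
photo_vectors = {
    "IMG_0001 (sunrise on mountaintop)": np.array([0.9, 0.1, 0.0]),
    "IMG_0002 (puppy running)":          np.array([0.1, 0.9, 0.2]),
    "IMG_0003 (food in Inner Mongolia)": np.array([0.0, 0.2, 0.9]),
}
query_vectors = {
    "watching the sunrise": np.array([0.8, 0.2, 0.1]),
    "running puppy":        np.array([0.2, 0.9, 0.1]),
}

def normalize(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

def search(query: str, top_k: int = 2) -> list[str]:
    """Rank photos by cosine similarity to the query embedding."""
    q = normalize(query_vectors[query])
    scores = {
        pid: float(normalize(v) @ q) for pid, v in photo_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

print(search("running puppy"))
# ['IMG_0002 (puppy running)', ...] -- the closest vector wins
```

Because the photo embeddings are computed once and stored locally, query-time work is just one text encoding plus cheap vector comparisons, which is what makes on-device response speed plausible.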

On the engine side, Huawei managed to "stuff" a search engine into the phone using techniques such as index segmentation and periodic compaction and merging.

The core difficulty in bringing a search engine on-device is that the cloud's offline index-construction process cannot run there.

To solve this, Huawei first adopted index segmentation to shorten each flush to disk, and reclaimed the memory and disk occupied by deleted data through periodic compaction and merging, reducing the required storage space.

Then, by defining the index format so that information such as location and time becomes part of the index itself, search conditions can be filtered quickly and the results most relevant to the query returned. Compared with plain database retrieval, efficiency improves more than tenfold (the sketch below illustrates the segmented-index idea).
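
Huawei's index format is not public, but the segmentation-plus-merge pattern described here is the same one used by engines like Lucene. Below is a deliberately tiny, hypothetical Python sketch: new photos go into a small active segment that flushes quickly, deletions become tombstones until a periodic merge reclaims the space, and time/place fields ride along inside the index for fast pre-filtering.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    doc_id: str
    year: int          # metadata carried inside the index itself,
    place: str         # so filters run before any heavier matching
    tags: set[str]

@dataclass
class Segment:
    docs: dict[str, Doc] = field(default_factory=dict)
    deleted: set[str] = field(default_factory=set)

class SegmentedIndex:
    """Toy segmented index: small, fast flushes plus periodic merging."""

    def __init__(self, flush_size: int = 3):
        self.flush_size = flush_size
        self.segments: list[Segment] = [Segment()]

    def add(self, doc: Doc) -> None:
        # Writing into a small active segment keeps each disk flush short.
        if len(self.segments[-1].docs) >= self.flush_size:
            self.segments.append(Segment())
        self.segments[-1].docs[doc.doc_id] = doc

    def delete(self, doc_id: str) -> None:
        # Deletes are just tombstones until the next merge.
        for seg in self.segments:
            if doc_id in seg.docs:
                seg.deleted.add(doc_id)

    def merge(self) -> None:
        # Periodic compaction: drop tombstoned docs, reclaim space.
        merged = Segment()
        for seg in self.segments:
            for doc_id, doc in seg.docs.items():
                if doc_id not in seg.deleted:
                    merged.docs[doc_id] = doc
        self.segments = [merged]

    def search(self, tag: str, year: int | None = None,
               place: str | None = None) -> list[str]:
        hits = []
        for seg in self.segments:
            for doc_id, doc in seg.docs.items():
                if doc_id in seg.deleted:
                    continue
                # Metadata filters prune candidates before tag matching.
                if year is not None and doc.year != year:
                    continue
                if place is not None and doc.place != place:
                    continue
                if tag in doc.tags:
                    hits.append(doc_id)
        return hits

idx = SegmentedIndex()
idx.add(Doc("IMG_1", 2022, "Changbai Mountain", {"skiing"}))
idx.add(Doc("IMG_2", 2023, "Inner Mongolia", {"food"}))
idx.delete("IMG_2")
idx.merge()  # compaction reclaims the space held by IMG_2
print(idx.search("skiing", year=2022))  # ['IMG_1']
```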

The result: search results come back with almost no computation delay.

But why would Huawei pour so many technical resources into implementing a seemingly small "image search" feature on mobile devices?

Why do we need smart image search?

The immediate reason, of course, is that phone users, that is, you and me, genuinely need this feature.

Who hasn't had to turn into Sherlock Holmes and reason carefully just to track down a picture:

"When did I last see this picture?" "When did I save it?" "What else did I shoot that day?"...

Yet even after working through all these questions, you may still never find the picture.

Especially as people store ever more photos on their phones and the types grow ever more mixed, not only snapshots of daily life but also slides photographed in class, travel guides saved from the web, and more, all piled into the album, finding anything by hand keeps getting harder.

Mobile phone system manufacturers have long noticed this.

Features such as automatic album classification, tag-based retrieval, and OCR search over photo text have appeared on everyone's phones one after another.

But these capabilities are relatively inflexible and of limited practical use; much of the time they just sit on the phone gathering dust.

So making image search more intelligent is a genuine user-side need right now, and it directly drove Huawei to launch the smart image search feature.

As for the deeper reasons, both internal and external factors are at work.

External factors come from the industry: it is a general trend for search functions to embrace AI.

Preliminary signals from across the industry suggest that making search more intelligent and efficient both meets current user needs and pushes the industry forward.

But what this covers so far is content search on the internet. Daily life has another major search scenario, on-device search, that needs the same intelligent upgrade.

The more files, pictures, and audio users store locally on their phones and computers and in their personal accounts, and the more often they search that personal information, the more urgent this upgrade becomes.

For example, while Microsoft was overhauling Bing, it also launched Windows Copilot, replacing Cortana in one stroke. Both are positioned as AI assistants and both cover device-side search scenarios; the biggest difference is that Windows Copilot brings stronger AI capabilities and is more intelligent.

In short, inside and outside alike, it has become industry consensus that search will absorb more powerful AI and develop in a more intelligent, efficient, and convenient direction.

The deeper, internal factor comes from Huawei itself.

Smart image search was in fact launched as one part of Huawei's smart search strategy and blueprint.

Smart search, concretely, is a one-stop aggregation portal: swipe down on the home screen and you reach native apps and information content by the fastest route, with cross-device search supported across all scenarios.

It's positioned to do "my" searches.

Its search scope is all the information and functions the user has on the phone, such as picture files and apps; its goal is to intelligently recognize the user's needs, making operations in the "my" domain faster and more convenient.

The smart search strategy is to deliver "native search + ecosystem search + full-scenario search."

When these three are connected, all "my" searches can be covered.

First, native search: local app search, image search, file search (including cloud files), settings search, memo search, and so on.

For example, in the latest upgrade, the smart search pull-down can find cloud drive files in Huawei Cloud Space. Just enter a file-name keyword to start searching; the scope covers local files saved to the cloud drive, files saved from WeChat/QQ, and more.

The smart image search mentioned above also falls into this category.

It can also search memos intelligently, for scraps of information such as shopping lists, bills and passwords, and friends' birthdays. If such content is not categorized when jotted down, digging it up later is a chore; smart search now spares people that step.

Second, ecosystem content search, covering search services and web content, travel, local life, music and video, shopping, and more.

In shopping especially, it can aggregate quality products from across the web and provide shopping services centered on "me."

The third is full-scenario search, that is, cross-device search.

HarmonyOS breaks down the barriers between mobile phones, computers, tablets and other devices, forming a "super terminal".

Signed in to the same account, users can click the search icon in the control center of a Huawei PC's taskbar, or press Ctrl+Alt+Q, to quickly retrieve files on phones and tablets, including documents, apps, pictures, and videos, with quick preview available for different file types.

Backed by integrated "hardware, software, chip, and cloud" technology and AI models preloaded on the device, cross-device search is kept free of any perceptible delay.

In short, the basic user level, the industry level, and Huawei itself are all pushing the operating system to further upgrade the device-side search experience.

From this, it is not difficult to understand why Huawei launched the smart image search function.

Especially now that mobile operating systems, after more than a decade of development, are fairly complete in features, content, and ecosystem, the next round of upgrades and iterations inevitably moves toward finer details.

These small upgrades and changes work quietly in the background, and after living with them for a while users often can't help but sigh: this is actually really good.

Viewed from a more macro perspective, these subtle functional upgrades can also push the human-computer interaction experience to a new level.

Huawei's moves show that it has chosen device-side search as one entry point, spreading change from a single point to the whole surface.

The arrival of smart image search reads more like a prologue, behind which lies Huawei's expansive vision for smart search, the phone OS, and even human-computer interaction itself.

On-device AI upgrades, starting with search

It's not just Huawei.

On the one hand, from the perspective of where AI technology lands, local search, and the specific function of "image search" in particular, may be one of the most easily overlooked yet most important ways to bring AI to mobile devices.

The latest wave of AI is rapidly changing the way search engines interact.

As mentioned at the beginning, Google and Baidu have both joined this round of search-engine innovation and changed how cloud-side search works. The core is giving search engines natural language understanding so they can better recognize and interpret user intent.

But this does not mean that only cloud search engines will be iterated.

Searching one's own "internal data" on the device in natural language, just like asking questions of the cloud in natural language, has long been a latent user need. As computing hardware iterates and algorithms improve, using AI on mobile devices to improve the user experience will inevitably become a new trend.

On the other hand, from the human-computer interaction perspective, this kind of internal search will not stay confined to a single device: it must interoperate across terminals, forming an ecosystem with the person as its core unit and ultimately delivering global intelligent retrieval.

Today, our conception of mobile computing platforms has gradually extended from PCs and phones to new terminals such as VR, AR, and smart cars.

On these new mobile computing platforms, interaction is no longer confined to a screen; it shifts toward natural language and gestures.

Ultimately, under the premise of the "Internet of Everything," information will interoperate across all terminals.

In short, whether judged by AI applications or by human-computer interaction trends, search is one of the indispensable experience-improving functions on mobile devices.

Whatever the technology trends, Huawei is already positioned to improve the user experience from the device side.

- End -
