As large-model makers race ahead, Google joins the scramble: the Gemini chatbot switches to a new model and can verify its output with one click
Cressy from Aofei Temple
QuantumBit | Official Account QbitAI
While large-model developers such as Meta and OpenAI have been pushing out releases at a rapid pace, Google has also announced a major update:
Starting today, the Gemini chatbot is powered by Gemini 1.5 Flash.
Compared with the previous version, the context window is four times longer and responses are faster.
According to Google, the 1.5 Flash model behind the new chatbot is designed to be lightweight and fast.
Response quality has also improved, and the context window has grown from 8k (with 1.0 Pro) to 32k.
In addition, the new version of the chatbot adds a "fact-checking" function that checks generated content against search results with one click, reducing the harm done by model hallucinations.
Some netizens remarked that Google had a really strong day: first it released two Alpha models (which won an IMO silver medal), and then Gemini got an update too.
Some people have even started making a wish, hoping that AI functions can be added to Google Scholar academic searches.
Longer context windows and faster speeds
The most important part of this update is switching the model behind the free version from 1.0 Pro to 1.5 Flash.
Gemini 1.5 Flash was first unveiled at the Google I/O Developer Conference in May.
Through "distillation" (Google says it was trained by the larger 1.5 Pro), Gemini 1.5 Flash achieves higher generation quality at a lighter size.
Moreover, its small size makes the model faster and more efficient, and it also supports multimodal reasoning.
Google said that after the model switch the chatbot will be faster, and the context window grows from the old version's 8k to 32k.
However, 1.5 Flash itself supports a context window of 1 million tokens, so 32k is a substantial cut; then again, this is the free version.
In addition to the model upgrade, another important update is the fact-checking function.
In the latest Gemini chatbot, this function can be used to check the output content with one click.
The system searches Google for the content of the output, compares the two, and marks the matches and discrepancies.
Some netizens commented that when they saw OpenAI launch GPT-4o mini, they felt that it was only a matter of time before Google launched a new product.
Indeed, not only OpenAI and Google, but also Meta, Mistral and other manufacturers that are working on large models have been active recently.
As for model performance, the same netizen said he had tried both 1.0 Pro and 1.5 Flash and found their quality almost the same, with 1.5 Flash being faster.
Therefore, this wave of operations by Google is, to a certain extent, in line with the recent trend of "model lightweighting".
So, how does the Gemini chatbot perform after switching to 1.5 Flash?
Check model output with one click
QuantumBit conducted a simple test on the new version of the chatbot.
First, let’s take a look at the updated fact-checking function. The first step is to ask a question like a normal conversation, and Gemini will answer it normally.
You can see a Google logo below the answer, which is the fact-checking button.
After you click it, the system automatically searches Google and compares the results with the output content.
After the comparison is completed, the content that can be searched and matches the source will be highlighted in green. If there is a discrepancy with the search results, it will be marked with a light red background.
Click the marked location to see the content link Gemini uses for comparison.
Note that such an annotation does not mean the output is wrong. For example, in the comparison source cited here, Tom Cruise's mother is given as Mary Lee South.
This part of the answer is marked by the system because of the text mismatch, but in fact both answers are correct.
Since this fact-checking relies on Internet searches, the quality of the comparison materials varies and may not be 100% accurate.
For example, when asked about the "classic story" of "Lin Daiyu uprooting a weeping willow tree", Gemini clearly gave the correct answer, yet it was marked in red.
Looking at the comparison source it cited, it is hard to keep a straight face.
So this function mainly offers a more convenient way to verify output; whether to accept the result still depends on cross-checking and the user's own judgment.
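As a rough illustration of why a wording mismatch can be flagged even when the answer is fine, here is a minimal Python sketch of sentence-level marking by word overlap. It is not Google's implementation, which the article does not describe beyond "search, compare, mark"; the function names, the thresholds and the sample strings are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    sentence: str
    label: str  # "match" (green), "discrepancy" (light red), or "unverified"


def normalize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens, dropping punctuation."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def mark_sentences(sentences, snippets, match_at=0.9, related_at=0.4):
    """Label each sentence by its best word overlap with any source snippet.

    The thresholds are arbitrary illustrative values, not known parameters
    of any real fact-checking system.
    """
    marks = []
    for sentence in sentences:
        words = normalize(sentence)
        best = 0.0
        for snippet in snippets:
            overlap = len(words & normalize(snippet)) / max(len(words), 1)
            best = max(best, overlap)
        if best >= match_at:
            label = "match"
        elif best >= related_at:
            label = "discrepancy"  # related source found, but the wording differs
        else:
            label = "unverified"   # nothing comparable was found
        marks.append(Mark(sentence, label))
    return marks


# Hypothetical model output and hypothetical search snippets.
answer = [
    "Tom Cruise was born in 1962.",
    "His mother is Mary Lee Pfeiffer.",
    "The weather was nice that day.",
]
sources = [
    "Tom Cruise was born in 1962 in Syracuse, New York.",
    "His mother is Mary Lee South.",
]

for mark in mark_sentences(answer, sources):
    print(f"{mark.label:12} {mark.sentence}")
```

In this toy version the second sentence is flagged only because one word differs from the source, which is the kind of text-level mismatch described above; a flag of this sort says the wording differs, not that the statement is false.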
In addition, to test the model itself, we tried several popular questions that have repeatedly tripped up large models.
For example, when comparing two numbers, Gemini even converted them into amounts of money, but after all that work the final result was... wrong.
If this is only the second funniest answer since this question was discovered, no model would dare claim first place.
In another attempt, it gave a wrong answer at the start and then corrected it during the subsequent analysis.
But if you ask the question in English, there is still a chance it answers correctly.
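For reference, the comparison itself is easy to check in code. The sketch below is in Python and shows two ways of reading the same pair of strings, as decimal numbers and as dotted version numbers, which give different winners. The article does not state which numbers were used in the test; the pair 9.11 and 9.9 is an assumption chosen for illustration, and the function names are hypothetical.

```python
from decimal import Decimal


def larger_as_decimal(a: str, b: str) -> str:
    """Return whichever string is larger when both are read as decimal numbers."""
    x, y = Decimal(a), Decimal(b)
    if x == y:
        return "equal"
    return a if x > y else b


def larger_as_version(a: str, b: str) -> str:
    """Return whichever string is larger when read as dotted version numbers."""
    pa = tuple(int(part) for part in a.split("."))
    pb = tuple(int(part) for part in b.split("."))
    if pa == pb:
        return "equal"
    return a if pa > pb else b


# Assumed example pair; not taken from the article.
print(larger_as_decimal("9.11", "9.9"))  # numeric reading
print(larger_as_version("9.11", "9.9"))  # version-style reading
```

Under the numeric reading 9.9 is larger, while under the version-style reading 9.11 is; this only shows that the two readings diverge and makes no claim about why a given model answers incorrectly.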
There is also the counting question. Gemini's answer somehow managed to count letters in the Chinese text... which was confusing and entirely unexpected.
Finally, regarding the speed improvement mentioned in this update: in our test, Gemini 1.5 Flash took less time to output its first word than Claude 3 Haiku, while the difference in subsequent output speed was not obvious to the naked eye.
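The comparison above was made by eye. For readers who want numbers, here is a minimal Python sketch of timing a streamed response: the delay before the first chunk and the total time. It does not call any real chatbot API; `fake_stream` is a hypothetical stand-in for a streaming response, and in practice you would pass in whatever iterator of text chunks your client library returns.

```python
import time
from typing import Iterable, Iterator


def measure_stream(chunks: Iterable[str]) -> tuple[float, float, int]:
    """Return (seconds until first chunk, total seconds, number of chunks)."""
    start = time.perf_counter()
    first_at = None
    count = 0
    for _ in chunks:
        if first_at is None:
            first_at = time.perf_counter()  # the first chunk has just arrived
        count += 1
    end = time.perf_counter()
    if first_at is None:
        raise ValueError("stream produced no chunks")
    return first_at - start, end - start, count


def fake_stream(first_delay: float, gap: float, n: int) -> Iterator[str]:
    """Hypothetical stand-in for a model: waits, then yields n chunks."""
    time.sleep(first_delay)
    for i in range(n):
        if i:
            time.sleep(gap)
        yield f"chunk{i}"


ttft, total, count = measure_stream(fake_stream(0.05, 0.01, 5))
print(f"first chunk after {ttft:.3f}s, {count} chunks in {total:.3f}s")
```

Separating the first-chunk delay from the total time matches the two observations in the test: one model can start answering sooner even when the overall output speed looks about the same.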
That is how Gemini 1.5 Flash performs in the chatbot. Interested readers can try it themselves.
Reference links:
[1] https://blog.google/products/gemini/google-gemini-new-features-july-2024/
[2] https://x.com/GeminiApp/status/1816512086232731696