Google Gemini beats o1-preview at math! At only 1/10 of the cost and with no extra thinking time, the old paradigm is not dead yet

Latest update time:2024-09-25
Xiaojiao, reporting from Aofei Temple
Quantum Bit | Public Account QbitAI

It beats o1-preview at math, at one-tenth the cost and with virtually no thinking delay!

On the same day that OpenAI's "Her" was fully released, Google Gemini 1.5 underwent a major upgrade.

In addition, the price has been cut by more than half, rate limits are 2-3 times higher, output is 2 times faster, and latency is reduced to one third of the original.

Developers can access it for free through Google AI Studio and Gemini API. The chat version will have to wait.
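For readers who want to try the API route, here is a minimal sketch of how a request to the new model might be assembled using only the Python standard library. The endpoint path, the `x-goog-api-key` header, and the request body shape reflect our understanding of the public Gemini REST API and should be checked against the official documentation; `build_request` is a hypothetical helper, and the sketch only builds the request without sending it.

```python
import json
import urllib.request

# Assumed base URL of the public Gemini REST API (verify against the docs).
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for the given model."""
    url = f"{API_ROOT}/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,  # key issued in Google AI Studio
        },
        method="POST",
    )


# "YOUR_API_KEY" is a placeholder, not a real credential.
req = build_request("gemini-1.5-pro-002", "What is 12 * 12?", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` with a valid key would send the request.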

However, some netizens also spotted a catch: although its math ability is very strong, it still fails to beat o1-mini and the full version of o1 (94.8).


Google Gemini 1.5 major upgrade

Two models were updated this time: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002.

In summary, the main updates are:

  • For 1.5 Pro (input and output both under 128K), prices are reduced by more than 50%;

  • Rate limits are increased by 2-3 times;

  • Output speed is doubled and latency is cut to one third;

  • Default filter settings are updated.

First, overall performance has improved, especially in math, long-context, and multimodal tasks.

Performance on MMLU-Pro improves by about 7%. On the MATH and HiddenMath (an internal held-out set of competition math problems) benchmarks, both models improve significantly, by about 20%, and the Pro version scores 86.5%, surpassing o1-preview (85.5%).

In addition, there is a 2%-7% improvement in the evaluation of visual understanding and code generation.

Based on developer feedback, both models now respond in a more concise style, with the goal of making them easier to use and cheaper to run.

For use cases such as summarization, question answering, and extraction, the default output length of the updated models is 5-20% shorter than the previous models.

In terms of price, for 1.5 Pro the cost of input tokens is cut by 64%, output tokens by 52%, and incremental cached tokens by 64%, effective October 1st.
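To see what these cuts mean for a bill, the percentages can be applied to any previous spend. Below is a minimal sketch; the article does not quote list prices, so the 100-unit figures are placeholders in arbitrary units, and `reduced_cost` is a hypothetical helper.

```python
# Percentage price cuts reported in the article for Gemini 1.5 Pro.
PRICE_CUTS_PCT = {"input": 64, "output": 52, "cached": 64}


def reduced_cost(old_cost: float, kind: str) -> float:
    """Return the cost after applying the reported cut for this token kind."""
    return old_cost * (100 - PRICE_CUTS_PCT[kind]) / 100


# Placeholder workload: 100 units previously spent on input, 100 on output.
new_total = reduced_cost(100, "input") + reduced_cost(100, "output")
print(new_total)  # prints 84.0
```

Under these placeholder figures, a workload that used to cost 200 units would cost 84, an overall reduction of 58%.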

Rate limits have also been raised. The paid-tier rate limit of 1.5 Flash goes from 1,000 RPM to 2,000 RPM, and that of 1.5 Pro goes from 360 RPM to 1,000 RPM.
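The "2-3 times" figure can be checked directly from these numbers. A quick sketch of the arithmetic, where `increase_factor` is a hypothetical helper name:

```python
# Paid-tier rate limits from the article, in requests per minute (RPM),
# stored as (old limit, new limit).
RATE_LIMITS_RPM = {
    "1.5 Flash": (1000, 2000),
    "1.5 Pro": (360, 1000),
}


def increase_factor(old_rpm: int, new_rpm: int) -> float:
    """Return how many times larger the new limit is than the old one."""
    return new_rpm / old_rpm


for model, (old, new) in RATE_LIMITS_RPM.items():
    print(f"{model}: {increase_factor(old, new):.2f}x")
# prints:
# 1.5 Flash: 2.00x
# 1.5 Pro: 2.78x
```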

In addition, output speed is doubled and latency is reduced to one third of the original.

For the new models, filters are now optional and are no longer applied by default.

Finally, there is the Gemini 1.5 Flash-8B experimental version update, which has significant improvements in text and multimodal capabilities.

Netizens tested it out

Some netizens have already put it to the test.

One user tested the audio transcription feature of Gemini 1.5 Flash, which transcribed 13 minutes of audio in 50-60 seconds.
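Put another way, that is well above real-time speed. A small sketch of the calculation, using only the figures reported by the tester (`realtime_factor` is a hypothetical helper name):

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of audio are handled per second of processing."""
    return audio_seconds / processing_seconds


audio = 13 * 60  # 13 minutes of audio, in seconds
slowest = realtime_factor(audio, 60)  # processing took 60 seconds
fastest = realtime_factor(audio, 50)  # processing took 50 seconds
print(f"{slowest:.1f}x to {fastest:.1f}x real-time speed")
# prints: 13.0x to 15.6x real-time speed
```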