The most powerful open-source programming model changes hands overnight: proficient in 80+ languages at only 22B parameters
Cressy from Aofei Temple
Quantum Bit | Public Account QbitAI
The throne of the open-source code model has changed hands again!
Mistral, often dubbed the "European OpenAI", has beaten the 70B-parameter Code Llama with a model of only 22B parameters.
The model is named Codestral, a blend of "Code" and the company name Mistral.
Trained on more than 80 programming languages, Codestral delivers higher performance with fewer parameters, and its 32k context window is a big jump from the 4k and 8k windows of earlier models.
It has also been reported that a code-editing task on which both GPT-4o and Claude 3 Opus failed was solved successfully by Codestral.
Some netizens therefore bluntly declared that Codestral's launch rewrites the rules of the game for multi-language code models.
Others @ed Ollama, the well-known framework for running large models locally, asking it to support Codestral. Ollama responded quickly, adding support within an hour of the request.
So, what results did Codestral achieve in the test?
The new king of open source programming models
Codestral has 22B parameters and supports a 32k context window.
During the development process, the researchers trained Codestral using code data from more than 80 programming languages.
These include popular languages such as Python, Java, C++ and Bash, as well as older languages such as Fortran and COBOL.
It is worth mentioning that COBOL dates back to 1959, yet 43% of the world's banking systems reportedly still rely on it, while only a shrinking pool of mostly aging programmers can still write it.
AI tooling that supports COBOL could become one way to ease the severe shortage of COBOL talent.
Back to Codestral: although it has less than one-third the parameters, its benchmark results comfortably exceed those of the 70B Code Llama.
For Python, the team used HumanEval (pass@1) and MBPP to evaluate Codestral's code generation, CruxEval to evaluate output prediction, and RepoBench to evaluate repository-level code completion.
Codestral achieved the best results across these tests, surpassing Llama 3 and Code Llama in every respect.
On databases, Codestral's performance in the Spider SQL benchmark is also very close to that of the general-purpose Llama 3.
Across other programming languages, Codestral and the general-purpose Llama 3 trade wins, with Codestral's average score slightly higher than Llama 3's; its lead over Code Llama, however, is unmistakable.
In addition, Codestral supports FIM (fill-in-the-middle), meaning it can complete a gap in existing code given the text before and after it.
Across Python, JavaScript, and Java, Codestral scored close to or above 90% on HumanEvalFIM, averaging 91.6% and surpassing the larger DeepSeek Coder 33B.
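As a rough illustration of what a FIM request looks like, the sketch below assembles a Codestral-style fill-in-the-middle prompt by hand. The `[SUFFIX]`/`[PREFIX]` control-token layout and their ordering are assumptions based on Mistral's published FIM format; in real use, the official `mistral-common` tokenizer should build these prompts for you.

```python
# Minimal sketch of a Codestral-style FIM prompt (layout is an assumption;
# prefer the official mistral-common tokenizer in production code).

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Place the suffix before the prefix; the model is then expected to
    generate the missing middle that follows the prefix."""
    return f"[SUFFIX]{suffix}[PREFIX]{prefix}"

# The code before and after the gap we want the model to fill:
prefix = "def is_even(n):\n    return "
suffix = "\n\nprint(is_even(4))\n"

prompt = build_fim_prompt(prefix, suffix)
print(prompt)
```

The point of the ordering is that the model sees the full surrounding context (suffix first, then prefix) and continues generation right where the gap begins.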
As for speed, the online chat version built an HTML page frame with a top banner and sidebar in about three seconds.
Codestral not only performs well but also supports a variety of ways to use it.
Mistral has uploaded the model weights to Hugging Face; if you have the hardware, you can download and deploy it yourself.
Codestral is also supported by frameworks such as LangChain, LlamaIndex, and the aforementioned Ollama, as well as Mistral's own developer platform, La Plateforme.
A dedicated API is also on the way; it is currently in an eight-week beta, during which developers can use it for free.
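For a sense of what calling that API might look like, here is a hedged sketch that only builds the request (no network call is made). The endpoint URL and the `codestral-latest` model name are assumptions drawn from Mistral's announcement; check the official API docs before relying on them.

```python
import json

# Assumed beta endpoint from Mistral's announcement -- verify against the docs.
API_URL = "https://codestral.mistral.ai/v1/chat/completions"

def build_request(prompt: str, api_key: str) -> tuple[dict, dict]:
    """Return (headers, payload) for an OpenAI-style chat completion call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": "codestral-latest",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,           # low temperature suits code generation
    }
    return headers, payload

headers, payload = build_request(
    "Write a Python function that reverses a string.", "YOUR_API_KEY"
)
print(json.dumps(payload, indent=2))

# Actually sending it would look like:
#   import requests
#   resp = requests.post(API_URL, headers=headers, json=payload, timeout=60)
```

Separating payload construction from the HTTP call makes the request easy to inspect and test before spending API quota.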
If deploying still sounds daunting, you can head to Mistral's online chat platform Le Chat and talk to the model directly in the browser.
Of course, what developers may care about more is whether it can be used inside an IDE.
Mistral has not yet shipped native IDE support, but third-party plug-ins such as Continue.dev and Tabnine already support Codestral, making it usable in VS Code and the JetBrains family of IDEs.
One More Thing
Alongside Codestral, Mistral also announced a new "non-production" license, abbreviated MNPL.
Codestral itself is released under the MNPL: under its terms, the model may be used for research purposes only, not commercially.
Moreover, the license draws that line very strictly: even purely internal use within a company is not allowed.
Some open-source authors grumbled about this: "They never asked my opinion when they used my code, so why should I abide by their rules in return? This is ridiculous."
Mistral's explanation is that if commercial use were allowed, it might not be able to capture users' contributions back into the model's development.
Officials also stated that although Codestral cannot be used commercially, this does not mean future open models will follow suit; they made clear that models under the Apache 2.0 license will continue to be released.
Reference links:
[1] https://mistral.ai/news/codestral/
[2] https://x.com/GuillaumeLample/status/1795820710750744839
[3] https://www.theverge.com/2024/5/29/24166334/mistral-debuts-a-coding-assistant-called-codestral
-over-