Latest MLCommons results announced, Intel demonstrates strong AI inference performance
Recently, MLCommons announced the results of its MLPerf Inference v3.1 performance benchmark, which covers GPT-J, a 6-billion-parameter large language model, as well as computer vision and natural language processing models. Intel submitted results for the Habana® Gaudi®2 accelerator, 4th Gen Intel® Xeon® Scalable processors, and the Intel® Xeon® CPU Max Series.
The results demonstrate Intel's highly competitive AI inference performance and further reinforce its commitment to accelerating the deployment of AI at scale across workloads, from the cloud to the network to the edge to the client.
As the latest MLCommons results show, we have a strong and competitive AI product portfolio that meets customer needs for high-performance, high-efficiency deep learning inference and training. In addition, for AI models of every size, Intel's product portfolio offers leading cost-effectiveness.
--Sandra Rivera
Intel Executive Vice President and General Manager of the Data Center and AI Group
As confirmed by the MLCommons AI training results [1] and the Hugging Face performance benchmark [2] disclosed in June, Gaudi2 delivers excellent performance on advanced vision-language models, and today's results further demonstrate that Intel can provide excellent solutions for AI compute needs.
Taking customers' individual needs into account, Intel is making AI ubiquitous with products that address both inference and training across AI workloads. Intel's AI products give customers flexible options that can be matched to their respective performance, efficiency, and cost targets to arrive at the best AI solution, while also benefiting from an open ecosystem.
About the Habana Gaudi2 test results:
Habana Gaudi2's inference results on the GPT-J model provide strong validation of its competitive performance.
● On GPT-J-99 and GPT-J-99.9, Gaudi2 delivered inference performance of 78.58 queries per second in the server scenario and 84.08 samples per second in the offline scenario.
● The Gaudi2 submission used the FP8 data type and achieved 99.9% accuracy on this new data type.
With Gaudi2 software updates released every six to eight weeks, Intel will continue to demonstrate performance improvements on the MLPerf benchmark, along with expanded model coverage.
Habana Gaudi2 inference results on the GPT-J model: proven competitive performance
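For readers who want a sense of what this workload looks like in practice, below is a minimal sketch of GPT-J text generation on a Gaudi2 HPU through the standard PyTorch bridge. It assumes a SynapseAI environment with the habana_frameworks package installed, uses the public EleutherAI/gpt-j-6b checkpoint as a stand-in for the MLPerf reference model, and runs in bf16 rather than the FP8 path used in the actual submission.

import torch
import habana_frameworks.torch.core as htcore  # registers the "hpu" device and lazy-mode runtime
from transformers import AutoModelForCausalLM, AutoTokenizer

device = torch.device("hpu")

# Public GPT-J 6B checkpoint, loaded in bf16 and moved to the Gaudi2 accelerator.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6b")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6b", torch_dtype=torch.bfloat16
).to(device)
model.eval()

prompt = "Summarize: MLCommons published the MLPerf Inference v3.1 results ..."
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)
    htcore.mark_step()  # flush the lazily accumulated graph to the device

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))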
About the 4th Gen Xeon Scalable processor test results:
Intel submitted results for seven inference benchmarks based on 4th Gen Intel Xeon Scalable processors, including the GPT-J model. The results show that 4th Gen Xeon processors deliver strong performance on general-purpose AI workloads, including vision, language processing, speech, and audio translation models, as well as the larger DLRM v2 deep learning recommendation model and GPT-J. In addition, to date, Intel remains the only vendor to submit public CPU results using industry-standard deep learning ecosystem software.
● 4th Gen Intel Xeon Scalable processors are ideal for building and deploying general-purpose AI workloads with popular AI frameworks and libraries. For the GPT-J task of producing a roughly 100-word summary of a news article of approximately 1,000 to 1,500 words, a 4th Gen Xeon Scalable processor summarized two paragraphs per second in offline mode and one paragraph per second in real-time server mode.
● Intel has submitted MLPerf results for the first time for the Intel Xeon CPU Max series, which offers up to 64GB of high-bandwidth memory. For GPT-J, it is the only CPU that can achieve 99.9% accuracy, which is crucial for applications that require extremely high accuracy.
● Intel worked with OEMs to submit test results, further demonstrating the scalability of its AI performance and the availability of general-purpose servers based on Intel Xeon processors, fully meeting customer service level agreements (SLAs).
4th Gen Xeon Scalable processors: ideal for building and deploying general-purpose AI workloads
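As an illustration of the industry-standard software stack mentioned above, here is a minimal sketch of GPT-J summarization on a 4th Gen Xeon processor using Hugging Face Transformers with Intel Extension for PyTorch in bfloat16. The model ID, prompt, and generation settings are illustrative assumptions; the actual MLPerf submissions rely on a tuned benchmark harness rather than this simple script.

import torch
import intel_extension_for_pytorch as ipex  # CPU optimizations, including bf16/AMX paths
from transformers import AutoModelForCausalLM, AutoTokenizer

# Public GPT-J 6B checkpoint as a stand-in for the MLPerf reference model.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6b")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6b", torch_dtype=torch.bfloat16
)
model.eval()

# Apply Intel Extension for PyTorch optimizations; on 4th Gen Xeon, bf16
# matrix math runs on the built-in AMX accelerators.
model = ipex.optimize(model, dtype=torch.bfloat16)

article = "..."  # a news article of roughly 1,000 to 1,500 words
prompt = f"Summarize the following article in about 100 words:\n{article}\nSummary:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    output_ids = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))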
MLPerf is the industry's most prestigious AI performance benchmark, designed to enable fair and repeatable comparisons of product performance. Intel plans to submit new AI training performance results for the next round of MLPerf. These ongoing performance updates demonstrate Intel's commitment to supporting customers and advancing AI technology at every step, whether with cost-effective AI processors or high-performance AI hardware accelerators and GPUs for network, cloud, and enterprise users.
Notes:
[1] https://www.intel.com/content/www/us/en/newsroom/news/new-mlcommons-results-ai-gains-intel.html#gs.51njha
[2] https://huggingface.co/blog/bridgetower
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its affiliates. Other names and brands mentioned in this article are the property of their respective owners.