Nvidia's stock price has been cut in half, but these are the products it is counting on to return to its peak
The past two months have been a nightmare for Nvidia investors.
On October 1, the stock closed at $289.36, an all-time high. It has been falling ever since. After the Q3 earnings report on November 15, the shares plunged 19%, erasing more than $20 billion of market value in a single day. As of today (November 21), the stock stands at $149.08, down 48.5% from its peak; Nvidia's market capitalization has been cut in half, back to where it was in July 2017.
Nvidia's stock price trend over the past year (source: Yahoo Finance)
Tianfeng Securities' overseas research team noted that Nvidia's Q3 revenue, profit and guidance all fell short of expectations. Gaming revenue grew only 13% year-on-year, largely because the collapse of the crypto-mining market left a backlog of channel inventory, and growth in the core business may be weakening. Data center revenue rose 58% year-on-year to US$790 million, a slowing growth rate that also came in below expectations. Automotive revenue was US$172 million, up 19% year-on-year, better than expected but still small in absolute terms. Professional visualization grew 28% year-on-year to US$305 million, while the OEM & IP business fell 23% year-on-year to US$148 million on the back of falling cryptocurrency prices.
In a recent note to clients, Goldman Sachs said it had removed Nvidia from its "conviction buy" list, conceding that it had clearly made a "wrong" call on the stock after badly underestimating the build-up of channel inventory and the correction in the gaming business.
Responding to the situation on the earnings call, Nvidia CEO Jensen Huang said: "The 'hangover effect' of the cryptocurrency craze has lasted longer than we expected. We are surprised by this, but it will eventually pass." Jay Puri, the company's executive vice president of worldwide field operations, likewise stressed at yesterday's GTC China 2018 that the stock's swings stem partly from the cryptocurrency problems just discussed and partly from the broader economic environment.
He put it bluntly: "Frankly, the stock price is not the goal we strive for at NVIDIA. What we believe is important is to keep pushing the whole computing industry forward, to keep innovating in gaming, high-performance computing, artificial intelligence, autonomous driving, intelligent robotics and beyond. If we do those things well, the stock price will not be a problem."
Judging from Jensen Huang's presentation at GTC China 2018, NVIDIA has already laid the groundwork, on both the hardware and the software side, for its next wave of growth.
Hard Power 1: Turing, an Architecture Ten Years in the Making
When Jensen Huang founded NVIDIA in 1993 with two engineers from Sun Microsystems, Chris Malachowsky and Curtis Priem, the idea was to build a dedicated chip that would accelerate the rendering of 3D images in video games and make them more realistic. Since launching the GeForce 256 in 1999 and coining the term GPU, NVIDIA has kept the GPU at the core of its business, and the GPU in turn has transformed computer graphics.
"After ten years of hard work, our newly launched Turing architecture is leading a new round of breakthroughs in computer graphics technology," Huang Renxun said at GTU China. According to his previous statement, this is their biggest leap forward since the invention of CUDA GPU in 2006.
Jensen Huang at GTC China 2018
Huang explained that, unlike Pascal GPUs, which contain only one basic kind of processor (the programmable shader), Turing combines three: programmable shaders, RT Cores and Tensor Cores. The RT Core, dedicated to accelerating ray tracing, has drawn the widest discussion in the industry.
According to Huang, the RT Core simulates how light behaves on real objects: rays bounce around the scene, illuminate surfaces and shift color tones before the final image is presented. Casting billions of rays per second makes graphics lifelike and makes real-time ray tracing possible. Industry observers agree that the RT Core greatly improves ray-tracing efficiency on Turing: a workload that once needed a DGX Station costing tens of thousands of dollars to run in real time can now run on a Turing GPU costing a few hundred dollars, and the new architecture may deliver even higher performance.
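To make "ray tracing" concrete, the sketch below shows the kind of intersection test a ray tracer must evaluate billions of times per second; here a ray is tested against a sphere by solving a quadratic. This is only an illustration of the arithmetic involved, not NVIDIA's implementation: RT Cores accelerate the analogous ray-triangle tests and BVH traversal in fixed-function hardware.

```cpp
// Minimal sketch of the core arithmetic of ray tracing: does a ray hit a sphere?
#include <cstdio>
#include <cmath>

struct Vec3 { float x, y, z; };

__host__ __device__ float dot(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Returns the distance t along the ray to the nearest hit, or -1 on a miss.
// Solves |o + t*d - c|^2 = r^2, a quadratic in t (d assumed normalized).
__host__ __device__ float raySphere(Vec3 o, Vec3 d, Vec3 c, float r) {
    Vec3 oc = { o.x - c.x, o.y - c.y, o.z - c.z };
    float b = dot(oc, d);
    float q = dot(oc, oc) - r * r;
    float disc = b * b - q;
    if (disc < 0.0f) return -1.0f;      // ray misses the sphere entirely
    float t = -b - sqrtf(disc);         // nearest of the two intersections
    return (t > 0.0f) ? t : -1.0f;
}

__global__ void traceKernel(float* t_out) {
    // One ray per thread, all aimed down -z toward a sphere 5 units away.
    Vec3 origin = { (float)threadIdx.x * 0.1f, 0.0f, 0.0f };
    Vec3 dir    = { 0.0f, 0.0f, -1.0f };
    Vec3 center = { 0.0f, 0.0f, -5.0f };
    t_out[threadIdx.x] = raySphere(origin, dir, center, 1.0f);
}

int main() {
    float* d_t; cudaMalloc(&d_t, 32 * sizeof(float));
    traceKernel<<<1, 32>>>(d_t);
    float h_t[32]; cudaMemcpy(h_t, d_t, sizeof(h_t), cudaMemcpyDeviceToHost);
    printf("ray 0 hit distance: %f\n", h_t[0]);   // ~4.0 for the center ray
    cudaFree(d_t);
    return 0;
}
```

A real renderer runs a test like this against millions of triangles per ray via an acceleration structure, which is precisely the traversal work the RT Core moves into hardware.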
"The introduction of Tensor Core allows deep learning, neural networks, and artificial intelligence to run on the GPU at an unparalleled speed," Huang Renxun said at the meeting.
In fact, this core first appeared in Volta, as a dedicated unit NVIDIA designed for deep-learning workloads; Turing is simply the first architecture to bring Tensor Cores to consumer GeForce graphics cards. Structurally, a Tensor Core is a matrix multiply-accumulate unit: in a single clock cycle it multiplies two 4×4 matrices and adds a third 4×4 matrix to the product, i.e. 64 multiplications and 64 additions per cycle. For graphics rendering this matters in two ways: it greatly speeds up deep-learning-based image processing on the GPU, and it frees compute budget for processing images with neural-network algorithms.
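For readers who program GPUs, the sketch below shows how Tensor Cores are exposed through CUDA's WMMA API (mma.h). The API works on 16×16×16 tiles; the hardware decomposes each tile into the 4×4 multiply-accumulate steps described above. A minimal sketch, assuming a Volta- or Turing-class GPU and nvcc with -arch=sm_70 or newer:

```cpp
// One warp cooperatively computes C = A * B for a single 16x16 tile
// on the Tensor Cores. Compile with: nvcc -arch=sm_70 wmma_tile.cu
#include <cstdio>
#include <cuda_fp16.h>
#include <mma.h>
using namespace nvcuda;

__global__ void wmmaTile(const half* A, const half* B, float* C) {
    wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> a;
    wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::row_major> b;
    wmma::fragment<wmma::accumulator, 16, 16, 16, float> acc;

    wmma::fill_fragment(acc, 0.0f);     // start the accumulator at zero
    wmma::load_matrix_sync(a, A, 16);   // leading dimension = 16
    wmma::load_matrix_sync(b, B, 16);
    wmma::mma_sync(acc, a, b, acc);     // acc = A*B + acc, on Tensor Cores
    wmma::store_matrix_sync(C, acc, 16, wmma::mem_row_major);
}

int main() {
    half hA[256], hB[256];              // two 16x16 input tiles, all ones
    for (int i = 0; i < 256; ++i) { hA[i] = __float2half(1.0f); hB[i] = __float2half(1.0f); }

    half *dA, *dB; float *dC;
    cudaMalloc(&dA, sizeof(hA)); cudaMalloc(&dB, sizeof(hB));
    cudaMalloc(&dC, 256 * sizeof(float));
    cudaMemcpy(dA, hA, sizeof(hA), cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, sizeof(hB), cudaMemcpyHostToDevice);

    wmmaTile<<<1, 32>>>(dA, dB, dC);    // exactly one warp of 32 threads

    float hC[256];
    cudaMemcpy(hC, dC, sizeof(hC), cudaMemcpyDeviceToHost);
    printf("C[0] = %f\n", hC[0]);       // expect 16.0 (dot of 16 ones)
    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    return 0;
}
```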
Huang stressed that Tensor Cores have many uses. The first is generating graphics effects that used to be hard to achieve, such as reflections and shadows, and here DLSS (deep learning super sampling) plays an indispensable role. DLSS is a neural network model that runs on the Tensor Cores; it is trained to enhance images, and after repeated training it learns how to take a rendered frame and produce a better-looking one. "Using this technology, we can render a smaller image, save computing power, make full use of the Tensor Cores' 114 trillion operations per second, and still achieve high image quality at a high frame rate," Huang added. The savings are substantial: reconstructing a 3840×2160 frame from a 2560×1440 render, for example, means shading fewer than half of the final pixels.
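DLSS itself is proprietary, so its network cannot be shown here, but the "render small, reconstruct large" idea can be. The kernel below is a deliberately naive stand-in, a plain bilinear 2x upscale in CUDA; DLSS replaces this hand-written filter with a trained neural network running on Tensor Cores, which is how it recovers detail a simple filter cannot.

```cpp
// A naive bilinear 2x upscale kernel: a crude stand-in for the idea
// behind DLSS. This is NOT NVIDIA's algorithm; DLSS swaps this filter
// for a trained neural network executing on Tensor Cores.
__global__ void upscale2x(const float* src, float* dst, int srcW, int srcH) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;   // destination pixel
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int dstW = srcW * 2, dstH = srcH * 2;
    if (x >= dstW || y >= dstH) return;

    // Map the destination pixel back onto the smaller source grid.
    float sx = x * 0.5f, sy = y * 0.5f;
    int x0 = (int)sx, y0 = (int)sy;
    int x1 = min(x0 + 1, srcW - 1), y1 = min(y0 + 1, srcH - 1);
    float fx = sx - x0, fy = sy - y0;

    // Blend the four nearest source pixels.
    float top = src[y0 * srcW + x0] * (1.0f - fx) + src[y0 * srcW + x1] * fx;
    float bot = src[y1 * srcW + x0] * (1.0f - fx) + src[y1 * srcW + x1] * fx;
    dst[y * dstW + x] = top * (1.0f - fy) + bot * fy;
}
// Launch example: dim3 block(16, 16), grid((dstW + 15) / 16, (dstH + 15) / 16);
// upscale2x<<<grid, block>>>(d_src, d_dst, srcW, srcH);
```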
Performance comparison between Turing and Pascal
Beyond the features above, the Turing architecture also introduces innovations such as Mesh Shading, Variable Rate Shading and Texture Space Shading. Taken together, NVIDIA says, these technologies let the new architecture deliver more than ten times Pascal's performance on some workloads, and with it a better graphics-computing experience.
Hard Power 2: Reshaping the Future of Computing
At the conference, Huang repeatedly stressed that Moore's Law is running out of steam. In the first few decades after Gordon Moore formulated it, processor performance tracked the law, growing roughly 100-fold every ten years; over the past decade, however, performance has grown far less than tenfold, only two to three times, imposing huge cost increases on Internet companies that need ever more compute. Worse, within ten years the spread of artificial intelligence could leave the whole industry short of computing power. Nvidia foresaw this a decade ago and invested in accelerated computing to head off the coming compute "crisis."
NVIDIA Accelerated Computing
Accelerated computing, he said, is a full-stack problem. You cannot simply slide a GPU or an ASIC underneath the software and expect performance to improve; you need the expertise to redesign the stack from top to bottom, which means understanding the software, the applications and the algorithms, and then accelerating them from the ground up. That is exactly what NVIDIA is good at. The HGX-2 platform, the Turing-based Tesla T4 and the AGX family are the hardware NVIDIA has prepared for this market.
First, HGX-2. It uses NVIDIA's NVSwitch interconnect fabric to link 16 Tesla V100 Tensor Core GPUs into what behaves like one giant GPU, delivering 2 petaflops of AI performance in a single node; the arithmetic is simple, 16 GPUs at 125 Tensor teraflops each. The node also carries 0.5 TB of combined GPU memory with 16 TB/s of aggregate memory bandwidth.
NVIDIA HGX-2 Server
Billed as the world's most powerful multi-precision computing platform, HGX-2 can run scientific computing and simulation at FP32 and FP64 precision and AI training and inference at FP16 and INT8. Its peak of 2 petaflops, 2 quadrillion floating-point operations per second, would place it among the leading systems on the TOP500 list. Compared with CPU-only servers, NVIDIA says HGX-2 runs AI machine-learning workloads nearly 550 times faster, AI deep-learning workloads nearly 300 times faster, and high-performance-computing workloads nearly 160 times faster.
Having already signed up customers such as Foxconn, Inventec, Quanta Computer, Supermicro, Wistron and Wiwynn, NVIDIA announced yesterday that HGX-2 has won new partners in China: Baidu, Tencent, Inspur, Lenovo, Huawei and Sugon. "China's leading technology companies are rapidly adopting HGX-2, the most powerful cloud node in history," said Ian Buck, vice president and general manager of accelerated computing at NVIDIA. "With its unmatched computing power and universal design, companies in China and around the world can now build new scalable products and services to tackle enormous computing challenges and some of today's most pressing problems."
The Tesla T4 cloud GPU, built on Turing, is NVIDIA's answer to the particular needs of scale-out public and enterprise clouds: maximizing throughput, utilization and user concurrency so customers can cope efficiently with explosive growth in users and data. The compact 70-watt card is roughly the size of a chocolate bar and fits flexibly into standard servers or Open Compute Project hyperscale designs, which can range from a single T4 up to 20 GPUs in one node.
Huang said that Internet companies such as search engines, social media and online shopping sites are the earliest adopters of T4 and its largest end-customer group. The first Chinese companies using T4 to scale out their workloads include Baidu, Tencent, JD.com and iFlytek, and China's leading computer makers, including Inspur, Lenovo, Huawei, Sugon, Inspur Power Commercial Systems and H3C, will launch a range of T4-based servers.
Hard Power 3: Using AI to Drive Automation
The rise of AI is helping to solve problems in medicine and beyond, and it is also pushing automation forward. Machine vision and driverless cars are being built for a future automated society, and Xavier is Nvidia's contribution to these markets.
Nvidia unveiled the chip at CES in January this year. Developed specifically for the autonomous-driving market at a reported R&D cost of US$2 billion, the 350-square-millimeter die packs 9 billion transistors: a Volta Tensor Core GPU, an 8-core ARM64 CPU, two NVDLA deep-learning accelerators, plus dedicated image, vision and video processors. It performs 30 trillion operations per second while drawing only 30 watts, 15 times the energy efficiency of the previous-generation architecture, and is now in volume production; Nvidia claims it is two years ahead of its competitors. For different automation applications, Nvidia has built two platforms around it: NVIDIA DRIVE AGX for autonomous driving and Jetson AGX Xavier, aimed at unmanned delivery vehicles.
NVIDIA DRIVE Xavier
First, NVIDIA DRIVE AGX, the new name for the Xavier-based product line, which includes DRIVE Xavier and the newly launched DRIVE Pegasus. Huang said that new carmakers including Xiaopeng Motors, Singulato Motors and SF Motors, along with full-stack autonomous-driving companies such as Zhijia Technology (PlusAI), TuSimple and AutoX, have adopted the NVIDIA DRIVE AGX solution. FAW Jiefang, Zhijia Technology and Manbang Group also announced a partnership with NVIDIA to develop and deploy autonomous-driving technology for unmanned heavy trucks in China. The four parties will play to their respective strengths, share information and resources, drive innovation across the unmanned heavy-truck industry chain, and harness the performance of NVIDIA DRIVE AGX Pegasus to develop multi-sensor-fusion and AI solutions that advance smart-truck technology.
Products under the NVIDIA DRIVE AGX platform; "PX" has been renamed "AGX"
As for NVIDIA Jetson Xavier, it was launched at Computex in Taipei in June this year. The module integrates six high-performance processors: a Volta Tensor Core GPU, an 8-core ARM64 CPU, two NVDLA deep-learning accelerators, an image processor, a vision processor and a video processor, which makes it a natural choice for the next generation of delivery robots.
NVIDIA Jetson Xavier SoC
With its combination of high performance and high energy efficiency, the module can handle all of a delivery robot's computing tasks in real time, letting the robot operate safely and autonomously. It delivers workstation-class processing power, up to 32 trillion operations per second, at 10 times the energy efficiency of its predecessor, in a package no bigger than a palm. Huang also announced that NVIDIA will work closely with Meituan and JD.com in this area.
Software is NVIDIA's core competitive strength
In interviews yesterday with Semiconductor Industry Observer and other media, Nvidia executives repeatedly stressed the company's strength in software: whichever competitors try to take NVIDIA on, they said, software usually turns out to be those rivals' weak link.
Taking Google's TPU as an example, Jay Puri said the search giant's chip can only handle certain AI models; since AI is still at a very early stage of development and new frameworks and models keep emerging, that constrains the TPU. NVIDIA, by contrast, offers a general accelerated-computing platform able to support whatever new AI frameworks or models appear, and its high programming flexibility is likewise an advantage.
Any discussion of NVIDIA's software has to start with CUDA, which the company launched in 2006.
According to Wikipedia, CUDA is NVIDIA's official name for its GPGPU technology: an integrated platform that lets users run general-purpose computation on GeForce 8 and later GPUs as well as newer Quadro GPUs, and the first environment that allowed GPUs to be programmed in C. In a sense, NVIDIA's dominance today is built on CUDA's maturity. Huang said cumulative downloads of the CUDA SDK are approaching 14 million, 6 million of them in the past year alone, and that NVIDIA keeps expanding and enriching its capabilities with each generation of products, giving developers ever more power.
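What "programming the GPU in C" means in practice is easy to show. Below is a minimal sketch of the canonical first CUDA program, a SAXPY kernel: ordinary C code plus a kernel function and a launch. It is illustrative only, but every call used here is standard CUDA runtime API.

```cpp
// Minimal CUDA C example: y = a*x + y computed on the GPU.
// Compile with: nvcc saxpy.cu -o saxpy
#include <cstdio>

__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) y[i] = a * x[i] + y[i];              // one element per thread
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMallocManaged(&x, n * sizeof(float));       // unified memory:
    cudaMallocManaged(&y, n * sizeof(float));       // visible to CPU and GPU
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y); // one thread per element
    cudaDeviceSynchronize();                        // wait for the GPU

    printf("y[0] = %f\n", y[0]);                    // expect 4.0
    cudaFree(x); cudaFree(y);
    return 0;
}
```

The same source style scales from this toy up to the HPC and deep-learning workloads discussed above, which is the portability argument NVIDIA makes for the platform.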
RAPIDS, the open-source GPU-acceleration platform for machine learning and big-data processing that NVIDIA launched this September, is another of the company's software "weapons." Built on popular open-source projects such as Apache Arrow, pandas and scikit-learn, RAPIDS brings GPU speed-ups to the most widely used Python data-science toolchain. NVIDIA's tests show it running up to 50 times faster than CPU-only systems, cutting data scientists' processing times from days to hours, or from hours to seconds. RAPIDS already provides a complete set of open-source libraries for GPU-accelerated analytics and machine learning, with data visualization next on the roadmap.
The NGC-Ready program announced last week lets customers deploy GPU-accelerated software at scale, with confidence, on validated NVIDIA GPU systems. At GTC China yesterday, NVIDIA announced additional NGC-Ready systems from leading Chinese computer makers. NVIDIA has plenty of other software products as well; there is no room to list them all here.
Looking at the industry today, the macro headwinds discussed above and new entrants such as Huawei in AI chips have all taken some toll on Nvidia. Whether it can come back from this slide on the strength of the products above will depend on how it plays these cards, and, of course, on how its competitors perform.
By Li Shoupeng, Semiconductor Industry Observer