The digital wave is sweeping every corner of the world, and behind all of these changes are tiny chips. From mobile phones and computers to airplanes, cars, and household appliances, from robotic arms and cutting machines in factories to ventilators and ultrasound machines in hospitals, from the bus card in your hand to the satellite in the sky, chips are everywhere.
The accelerated arrival of the AI era, especially the rise of large-scale models, has brought unprecedented demand for computing power, which is both a challenge and an opportunity for the chip industry.
In the past, the undisputed protagonist of the chip field was the CPU (central processing unit). However, because the GPU (graphics processing unit) is better suited to the high-performance computing needs of the AI era, its popularity has soared in recent years. Not only have the stock prices of bellwether companies kept climbing to new highs, but a large number of start-ups and investors are also looking for opportunities and chasing dreams here.
Why has the GPU staged a comeback? Is the chip world about to change protagonists?
The ferocious GPU and Nvidia’s trillion-dollar value
At the beginning of this year, Nvidia’s stock price was around $150. It soared as high as $480 in July and is currently (as of August 7) around $450, for a total market value of more than $1.1 trillion. This means that in the first seven months of this year, Nvidia’s market value grew by a staggering $760 billion or more, an increase of more than 210%. Nvidia is the first chip company with a market value above $1 trillion and the sixth largest company in the world by market value. Measured from its listing, its share price has risen more than 1,100-fold.
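The market-value figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the rounded figures quoted in this paragraph (about $1.1 trillion today, a gain of about $760 billion, both stated as "more than"), and the function name implied_start_and_gain is a hypothetical helper introduced here, not anything from Nvidia or a market data provider.

```python
def implied_start_and_gain(current_bn: float, increase_bn: float) -> tuple[float, float]:
    """Return (starting market value in $bn, percentage gain) implied by two figures."""
    start_bn = current_bn - increase_bn
    if start_bn <= 0:
        raise ValueError("increase must be smaller than the current value")
    gain_pct = increase_bn / start_bn * 100
    return start_bn, gain_pct


# Rounded figures from the article, in billions of US dollars.
start, gain = implied_start_and_gain(current_bn=1100, increase_bn=760)
print(f"implied starting value: about ${start:.0f} billion")
print(f"implied gain: about {gain:.0f}%")
```

With these rounded inputs the implied gain comes out a little above 220%, which is consistent with the "more than 210%" quoted above.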
On May 25 this year, Nvidia released a stunning first-quarter financial report, and its revenue guidance for the second quarter “brightened the eyes of Wall Street.” As a result, Nvidia’s market value soared by more than $200 billion in a single day. There are not many companies on the planet whose entire market capitalization exceeds $200 billion.
The diverging fortunes of the GPU and the CPU can be felt more intuitively by comparing the market values of Nvidia, “the chip king of the AI era”, and Intel, “the chip king of the PC era”. Intel’s current market cap is around $147 billion, and Nvidia’s is more than seven times that. AMD, the third of the “Big Three” chipmakers, has also enjoyed the dividends of the AI wave: its stock price has risen by more than 35% this year, and its current market value is around $188 billion, which still leaves it in second place.
Even more astonishing, Nvidia’s current price-to-earnings ratio has exceeded 200. Tesla, itself at the center of the storm, has a P/E of only a little over 60; Microsoft and Meta are below 40, Google is below 30, Alibaba is below 25, and Tencent is only about 15. Even TSMC, which has a “chokehold” on Nvidia (Nvidia’s chips are produced by TSMC), is only about 15.
The capital market is so excited because Nvidia has the two traits investors like most: a hot industry plus a dominant position. According to the global GPU market report recently released by Jon Peddie Research (JPR), Nvidia ranks first with a market share of 84%, followed by AMD with 12% and Intel with 4%.
Isn’t a GPU just a graphics card? In the Intel era, the GPU didn’t even have a “separate name” and was packaged in the CPU. In a sense, it can be said that Nvidia invented the GPU and made it stand alone. So why do the GPU and Nvidia seem to be riding a rocket? The answer actually lies with OpenAI and ChatGPT.
As large models such as ChatGPT set off an AI frenzy, the whole world is excited about this “rare technological revolution in human history”. But why is a GPU company the biggest winner?
This is because Nvidia is the “arms dealer” behind large AI models, and a Bank of America report even called it “the shovel king of the AI gold rush.” Whether it is a “battle of a hundred models” or a “dance of a thousand models”, all of them have to run on Nvidia’s GPUs.
An Nvidia representative told China Economic Weekly that as early as 2016, Nvidia delivered the world’s first DGX-1 supercomputer to OpenAI. At the end of 2022, OpenAI’s ChatGPT gained 100 million users in just two months. Its popularity proves that the “iPhone moment of AI” brought about by generative AI and accelerated computing has arrived.
According to Nvidia, the company invented the GPU as a parallel processor to make video games and movies as realistic as the real world. While GPUs were originally designed to process pixels for 3D graphics, they are also very good at processing data, which makes them ideal for deep learning tasks.
“Artificial intelligence researchers began using GPUs for deep learning as early as 10 years ago. In 2011, researchers found that 12 NVIDIA GPUs could deliver the deep learning performance of 2,000 CPUs. In addition, NVIDIA has improved GPU design, system architecture, and software, and accelerated training, so that GPU performance more than doubles every year, faster than Moore’s Law,” the Nvidia representative said.
In addition, GPUs can simulate human intelligence, run deep learning algorithms, and act as the brains of computers, robots, and self-driving cars that perceive and understand the world, Nvidia said. Going forward, Nvidia will also work to help customers use accelerated computing to achieve breakthroughs in generative AI and large language models.
“The CPU is a general-purpose processing unit. You can think of it as a ‘head housekeeper’ who has to manage everything. Typically, 25% of a CPU’s modules serve as arithmetic units (ALU), 25% as control units (Control), and 50% as cache (Cache). The GPU is a dedicated graphics processing unit: 90% of its modules serve as computing units, and only 10% are used for control and cache. Furthermore, the GPU computes in parallel, meaning it can do many things at the same time, while the CPU processes serially, finishing one task before starting the next. Therefore, in the face of AI’s enormous computing power requirements, GPUs, with their stronger computing power and higher computing efficiency, have become the mainstream choice,” Wang Bo, a senior chip researcher and author of “A Brief History of Chips” with more than 20 years of research and teaching experience in the field, told China Economic Weekly.
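Wang Bo’s contrast between serial and parallel processing can be made concrete with a toy model. The sketch below is a deliberate simplification and not a description of any real chip: it assumes the tasks are independent and equally sized, the lane count of 4,096 is a hypothetical number chosen purely for illustration, and the module shares are the rough figures quoted above.

```python
import math

# Rough module shares quoted by Wang Bo (illustrative, not measured values).
CPU_COMPUTE_SHARE = 0.25  # share of CPU modules used as arithmetic units
GPU_COMPUTE_SHARE = 0.90  # share of GPU modules used as computing units


def steps_needed(num_tasks: int, lanes: int) -> int:
    """Time steps to finish independent, equal-sized tasks on `lanes` parallel units."""
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    return math.ceil(num_tasks / lanes)


def compute_share_ratio() -> float:
    """How many times more of the chip a GPU devotes to computation than a CPU."""
    return GPU_COMPUTE_SHARE / CPU_COMPUTE_SHARE


tasks = 1_000_000
print("serial (1 lane):      ", steps_needed(tasks, 1), "steps")
print("parallel (4096 lanes):", steps_needed(tasks, 4096), "steps")  # hypothetical lane count
print(f"compute share ratio:   {compute_share_ratio():.1f}x")
```

In this model one lane needs a million steps while 4,096 lanes need 245, which captures the idea of doing "many things at the same time". Real workloads are rarely this evenly divisible, so the figures are only a rough upper bound on the benefit.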
Peng Hu, chief analyst for the technology hardware industry at CICC’s research department, told China Economic Weekly that, technically, the CPU adopts the von Neumann architecture, and its efficiency is limited when processing large amounts of data in parallel. Compared with traditional CPUs, GPUs offer multi-threading, high core counts, higher memory access bandwidth and speed, and strong floating-point performance, and they have gradually become the mainstream form of AI computing power chip.
“Of course, general-purpose GPUs increasingly face a tension between high efficiency and high power consumption. Specialized AI chips (such as TPUs, NPUs, and IPUs), with their combination of high computing power and low power consumption, are gradually becoming one of the chip solutions that some Internet cloud service providers use to supply AI computing power,” Peng Hu said.
If we take the Turing test as the starting point, artificial intelligence has been developing for more than 70 years, and its several ups and downs were mostly constrained by computing power: either there was not enough of it, or it cost too much. That is why many people joke that ChatGPT is a case of “brute force working miracles”: first, Nvidia built GPUs with powerful computing capability, and second, tens of thousands of these expensive chips could be piled together. With powerful computing power in place, a new wave of AI followed.
The AI story of GPU and Nvidia is far from reaching its climax?
Although the surging AI wave has brought Nvidia amazing growth, Nvidia had in fact already enjoyed an epic wave of cryptocurrency dividends before that, because GPUs, with their stronger computing power, were also the first choice for “mining”. And the “AI story” of the GPU and Nvidia is far from reaching its climax.
According to TrendForce, global AI chip shipments will increase by 46% in 2023. Nvidia GPUs are the mainstream choice in the AI server market, with a market share of about 60% to 70%.
According to a research report released by Mizuho Securities, Nvidia’s revenue this year may reach 25 billion to 30 billion US dollars. By 2027, its AI-related revenue will reach 300 billion U.S. dollars, and Nvidia’s market share in the global AI server chip market will be around 75%.
Such an attractive, enormous market will naturally see increasingly fierce competition. The “Big Three” chipmakers have been wrestling for decades, and the battle among Intel (founded in 1968), AMD (founded in 1969), and Nvidia (founded in 1993) is destined to continue into the AI era.
In June of this year, AMD launched the Instinct MI300, a data center APU (accelerated processing unit), making an aggressive push into the AI market. On AMD’s just-concluded second-quarter earnings call, CEO Lisa Su revealed that the number of customers for AMD’s AI data center chips “increased by more than 7 times” this quarter, and that the business is expected to grow by 50% in the second half of the year.
Intel spent $2 billion to acquire the Israeli AI chip company Habana in 2019 and has continued to fill out its AI business. The second-generation deep learning chip Habana Gaudi2, launched by Intel this year, is positioned against Nvidia’s 100 series and is purpose-built for training large language models. Intel also expects to merge the Gaudi AI chip and GPU product lines by 2025 and launch a more complete and competitive next-generation GPU product.
On May 29, 2023, NVIDIA founder and CEO Jensen Huang announced a batch of new products and services related to artificial intelligence at the COMPUTEX conference.
Developing AI chips has also become a must for major technology companies around the world. After all, no one wants their computing power to be in Nvidia’s hands. Microsoft bought tens of thousands of Nvidia chips just for ChatGPT, and Musk, who had called for a pause in AI research and development, quietly stockpiled 10,000 Nvidia A100s; even so, Microsoft, Google, Meta, and Tesla have all entered the field with self-developed AI chips.
Wang Bo believes that in the global GPU market, Nvidia currently stands alone. Moreover, Nvidia’s moat is not only the chip itself but also its own development system, namely the CUDA computing platform and the surrounding software and hardware ecosystem. “It’s a bit like Apple’s advantage in hardware such as the iPhone combined with its strong iOS software ecosystem,” he said.
Nvidia founder and CEO Jensen Huang revealed in May this year that CUDA has more than 4 million developers and more than 3,000 applications worldwide. CUDA has been downloaded 40 million times in total, including 25 million times last year alone. Some 40,000 large enterprises around the world use NVIDIA products for accelerated computing, and 15,000 startups have built on NVIDIA platforms.
When a company you think of as a hardware company says it is a software company, you have to stop and think. When the first-generation iPhone was released in 2007, someone asked Jobs how Apple would prevent the iPhone from being imitated and dragged into a price war. Jobs’ answer was: “We are a software company.”
“If other companies launch new GPU chips, developers will need to learn a new development language, which is very painful, like changing the language a person speaks and even the way they think. The software ecosystem that goes with the hardware is therefore a very important competitive barrier for leading chip companies,” Wang Bo said.
Jensen Huang has put forward the famous “Huang’s Law”: the performance of GPU chips doubles every 6 months, three times the pace of Moore’s Law. This means the GPU business is a track that has to be run at full sprint. It is a race of life and death in which the winner takes all, a game destined to belong to big players who dare to take risks, and one in which a company can just as easily turn to dust as achieve greatness.
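How much difference does the doubling period make? The sketch below compounds the two rates side by side. The 6-month period is the figure stated above; the 18-month period for Moore’s Law is an assumption here, inferred from the claim that Huang’s Law is three times as fast, and growth_factor is a hypothetical helper name.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Performance multiple after `months` if performance doubles every period."""
    if doubling_period_months <= 0:
        raise ValueError("doubling period must be positive")
    return 2 ** (months / doubling_period_months)


HUANG_DOUBLING_MONTHS = 6   # as stated in the article
MOORE_DOUBLING_MONTHS = 18  # assumption: implied by "three times" the pace

for years in (1, 3, 5):
    months = years * 12
    huang = growth_factor(months, HUANG_DOUBLING_MONTHS)
    moore = growth_factor(months, MOORE_DOUBLING_MONTHS)
    print(f"after {years} year(s): 6-month doubling x{huang:.0f}, 18-month doubling x{moore:.1f}")
```

Over three years the 6-month doubling compounds to 64 times while the 18-month doubling compounds to 4 times, which shows why a pace that is only "three times" faster opens such a wide gap so quickly.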
Of course, Nvidia cannot simply lie back and win. Throughout the history of chip development, little-known small companies have repeatedly risen rapidly on the strength of an ingenious chip design. That is, in fact, Nvidia’s own “script”.
AI made the GPU, but the GPU is not the perfect answer to AI?
In fact, the GPU was not born for AI; it is just one solution to AI’s demand for computing power. Is there a better one? The answer is yes, but it is not yet known which chip will be the next protagonist to overturn the GPU.
The head of GPU product design at a domestic GPU manufacturer told China Economic Weekly that the chip is the basic building block of computing power, and all the core calculations run on it. The hardware is layered: chips are deployed in servers as boards or other accelerator cards, servers are placed in cabinets, and a large number of cabinets form a data center. The rise of large models has undoubtedly opened a huge potential market for the chip industry, and it has also raised many technical requirements, especially for key performance indicators such as single-card computing performance and interconnect capability. The market therefore needs stronger chip products.
The person in charge believes that the GPU became mainstream because, in the early stage of AI development, it was the most suitable chip architecture available, which gave it a first-mover advantage. But the GPU’s core pain point is that chip manufacturing technology cannot keep up with the growth in demand for computing power; in other words, Moore’s Law, as is often said, is reaching its end. The most advanced process currently used for GPU chips is 4nm to 5nm, which is already very close to the physical limit of Moore’s Law. In the future, it will be almost impossible to improve chip performance through process upgrades.
In addition, the person in charge said that traditional GPUs still retain many graphics functions that AI computing does not need, so the computing efficiency of the chip as a whole is not optimal, which is another disadvantage of the GPU. Other mainstream AI chip solutions are not perfect either. For example, application-specific integrated circuits (ASICs) developed for AI computing offer higher computing efficiency but poor versatility.
“The most promising breakthroughs in the future lie in newer packaging technologies (such as 3D packaging), newer materials, and so on, in an attempt to break through Moore’s Law,” the person in charge said.
Wang Bo explained further from the perspective of chip architecture. He said that because of its architecture, the GPU is not a perfect way to supply AI computing power; after all, the GPU was not originally created for AI. In addition, computing and storage are separate in a GPU, so data has to be shuttled back and forth between them, and this data movement consumes 10 times the energy of the computation itself. Moreover, the GPU often has to wait for data to arrive before it can compute. As a result, the GPU’s computing efficiency is not high and its power consumption is very large. The GPU’s powerful computing power comes at a huge cost.
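The weight of that "10 times" figure is easier to see with a small energy budget. The sketch below is a rough model under stated assumptions: it treats the 10x ratio as the energy of one data transfer relative to one arithmetic operation, normalizes compute energy per operation to 1, and introduces moves_per_op as a hypothetical knob for how often an operation has to fetch data from separate storage.

```python
MOVE_TO_COMPUTE_ENERGY_RATIO = 10.0  # from the article: moving data costs 10x computing


def movement_energy_share(moves_per_op: float,
                          ratio: float = MOVE_TO_COMPUTE_ENERGY_RATIO) -> float:
    """Fraction of total energy spent moving data (compute energy per op = 1)."""
    if moves_per_op < 0:
        raise ValueError("moves_per_op must be non-negative")
    move_energy = moves_per_op * ratio
    return move_energy / (1.0 + move_energy)


# Hypothetical scenarios: from one transfer per operation down to no transfers at all.
for moves in (1.0, 0.5, 0.1, 0.0):
    share = movement_energy_share(moves)
    print(f"{moves:.1f} transfers per operation -> {share:.0%} of energy spent on data movement")
```

Under these assumptions, an operation that needs one transfer spends about 91% of its energy on moving data, and even one transfer per ten operations still costs half the energy budget.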
“In chip design, we are always looking for a PPA compromise, that is, a balance among performance (Performance), power consumption (Power), and area (Area), because the three cannot all be optimal at the same time. Performance and power consumption have always been at odds, and the larger the area, the higher the cost of the chip,” Wang Bo said.
Wang Bo also said that researchers in academia and at technology companies are in fact working on chips better suited to artificial intelligence, using new principles and new materials. For example, a compute-in-memory chip performs calculations inside the memory without moving data, so it can deliver greater computing power at lower power consumption. “Although this is still at the exploratory stage, the good news is that in this field China is keeping pace with the rest of the world,” he said.
Another idea is to move away from the von Neumann architecture and build neuromorphic chips that imitate the way the human brain processes data. “Neuromorphic chips have been in development for decades. Although they cannot yet compete with GPUs in computing power, if their computing power can reach half that of GPUs, they may emerge suddenly by virtue of their advantages in energy consumption and cost,” Wang Bo said.
Peng Hu also noted that the GPU has powerful parallel computing and efficient floating-point capabilities and, as a general-purpose chip, can meet the requirements of a wide range of AI algorithms, but it also has shortcomings: high power consumption and low utilization of its computing power. Besides GPUs, AI chips also include FPGAs and various ASIC solutions. An FPGA is an integrated circuit with a programmable hardware structure; its programmability and flexibility let it adapt quickly to the requirements of different AI algorithms, but it too suffers from high power consumption. An ASIC is a special-purpose chip that achieves higher utilization and better energy efficiency by hardwiring the algorithm, but it has a long development cycle and limited flexibility.
“We believe the GPU is currently still the mature, one-stop solution for large AI models and multi-modal support. ASICs will take a place in the future AI market thanks to their high cost performance and high energy efficiency,” Peng Hu said.
In fact, energy consumption has become an important bottleneck for the development of computing power and even of AI itself. Lin Yonghua, deputy director and chief engineer of Beijing Zhiyuan Artificial Intelligence Research Institute, told China Economic Weekly that the electricity bill for training a large model with tens of billions of parameters exceeds 100,000 yuan per day.
“Training a large model at the 100-billion-parameter level or above requires 1,000 to 2,000 A100 cards, with a hardware cost of about 50 million U.S. dollars. On top of that come manpower, electricity, and network expenses. The annual cost is at least 50 million to 100 million U.S. dollars,” Fang Han, CEO of Kunlun Wanwei, once said.
The head of a leading domestic AI computing power supplier told China Economic Weekly that for a traditional data center, electricity accounts for 60% to 70% of total operation and maintenance costs. Of every kilowatt-hour consumed, only half goes to the actual workload (computation), and the rest is spent on heat dissipation. New data centers therefore generally use liquid cooling, which can cut electricity costs by more than 30% compared with air cooling.
The world’s largest Internet companies are trying every means to solve the heat dissipation problem. To reduce energy consumption, they have buried data centers in mountains (Tencent), submerged them in lakes (Alibaba), sunk them in the sea (Microsoft), and hauled them to the Arctic (Meta).
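These figures map onto power usage effectiveness (PUE), the standard data center metric defined as total facility energy divided by the energy that reaches the IT equipment. The sketch below is a back-of-the-envelope illustration, not the supplier’s own calculation: it assumes that the "more than 30%" saving applies to the total electricity bill, that the computing load stays the same, and that cost scales with energy.

```python
def pue(total_energy_kwh: float, it_energy_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy."""
    if it_energy_kwh <= 0 or total_energy_kwh < it_energy_kwh:
        raise ValueError("need 0 < IT energy <= total energy")
    return total_energy_kwh / it_energy_kwh


# Air cooling, per the article: of every 1 kWh consumed, half goes to computation.
air_total_kwh = 1.0
it_load_kwh = 0.5

# Assumption: liquid cooling cuts total electricity by 30% for the same computing load.
liquid_total_kwh = air_total_kwh * (1 - 0.30)

print(f"air-cooled PUE:    {pue(air_total_kwh, it_load_kwh):.2f}")
print(f"liquid-cooled PUE: {pue(liquid_total_kwh, it_load_kwh):.2f}")
```

Under these assumptions the traditional data center has a PUE of about 2.0 and the liquid-cooled one about 1.4, so the overhead per unit of computation falls from 100% to roughly 40%.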
As the chip blockade intensifies, how is China’s “AI core” doing?
After the ZTE incident in 2018 and the Huawei incident in 2019, “chip” not only became a national buzzword; a large amount of capital and many start-ups also poured into the chip field. At that time, AI chips were still heading in many directions: besides GPUs, there were also FPGAs (field-programmable gate arrays) and ASICs (application-specific integrated circuits). Under the demonstration effect of Nvidia’s market success, however, the second wave of chip start-ups in 2020 focused mainly on GPUs, especially GPGPUs (general-purpose graphics processors that can be programmed to perform different computing tasks).
In August 2022, the U.S. government banned Nvidia from exporting its most advanced high-performance GPUs, the A100 and H100, to China. Nvidia then launched the A800 and H800 for the Chinese market, chips that comply with U.S. export control requirements by limiting some of the performance of the A100 and H100.
According to sources cited in media reports, China’s major Internet companies have been frantically stockpiling Nvidia GPUs this year. ByteDance alone has ordered more than US$1 billion worth of GPUs from Nvidia, a figure close to the total value of commercial GPUs Nvidia sold in the Chinese market in 2022, and ByteDance has also bought up almost all of the A100 chips openly available on the market.
According to Nvidia’s official website, the A100 is priced at US$10,000 apiece (with some discount for bulk purchases), and the upgraded H100 at US$36,000 apiece. However, the reporter learned from distributors that although the official prices of the China-market A800 and H800 are slightly lower, short supply and other factors mean their actual prices are higher than those of the full-performance versions, and the premium on the A800 has reached more than 100,000 RMB.
Faced with huge market demand and highly uncertain foreign supply, Chinese AI companies naturally hope for a “new choice”, and many Chinese chip companies hope to become that “new choice”.
“Historically, a new chip company that wants to rise and challenge the established leaders has to start with low-end chips, gradually build up user habits and a developer base by occupying the low end, and then slowly move up to the high end. Developing high-end chips like Nvidia’s A100 right away requires a huge investment of manpower and money, and you also have to face the moats of software and ecosystem, which is very difficult,” Wang Bo said.
Wang Bo believes that Chinese companies can take a similar path: first, start at the low end and gradually move toward the high end; second, capture certain vertical industries first and establish a leading position in specific fields such as healthcare and transportation.
“In fact, we can see that some large Chinese technology companies are already adopting this strategy. Huawei, Alibaba, and Tencent, for example, have all chosen to concentrate their efforts on specific fields. Specialized GPU companies such as Biren, Moore Threads, and Cambricon are also doing well. Within 5 years there may be a breakthrough in the mid-range, but breaking into the high end will still take hard work, step by step,” he said.
But Wang Bo emphasized that the domestic GPU market is currently restricted only in high-end chips, not in mid- and low-end chips. “Many people think it is fine that low-end chips can still be used without restriction, but I don’t think this is good for the development of domestic chip companies. In the long run, it will increase the destructive power of the blockade,” he said.
Wang Bo believes that, on the one hand, this allows foreign companies to occupy the Chinese market on a large scale, bringing them huge commercial returns that let them keep funding R&D and producing more advanced products. On the other hand, domestic users and developers get used to foreign systems, which means that even if a domestic company develops a well-performing chip and software system, it will face the problem of getting users to switch.
“Both of these will raise the moat around foreign brands. By contrast, if comprehensive restrictions are imposed in the future, the resulting chip shortage will accelerate the growth of local chip companies and push the market to use more Chinese chips,” Wang Bo said.
According to Peng Hu, overseas GPU companies currently hold the major share of the global AI chip market, while domestic manufacturers are catching up quickly. On the demand side, China’s AI industry has relatively mature experience in deploying applications, which has driven the rapid rise of domestic AI chip design companies. On the supply side, whereas overseas vendors favor general-purpose GPUs, domestic AI chip design companies generally adopt ASIC solutions, which better match domestic AI market demand. Looking ahead, if China raises its technological level in advanced chip manufacturing and secures a certain amount of production capacity, domestic AI chips can be expected to gain more room for growth.
The “Chinese Solution” for AI Computing Power Breakthrough
Although the difficulty is considerable, AI chips and AI computing power are “the game of the future”, and China must have its own “Chinese solution”. Wang Bo said he remains very confident about the future. “Chip design itself relies mainly on good ideas. Judging from the history of chips, innovation often comes from a rebellious idea. We could not even design 3G chips in the past, but in 5G chips we have achieved global leadership,” he said.
In June of this year, the Chinese Academy of Sciences released the “Xiangshan” open-source high-performance RISC-V processor core and the “Aolai” RISC-V native operating system. Wang Bo believes this move is of great significance. “Foreign chips and software are good, but they are not open source, and their high price is the biggest shortcoming. So if we develop an open-source ecosystem, it may become the opening for an independent breakthrough,” he said.
Bao Yungang, deputy director of the Institute of Computing Technology, Chinese Academy of Sciences, also said: “In the past, China had two models for developing processor chips: the high-speed rail model and the Beidou model. The former means introducing, digesting, absorbing, and re-innovating within the existing ecosystem. The latter means building a technology system completely independently. With RISC-V, we can take a third path, the 5G model: domestic enterprises should accelerate their participation in setting open standards while independently developing a number of key core technologies, facing the international market and remaining compatible with the international ecosystem, so that we can seize the opportunities of the third wave of chips.”
Of course, the Chinese solution is already growing at an accelerating pace. Unable to use foreign technical frameworks, Huawei was one of the first domestic companies to set out on the road of in-house development.
“Half of China’s large models are currently supported by Huawei’s Ascend AI,” Hu Houkun, Huawei’s rotating chairman, revealed in July this year. Zhang Dixuan, president of Huawei’s Ascend computing business, also revealed that Ascend has so far certified more than 30 hardware partners and more than 1,200 software partners, and has jointly incubated more than 2,500 AI scenario solutions. In China, one out of every two AI companies chooses Ascend.
A Huawei representative told China Economic Weekly that Huawei predicts that by 2030 humanity will enter the YB (yottabyte, roughly one million billion gigabytes) data era, global general-purpose computing power will increase 10-fold, and AI computing power will increase 500-fold.
The person in charge emphasized that computing power depends not only on chips but also on innovation in system architecture and on collaborative innovation across hardware and basic software. Demand for computing power is now growing far faster than Moore’s Law, and the gains delivered by advances in chip process technology alone can no longer keep up. The architecture of computing systems must be innovated as well, including diversified computing power that moves from general-purpose computing to general-purpose plus heterogeneous computing, and collaborative innovation spanning hardware, basic software, and application enablement.
“In the intelligent era of the Internet of Everything, unstructured data accounts for a growing share. Processing and transmitting data such as text, pictures, voice, and video requires diverse computing to match. For example, the CPU is well suited to big data, Web, and similar scenarios, but graphics and image processing calls for the GPU, while the image recognition and intelligent search and recommendation found in daily life can be handled by an NPU built for AI computing (neural network processor / embedded neural network processor),” the person in charge said.
According to the person in charge, Huawei improves computing efficiency through architectural innovation. “For example, at the computing node level, Huawei has introduced a peer-to-peer architecture that breaks through the performance bottleneck of traditional CPU-centric heterogeneous computing and improves node performance by 30%. At the data center level, Huawei leverages its combined strengths in cloud, computing, storage, networking, and energy, in effect designing the AI data center as a single supercomputer so that the Ascend AI cluster delivers higher performance and greater reliability.”