The eighth Future Chip Forum was held in Beijing. The forum, organized by Tsinghua University’s Institute of Integrated Circuits and the Beijing Advanced Innovation Center for Integrated Circuits, provides a multidisciplinary exchange platform for universities, research institutes, and industry. Professor Wei Shaojun of Tsinghua University, a member of the International Eurasian Academy of Sciences, outlined several perspectives on future chips.
Characteristics required for ultra-high compute chip architectures
High-performance computing has entered the exascale era. Last year, the U.S. Department of Energy announced the world's first exascale supercomputer. Exascale computing, capable of performing 10^18 double-precision floating-point operations per second, is an important milestone. Following that announcement, U.S. advisory bodies set zettascale, roughly 1,000 times faster at 10^21 operations per second, as the next goal.
The industry continues to pursue higher computation speed even as total data volume keeps growing; estimates projected total data to reach 100 ZB by 2024. With that growth further accelerated by artificial intelligence, zettascale computing has become a challenge that must be addressed. Wei noted that relying on process-node scaling alone makes such a leap in performance nearly impossible. For example, an existing exascale machine built on a 6 nm process consumes about 21.1 MW and occupies 680 m^2. Extrapolating to a zettascale system on a 3 nm process, estimated power could reach 8,000 MW, or roughly 8 million kWh per hour and around RMB 4 million in electricity for a single hour of operation. The capital cost of reaching zettascale at such a process node could run to hundreds of billions of dollars.
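To make the arithmetic concrete, here is a minimal sketch of the power-to-cost conversion behind those figures. The electricity price of RMB 0.5/kWh is an assumption inferred from the stated numbers, not a value quoted in the talk.

```python
# Back-of-envelope check of the zettascale power and cost figures above.
power_mw = 8_000                 # estimated zettascale power at 3 nm, in MW
energy_kwh = power_mw * 1_000    # one hour of operation: MW x 1 h = MWh; x 1000 -> kWh
price_rmb_per_kwh = 0.5          # ASSUMED electricity price, inferred from the stated figures

cost_rmb = energy_kwh * price_rmb_per_kwh
print(f"Energy for one hour: {energy_kwh / 1e6:.0f} million kWh")  # ~8 million kWh
print(f"Cost for that hour: RMB {cost_rmb / 1e6:.0f} million")     # ~RMB 4 million
```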
Although process improvements reduce per-unit cost, overall system cost and footprint remain enormous, and the sheer physical scale introduces significant communication latency. Current compute chips devote a very small fraction of silicon to arithmetic resources, often less than 0.1%, and utilization of those resources is lower still, while data movement dominates energy: on a GPU it can account for more than 90% of consumption. Given these characteristics, continuing with current computing architectures and chip designs makes next-generation performance targets extremely difficult to reach. The three basic elements AI depends on are algorithms, data, and compute, with compute playing the driving role.
Present AI implementations are still far from ideal. Algorithms differ fundamentally from human recognition and must be adapted to diverse applications, and current implementations are computationally intensive. For instance, a comparatively simple model from 2014 required about 19.6 billion operations to process a single input and held 138 million parameters, placing great stress on both compute and storage density. Wei observed that the field has entered a "golden age" for computing-architecture innovation; continuing traditional incremental approaches will not suffice.
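A rough sketch of what such a model demands from hardware follows. The 19.6-billion-operation and 138-million-parameter figures are from the talk; treating the operations as per-inference, storing weights in FP32, and targeting 30 inferences per second are illustrative assumptions.

```python
# Rough resource estimate for the 2014-era model cited above.
ops_per_inference = 19.6e9      # operations to process one input (per the talk)
params = 138e6                  # parameter count (per the talk)
bytes_per_param = 4             # assumed FP32 weight storage
target_fps = 30                 # assumed real-time inference rate

weight_memory_mb = params * bytes_per_param / 2**20
sustained_gops = ops_per_inference * target_fps / 1e9

print(f"Weight storage: ~{weight_memory_mb:.0f} MB")                        # ~526 MB
print(f"Compute at {target_fps} inferences/s: ~{sustained_gops:.0f} GOPS")  # ~588 GOPS
```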
Future supercomputing support must be achievable within reasonable investment and power envelopes—Wei suggested targets such as investments below RMB 1 billion, power under 100 MW, and footprints of tens of thousands of square meters or less. Under these constraints, new demands arise for chips, hardware, and software.
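A back-of-envelope consequence of these envelopes, assuming the zettascale target of 10^21 operations per second from the discussion above:

```python
# What the proposed envelopes imply for energy efficiency, assuming the
# zettascale target of 1e21 operations per second (1,000x exascale).
target_ops_per_s = 1e21
power_budget_w = 100e6          # Wei's suggested 100 MW ceiling

required_ops_per_joule = target_ops_per_s / power_budget_w
print(f"Required efficiency: {required_ops_per_joule:.0e} ops/J")   # 1e13 ops/J

# For comparison, the 6 nm exascale machine cited earlier delivers about
# 1e18 ops/s at 21.1 MW, i.e. roughly 5e10 ops/J -- more than two orders
# of magnitude short of the target.
print(f"Cited exascale machine: {1e18 / 21.1e6:.1e} ops/J")
```

The gap of over 200x in operations per joule, not raw speed alone, is what makes architectural innovation rather than process scaling the binding constraint.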
What foundational devices do AI chips need?
Intelligence extends our cognition. Mechanization, electrification, and automation powered earlier industrial revolutions and extended human physical capabilities; the ongoing information revolution, driven by computing, networking, communications, optoelectronics, and integrated circuits, has amplified human sensing capabilities. Information technology, together with AI and novel materials engineering, will extend and amplify human cognitive ability further. Since the first electronic computer in 1946, there have been multiple waves of intelligent-systems development.
Early milestones include the so-called fifth-generation computing efforts of the 1980s and early 1990s, followed by the broad adoption of machine-learning algorithms for classification and recognition from around 2017 onward. Modern deep-learning systems such as DeepMind's game-playing programs perform some tasks far beyond human experts. This progress follows insights from neuroscience that inspired brain-like computing and deep neural networks; although training deep networks can be brute-force, it has produced strong practical results.
Mainstream AI chip architectures have evolved from early prototypes to more advanced generations, and deployment has shifted from cloud to edge. Wei noted that compute is a sufficient condition for AI development and is realized through chips; hence specialized AI chips emerged. A single application corresponds to an algorithm; N applications can require N different chips. To support multiple applications on one chip, reconfigurability and flexibility became important. Improving raw compute and generality has driven current AI advances.
Industry exploration now focuses on computer architecture and may require new technological approaches. Practical silicon-based devices should support massive input/output, basic weighted-sum and activation operations, and ideally integrate memory and compute with ultra-low latency, very low power, and low cost. Compatibility with standard CMOS manufacturing and potential for 3D integration are important considerations. Achieving such device innovations could open new directions for AI hardware.
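For reference, the elementary operation such a device must realize is just a weighted sum of many inputs followed by a nonlinear activation. A minimal NumPy sketch of that operation follows; it is purely illustrative, and the input width and ReLU choice are arbitrary assumptions, not a device model.

```python
import numpy as np

# The basic operation an AI-oriented device must support: a weighted sum
# of many inputs followed by a nonlinear activation. Illustrative only;
# the 1,024-wide input and the ReLU are arbitrary assumptions.
def neuron(x: np.ndarray, w: np.ndarray, b: float) -> float:
    weighted_sum = float(np.dot(w, x)) + b   # multiply-accumulate: the dominant workload
    return max(weighted_sum, 0.0)            # ReLU activation

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)   # many inputs per device -> the "massive I/O" requirement
w = rng.standard_normal(1024)
print(neuron(x, w, 0.1))
```

An in-memory-compute device would perform this multiply-accumulate where the weights are stored, removing the data movement that, as noted earlier, can dominate GPU energy consumption.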
Does large model proliferation decisively impact chip development?
Wei also considered whether large language and foundation models are indispensable for chip design or whether they might even have negative effects. In an informal experiment, he put the same queries to ChatGPT repeatedly and found the answers inconsistent between versions 3 and 4, with GPT-4 producing the more logically coherent narrative in that instance. However, ChatGPT's apparent novelty arises from large-scale data aggregation rather than intrinsic creativity: its generative capability largely reflects collected human knowledge, and logical consistency in dialogue and translation can still be imperfect.
Regarding chip design, many expect electronic design automation to benefit strongly from large models. Wei suggested two potential EDA-related uses: integrating large models with EDA tools, and combining large models with design services. While large models can assist, fully automating difficult design tasks remains challenging. Thus, large models will likely be helpful in chip design, but the extent of that help warrants careful examination.
Where is 3D integration technology headed?
Three-dimensional integration has gained renewed interest. Moore's law has increased integration density over successive process nodes; at 5 nm, roughly 110 million transistors can be integrated per mm^2. Device evolution has moved from 45 nm and 32 nm down to 5 nm and 3 nm, passing through high-k metal gates and FinFETs to today's gate-all-around (GAA) transistors. GAA may serve for only one or two generations, and scaling beyond that is uncertain; proposed alternatives such as 2D materials or molecular devices face feasibility and cost challenges.
Chiplets and 3D packaging represent broader integration strategies beyond monolithic single-chip scaling. Over the longer term, these approaches are viable and offer advantages such as lower cost and reduced development-to-market time by avoiding the need to use the most advanced process for all components. After 3 nm, vertical transistor placement and stacked architectures become feasible. 3D NAND already demonstrates high stacking using vertical structures.
If new fusion or integration approaches emerge that address computation and storage together, they could significantly advance 3D integrated circuits.

In summary, compute is ubiquitous. High-performance computing remains a strategic priority and a focus of international competition, yet current architectures and integrated-circuit technologies struggle to support zettascale performance; architectural innovation is urgently needed. AI development depends on progress in chip technology, and a breakthrough in foundational devices is necessary. The arrival of large models has broadened thinking about potential directions and raised expectations for assisting chip design, but their concrete impact remains to be determined. The essence of integrated circuits is integration, and it is time to explore practical paths and methods for 3D integration.