How Storage Accelerates AI with DeepSeek

Author: Adrian | October 23, 2025

Overview

From AI servers to AI PCs, the rapid adoption of DeepSeek has become a key topic. Whether it is DeepSeek Janus-Pro pushing multimodal capabilities to a new level, the mainstream DeepSeek-V3 variants, or on-device DeepSeek deployments, these models introduce new storage requirements. For example, the full undistilled DeepSeek R1 is a 671-billion-parameter mixture-of-experts model; the unquantized model file can be as large as 720 GB, while dynamically quantized versions range from roughly 150 GB to 400 GB.
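The file sizes above follow roughly from parameter count times bytes per parameter. A minimal sketch of that arithmetic, assuming size is simply parameters x bit width with no metadata overhead (the bit widths here are illustrative, not the exact formats DeepSeek ships):

```python
GB = 1000**3  # storage capacities are quoted in decimal gigabytes

def model_size_gb(params: float, bits_per_param: float) -> float:
    """Approximate on-disk model size in decimal GB."""
    return params * bits_per_param / 8 / GB

PARAMS = 671e9  # DeepSeek R1: 671 billion parameters

# 8 bits/param lands near the ~720 GB unquantized file (the gap is
# format overhead); ~2 bits/param lands in the 150-400 GB quantized range.
print(f"8-bit: ~{model_size_gb(PARAMS, 8):.0f} GB")  # ~671 GB
print(f"2-bit: ~{model_size_gb(PARAMS, 2):.0f} GB")  # ~168 GB
```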

DeepSeek's efficiency gains raise GPU utilization, letting system designers shift budget from raw compute toward parallel compute and storage subsystems. Conventional storage approaches struggle to meet both the capacity and the efficiency these workloads demand. In enterprise and data center SSDs, newer technologies such as QLC flash and CXL are therefore being adopted to lower cost and improve efficiency for AI workloads.

QLC and CXL: Foundations for AI Storage

DeepSeek reduces some compute costs while enabling greater multimodal and reasoning capabilities, which in turn encourages the use of larger datasets. Much cold data becomes warm data, demanding transfer rates beyond what HDDs deliver and driving annual data growth at exabyte scale. This pushes SSD requirements toward higher capacity and lower cost.

Quad-level cell (QLC) flash stores four bits per cell, increasing capacity per unit area. Kioxia's eighth-generation BiCS FLASH 2 Tb QLC achieves about 2.3 times the bit density of Kioxia's fifth-generation BiCS FLASH QLC products and improves write energy efficiency by about 70%. The new QLC architecture can stack 16 dies within a single package, enabling 4 TB of capacity per package in a compact footprint measuring 11.5 x 13.5 mm with a height of 1.5 mm.

As a result, future storage products based on eighth-generation BiCS FLASH QLC could enable enterprise and data center SSD capacities exceeding 120 TB.
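The package and drive capacities above follow directly from the die density and stack height. A quick sanity check of that arithmetic, assuming capacity scales linearly with die count and ignoring over-provisioning and controller overhead:

```python
TBIT_PER_DIE = 2       # eighth-generation BiCS FLASH QLC die: 2 Tb
DIES_PER_PACKAGE = 16  # 16-die stack in a single package

package_tbit = TBIT_PER_DIE * DIES_PER_PACKAGE  # 32 Tb per package
package_tb = package_tbit / 8                   # 4 TB per package

# A 120 TB-class SSD would need this many such packages
# (illustrative count only; real drives add spare capacity):
packages_for_120tb = 120 / package_tb

print(package_tb, packages_for_120tb)  # 4.0 30.0
```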

Intensive DeepSeek workloads also place heavy demands on memory. Storing hundreds of gigabytes of model state in DRAM is costly, so reducing overall system cost with BiCS FLASH-based approaches is under consideration. For example, storage-class memory (SCM) such as XL-FLASH, an SLC-based flash built on the BiCS FLASH structure and attached over CXL interconnects, aims to provide higher bit density and lower power than DRAM while offering much faster reads than conventional NAND flash. These approaches can improve utilization and energy efficiency.
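The idea behind such tiering is to keep hot model state in a small DRAM cache and fall back to a larger, slower SCM tier for the rest. A minimal sketch of that access pattern, with an LRU-managed DRAM tier over an SCM backing store; all names and sizes here are hypothetical and model the pattern only, not any real driver or CXL API:

```python
from collections import OrderedDict

class TieredStore:
    """Toy two-tier store: small fast DRAM cache over a large SCM tier."""

    def __init__(self, dram_slots: int):
        self.dram = OrderedDict()  # fast tier, LRU-ordered
        self.scm = {}              # capacity tier holds everything
        self.dram_slots = dram_slots
        self.dram_hits = 0
        self.scm_hits = 0

    def put(self, key, value):
        self.scm[key] = value  # writes land in the capacity tier

    def get(self, key):
        if key in self.dram:             # fast path: DRAM hit
            self.dram.move_to_end(key)
            self.dram_hits += 1
            return self.dram[key]
        value = self.scm[key]            # slow path: read from SCM
        self.scm_hits += 1
        self.dram[key] = value           # promote, evicting the LRU entry
        if len(self.dram) > self.dram_slots:
            self.dram.popitem(last=False)
        return value

store = TieredStore(dram_slots=2)
for k in "abc":
    store.put(k, k.upper())
store.get("a"); store.get("a"); store.get("b"); store.get("c")
print(store.dram_hits, store.scm_hits)  # 1 3
```

The second read of "a" is served from DRAM; everything else misses into the SCM tier, which is exactly the trade the text describes: most capacity sits in the cheaper tier while repeated accesses stay fast.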