Friday, June 14, 2024
The AI wave has transformed Computex, shifting the focus from traditional PCs to AI PCs and driving demand for memory and storage, particularly high-bandwidth memory (HBM). To seize this AI PC opportunity, Micron announced at Computex 2024 the availability of GDDR7 memory samples for next-generation GPUs, along with plans to secure about 25% of the global HBM market.
As AI extends from cloud computing to edge applications, processing data closer to its source to reduce latency and enable real-time decision-making, a "hybrid AI" model combining cloud, edge, and on-premises computing has emerged.
"Generative AI isn't just in the cloud; it's also transitioning to devices at the edge—things that we hold in our hands every day," said Praveen Vaidyanathan, Vice President and General Manager of the Compute Products Group at Micron. He refers to this as "hybrid AI," a term for the shift of AI inference from the data center to the edge. This shift demands memory with higher performance and bit density in traditional servers, PCs, and mobile devices, as well as in new AI PCs and AI servers.
Micron continues to develop a comprehensive memory and storage portfolio, including HBM, DDR5, and SSDs, to support applications ranging from the cloud to PCs, smartphones, portable devices, and automotive edge devices. In cloud data centers, formerly compute-bound workloads are increasingly memory-bound, making the bandwidth HBM offers crucial for AI servers and infrastructure. Demand for DDR5 and data center SSDs is also growing.
Following the launch of its 24GB 8-high stacked HBM3e (which attaches to GPUs), Micron began producing 128GB DDR5 RDIMMs (which attach to CPUs) in March, built on high-capacity 32Gb DRAM dies and optimized for the capacity, bandwidth, and power consumption that memory-intensive workloads require, addressing the growing demand for GPU and CPU integration in server infrastructure.
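The capacity math behind that module is worth a quick illustrative check. The sketch below assumes module capacity comes purely from die count, which simplifies real module design (rank organization and die packaging vary):

```python
# Illustrative capacity arithmetic for a high-capacity DDR5 RDIMM.
# Assumption: module capacity comes purely from the number of 32Gb
# DRAM dies; real modules also vary rank count and die packaging.

DIE_CAPACITY_GBIT = 32          # monolithic 32Gb DRAM die
MODULE_CAPACITY_GBYTE = 128     # target RDIMM capacity

die_capacity_gbyte = DIE_CAPACITY_GBIT / 8   # 32 Gbit = 4 GByte
dies_needed = MODULE_CAPACITY_GBYTE / die_capacity_gbyte

print(f"{dies_needed:.0f} x {DIE_CAPACITY_GBIT}Gb dies per "
      f"{MODULE_CAPACITY_GBYTE}GB module")
# -> 32 x 32Gb dies per 128GB module
```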
"AI PCs are expected to develop toward diverse vertical applications," Vaidyanathan stated, pointing out that the bottleneck for AI PCs lies in memory bandwidth. AI PCs require high-bandwidth, high-capacity, and energy-efficient memory to meet the demands of AI applications such as gaming and automotive, which call for fast access, intensive computation, and real-time processing and decision-making.
Vaidyanathan cited IDC data indicating that next-generation AI PCs need neural processing units (NPUs) with at least 45 TOPS of compute, 136GB/s of memory bandwidth, and the ability to deliver 30+ tokens per second to handle most everyday AI inference workloads.
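Those bandwidth and token-rate figures are directly linked: when a language model runs locally, generating each token requires streaming the model weights from memory, so the achievable token rate is roughly memory bandwidth divided by model footprint. The sketch below makes that estimate; the 7B-parameter, 4-bit-quantized model is an illustrative assumption, not a configuration Micron or IDC specified:

```python
# Rough, bandwidth-bound estimate of local LLM token rate.
# Assumptions (illustrative only): a 7B-parameter model quantized to
# 4 bits per weight, with token generation limited purely by how fast
# the weights stream from memory (ignores caches and compute limits).

PARAMS = 7e9                 # model parameters (assumed example model)
BITS_PER_WEIGHT = 4          # assumed 4-bit quantization
BANDWIDTH_GBPS = 136         # IDC's AI PC memory-bandwidth figure, GB/s

model_bytes = PARAMS * BITS_PER_WEIGHT / 8          # ~3.5e9 bytes
tokens_per_sec = BANDWIDTH_GBPS * 1e9 / model_bytes

print(f"Model footprint: {model_bytes / 1e9:.1f} GB")
print(f"Bandwidth-bound token rate: ~{tokens_per_sec:.0f} tokens/s")
# -> ~39 tokens/s, comfortably above the 30 tokens/s target
```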
Current generative AI tasks often rely on large language models (LLMs) with billions of parameters, while recent AI development on edge devices is focusing on small language models (SLMs). Prasad Alluri, Micron's Vice President and General Manager of the Client Storage Business Unit, noted that to rapidly load the various small, custom models users need on AI PCs and other edge devices, improve energy efficiency, and keep data secure, SSDs such as Micron's 3500 NVMe SSD play a crucial role in the user experience.
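A back-of-envelope load-time calculation shows why SSD throughput matters for swapping SLMs on demand. The model size and the roughly 7GB/s sequential-read figure below are illustrative assumptions in line with high-end PCIe Gen4 client SSDs, not quoted specifications:

```python
# Back-of-envelope model-load time from a PCIe Gen4 NVMe SSD.
# Assumptions (illustrative): a 2 GB quantized SLM loaded as one
# sequential read, and ~7 GB/s sustained read throughput, roughly
# the class of drive the Micron 3500 belongs to.

MODEL_GB = 2.0          # assumed on-disk size of a small language model
SSD_READ_GBPS = 7.0     # assumed sequential read throughput, GB/s

load_seconds = MODEL_GB / SSD_READ_GBPS
print(f"Cold-load time for a {MODEL_GB:.0f} GB model: ~{load_seconds:.2f} s")
# -> ~0.29 s, fast enough to swap per-task models interactively
```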
Developing HBM is highly complex. Micron plans to leverage its vertical integration across design, product engineering, advanced packaging, and manufacturing to bring HBM to market aggressively, targeting a 20-25% market share by 2025. The company is also working on next-generation HBM4 products.
At Computex 2024, Micron also introduced GDDR7 graphics memory, built on its 1β DRAM node with an innovative architecture that reaches speeds of up to 32Gb/s per pin in a power-optimized design. Compared with its predecessor, GDDR6, GDDR7 delivers 60% higher system bandwidth (up to 1.5TB/s) and improves energy efficiency by over 50%. GDDR7 samples are now available to customers, with mass production expected in the second half of this year for gaming graphics cards, laptops, and other high-performance computing applications.
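The 1.5TB/s figure follows from pin speed multiplied by bus width. The 384-bit bus in the sketch below is an assumed high-end GPU configuration, typical of gaming flagships but not one named in Micron's announcement:

```python
# How GDDR7 per-pin speed translates to GPU system bandwidth.
# Assumption (illustrative): a 384-bit memory bus; Micron's
# announcement does not tie the 1.5 TB/s figure to a specific
# bus width.

PIN_SPEED_GBPS = 32     # GDDR7 per-pin data rate, Gb/s
BUS_WIDTH_BITS = 384    # assumed GPU memory-bus width

system_bw_gbyte_s = PIN_SPEED_GBPS * BUS_WIDTH_BITS / 8
print(f"System bandwidth: {system_bw_gbyte_s:.0f} GB/s "
      f"(~{system_bw_gbyte_s / 1000:.1f} TB/s)")
# -> 1536 GB/s, i.e. ~1.5 TB/s
```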
By: DocMemory Copyright © 2023 CST, Inc. All Rights Reserved