Tuesday, August 20, 2024
Micron Technology’s latest memory offering is a collaboration with Intel to address memory-intensive data center applications like AI and high-performance computing (HPC).
In a recent joint briefing with Intel, Micron presented its multiplexed rank dual inline memory modules (MRDIMMs), which the company said it is now sampling. Praveen Vaidyanathan, VP and GM of Micron’s Compute Products Group, said this first generation of MRDIMMs will enable customers to run increasingly demanding workloads. The modules support capacities from 32 GB to 256 GB in standard and tall form factors (TFF), which are suitable for high-performance 1U and 2U servers, and are compatible with Intel Xeon 6 processors.
For applications requiring more than 128 GB of memory per DIMM slot, Vaidyanathan said Micron MRDIMMs outperform current TSV RDIMMs. The TFF modules also feature an improved thermal design that reduces DRAM temperatures by up to 20 degrees Celsius at the same power and airflow, enabling more efficient cooling in the data center.
Micron’s collaboration with Intel involved a great deal of design, development, testing and validation, Vaidyanathan added, and the development of the MRDIMM was driven by customer needs as they embrace Intel’s Xeon 6 processor.
Bandwidth per core has been decreasing
Over the last decade, compute performance gains have come largely from rapid growth in core count, which increases total system bandwidth, but the challenge has been maintaining bandwidth per core, let alone increasing it, Vaidyanathan said. The MRDIMM is Micron’s effort to slow the decline in bandwidth per core, he added.
Compared with RDIMMs, MRDIMMs increase effective memory bandwidth by as much as 39% and reduce latency by as much as 40%, while offering 15% better bus efficiency. “If you want fast memory attached to a CPU, MRDIMM is going to be your choice,” Vaidyanathan said.
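The bandwidth-per-core trend Vaidyanathan describes comes down to simple arithmetic: total DRAM bandwidth divided across a growing core count. A minimal sketch, assuming a 12-channel platform with an 8-byte data bus per channel and comparing a 6400 MT/s RDIMM against an 8800 MT/s first-generation MRDIMM speed grade (these speed figures are illustrative assumptions, not official Micron specifications):

```python
# Illustrative sketch: per-core bandwidth erodes as core counts grow,
# and a faster DIMM interface partially offsets the decline.
# Channel count, bus width and MT/s figures are assumptions for illustration.

def per_core_bandwidth_gbps(mt_per_s, channels, bus_bytes, cores):
    """Peak DRAM bandwidth divided evenly across cores, in GB/s per core."""
    total_gbps = mt_per_s * bus_bytes * channels / 1e3  # MT/s * bytes = MB/s -> GB/s
    return total_gbps / cores

for cores in (32, 64, 128):
    rdimm = per_core_bandwidth_gbps(6400, channels=12, bus_bytes=8, cores=cores)
    mrdimm = per_core_bandwidth_gbps(8800, channels=12, bus_bytes=8, cores=cores)
    print(f"{cores:3d} cores: RDIMM {rdimm:5.1f} GB/s/core, "
          f"MRDIMM {mrdimm:5.1f} GB/s/core (+{mrdimm / rdimm - 1:.0%})")
```

Under these assumptions, doubling the core count halves per-core bandwidth regardless of DIMM type; the faster interface simply shifts the curve up, which is the effect the MRDIMM targets.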
Vaidyanathan said latency and responsiveness are critical to today’s data center applications, and MRDIMM technology is a more efficient way of scaling up capacity and bandwidth. He said HPC is particularly bandwidth hungry, and while training is a big portion of what is driving growth for generative AI today, inference is going to be scaling much faster. “A lot of the subsystems are going to be very much focused on inference.”
Vaidyanathan said increased demand for capacity and bandwidth makes power and thermal design even more critical, given that data centers are increasingly conscious of power consumption and sustainability.
Matt Langman, VP and GM responsible for Xeon 6 at Intel, said the company sees an opportunity to address not only performance needs through improved capacity and bandwidth, but also total cost of ownership (TCO) and sustainability, so power and efficiency are just as critical as enabling workloads.
And while generative AI is a key use case for the data center, Langman said, non-genAI use cases are still the largest part and growing. Regardless of application, Langman said MRDIMMs can help maximize performance and TCO for cloud deployments, including virtualized, multi-tenant environments. He said responsiveness has become a key TCO metric.
Ease of deployment is also critical, Langman added, noting that new technologies often bring with them additional work, including validation if they are to successfully scale.
MRDIMMs stick to standards
In the briefing, Micron and Intel confirmed that their MRDIMM product aligns with industry standards: it implements DDR5 physical and electrical standards that scale both bandwidth and capacity per core to future-proof compute systems. “You want it to be easy for customers, so it uses the very same physical and electrical interface as DDR5. It really doesn’t look any different from the system or a user perspective,” Langman said.
The Intel/Micron launch came out ahead of the JEDEC Solid State Technology Association revealing key details about its upcoming standard for DDR5 MRDIMMs, which it said would enable applications to exceed DDR5 RDIMM data rates. Other planned features in the JEDEC MRDIMM standard include platform compatibility with RDIMMs for flexible end-user bandwidth configuration, as well as use of standard DDR5 DIMM components—DRAM, DIMM form factor and pinout, serial presence detect (SPD), power management integrated circuits (PMICs) and temperature sensors (TS)—for ease of adoption.
The JEDEC MRDIMM standard will also feature efficient I/O scaling using RCD/DB logic process capability, use the existing LRDIMM ecosystem for design and test infrastructure and include support for multi-generational scaling to DDR5-EOL.
JEDEC said there are also plans to support the tall MRDIMM form factor to offer higher bandwidth and capacity without changes to the DRAM package. By going taller, it is possible to enable twice the number of DRAM single-die packages to be mounted on the DIMM without the need for 3DS packaging.
In addition to having the support of JEDEC, it is telling that this latest collaboration between Micron and Intel has the backing of major hyperscalers, Jim Handy, principal analyst with Objective Analysis, told EE Times in an interview. “The fact that Google and Azure are behind this and probably other companies too, indicates that there are customers sitting there waiting for it to happen.”
Customer backing is what ultimately will make the difference, Handy added, noting that AMD is also onboard—which, along with Intel, comprises the bulk of the server processor market.
Avoiding Optane’s fate
This is not the first time Intel and Micron have teamed up to launch a new memory device. But unlike 3D XPoint, which was unable to profitably build a new layer in the memory/storage hierarchy, Micron’s MRDIMM technology does have the advantage of addressing pressing problems in the data center.
The challenge faced by 3D XPoint technology, which was significantly marketed only by Intel as Optane SSDs and DIMMs, was that realizing its advantages required extensive changes, Handy said. “You needed to completely change the way that certain pieces of software worked.”
Although some major vendors like Oracle and SAP did build versions of their applications to leverage Optane, he said most vendors did not see value in making dramatic changes to their software.
With MRDIMMs, only the processor and memory vendors need to make changes, Handy said, and for the memory vendor first out of the gate with a new standard, it is an opportunity to sell a higher-margin product for a while.
The MRDIMM is similar to Intel’s collaboration with SK hynix on what they dubbed the MCR DIMM, which is an example of two suppliers trying to force a standard to happen, according to Handy. “You’ve got these two competing ways of doing almost the exact same thing, and one of them looks like it’s probably going to succeed better than the other one.”
By: DocMemory Copyright © 2023 CST, Inc. All Rights Reserved