Monday, June 24, 2024
An initial promise of the Compute Express Link (CXL) protocol was to put idled, orphaned memory to good use, but as the standard evolved to its third iteration, recent product offerings have been focused on memory expansion.
SMART Modular Technologies recently unveiled its new family of CXL-enabled add-in cards (AICs), which support industry-standard DDR5 DIMMs in 4-DIMM and 8-DIMM options. In a briefing with EE Times, Andy Mills, SMART Modular Technologies senior director of advanced product, said the AICs allow up to 4TB of memory to be added to servers in the data center. The company has spent the last year putting the products together with the aim of making them plug and play, he added.
SMART Modular’s 4-DIMM and 8-DIMM DDR5 AICs are CXL Type 3 devices in a PCIe Gen5 full-height, half-length (FHHL) form factor. The 4-DIMM card accommodates four DDR5 RDIMMs for a maximum of 2TB of memory capacity when using 512GB RDIMMs, while the 8-DIMM card accommodates eight DDR5 RDIMMs for a maximum of 4TB. The 4-DIMM AIC uses a single CXL controller implementing one x16 CXL port, while the 8-DIMM AIC uses two CXL controllers implementing two x8 ports; both configurations deliver a total bandwidth of 64GB/s.
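Those figures line up with the raw PCIe Gen5 link math. As a rough sanity check, the sketch below (a back-of-the-envelope calculation in Python, assuming Gen5's 32 GT/s per-lane signaling and 128b/130b line encoding, with protocol overhead ignored) reproduces the quoted capacity and bandwidth figures; sustained CXL throughput will land somewhat below these ceilings.

```python
# Back-of-the-envelope check on the AIC figures quoted above.
# Assumes PCIe Gen5 signaling (32 GT/s per lane) and 128b/130b line
# encoding; protocol overhead is ignored.

GT_PER_LANE = 32e9        # raw Gen5 signaling rate per lane, transfers/s
ENCODING = 128 / 130      # 128b/130b line-encoding efficiency

def link_bandwidth_gb_s(lanes: int) -> float:
    """Unidirectional link bandwidth in GB/s for a given lane count."""
    return lanes * GT_PER_LANE * ENCODING / 8 / 1e9

print(f"x16 port (4-DIMM card):     ~{link_bandwidth_gb_s(16):.0f} GB/s")
print(f"two x8 ports (8-DIMM card): ~{2 * link_bandwidth_gb_s(8):.0f} GB/s")

# Capacity, using the 512GB RDIMMs cited in the article
for dimms in (4, 8):
    print(f"{dimms} x 512GB RDIMMs: {dimms * 512 / 1024:.0f} TB")
```

Both configurations work out to roughly 63 GB/s of raw link bandwidth, consistent with the quoted 64GB/s figure.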
SMART Modular’s AICs are built using CXL controllers to eliminate memory bandwidth bottlenecks and capacity constraints, Mills said. They are aimed at compute-intensive workloads such as AI, machine learning (ML) and high-performance computing (HPC), all of which demand more high-speed memory than current servers can accommodate.
Memory expansion negates the need for more costly CPUs
Mills said the introduction of SMART Modular’s AICs comes at a time when the company is seeing two basic needs emerging, the near-term one being a “compute memory performance capacity gap.” He said this gap can be addressed by adding more memory to a server without having to increase the number of CPUs.
The other trend is memory disaggregation, which Mills said is an overused term. “The problem with memory disaggregation has been lack of standards. CXL helps with that, and then networking technology has improved significantly.”
He said there is a great deal of real-world testing of CXL technology. “We’re going to get more into deployments now as we ship these products.”
A key benefit of being able to drop more memory into a server is that you can defer or reduce SSD paging for systems like in-memory databases; the Non-Volatile Memory Express (NVMe) protocol is not fast enough for real-time inference, Mills said.
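On recent Linux kernels, memory behind a CXL Type 3 expander typically surfaces as a CPU-less NUMA node, which is what lets an in-memory database use it without re-architecting the application. The minimal Python sketch below assumes that standard sysfs layout and flags CPU-less nodes as expander candidates; whether a given card enumerates this way depends on kernel, firmware and BIOS support.

```python
# Minimal sketch: list NUMA nodes and flag CPU-less ones, which is how
# CXL Type 3 expander memory typically appears on recent Linux kernels.
import glob
import os
import re

for node in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
    node_id = re.search(r"node(\d+)$", node).group(1)
    with open(os.path.join(node, "cpulist")) as f:
        cpus = f.read().strip()
    with open(os.path.join(node, "meminfo")) as f:
        mem_kb = int(re.search(r"MemTotal:\s+(\d+) kB", f.read()).group(1))
    label = f"CPUs {cpus}" if cpus else "CPU-less (candidate expander node)"
    print(f"node{node_id}: {mem_kb / 1024 / 1024:6.1f} GB, {label}")
```

Once the node is visible, standard NUMA tooling (numactl, cgroups, or the kernel's memory-tiering support) can steer allocations onto the expander without application changes.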
CXL overcomes the need to add more CPUs in a server environment, he added, which is an expensive path to adding performance. The idea behind SMART Modular’s AICs is that they can be dropped into an off-the-shelf server. “Just plug this card in and you haven’t had to re-architect the server. You’ve just suddenly added a tremendous amount of memory to it.”
In addition to the overall system cost savings, the reduction in the number of servers is appealing when there are space constraints, he said, as is not overprovisioning compute just to get additional memory.
In an interview with EE Times, Jim Handy, principal analyst at Objective Analysis, said that the most notable aspect of SMART Modular’s product rollout is that it puts the company in the position of being an early mover. “People aren’t really shipping CXL stuff yet.”
The company’s AICs do play into the core value proposition of CXL, he added, which is memory expansion and availability.
“CXL is kind of an odd bird because it started out as being something different,” Handy said. “It was the idea of using shared memory pools to get rid of what’s called stranded memory in data centers.” Servers were not using all the memory in them, but they had to have big memories installed just in case a big program happened to be assigned to that server, according to Handy.
CXL pulls all that disaggregated memory together into a pool that workloads can tap into for as long as they need it, he added. CXL 1.0, however, did not solve the pooling problem. “All it does is it allows you to put a very, very large memory into a single server.”
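To put rough numbers on the stranded-memory problem Handy describes, here is a toy calculation; the server count, typical and peak footprints, and concurrency figure are invented for illustration, not drawn from the article.

```python
# Toy illustration of stranded memory, per Handy's description. All
# numbers are made up for the example, not taken from the article.
SERVERS = 100
TYPICAL_GB = 256       # what most workloads on a server actually touch
PEAK_GB = 1024         # what the occasional big job needs
BIG_JOBS = 5           # at most this many big jobs run at once

provisioned = SERVERS * PEAK_GB                        # every server carries peak
pooled = SERVERS * TYPICAL_GB + BIG_JOBS * (PEAK_GB - TYPICAL_GB)

print(f"per-server peak provisioning: {provisioned / 1024:.0f} TB")
print(f"typical + shared pool:        {pooled / 1024:.0f} TB")
print(f"stranded under provisioning:  {(provisioned - pooled) / 1024:.0f} TB")
```

Under these made-up assumptions, provisioning every server for its occasional peak strands roughly 70TB of DRAM that a shared pool would avoid buying.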
Memory expansion feeds hungry AI systems
Memory pooling, with switch capabilities that serve multiple servers, was added later, and Handy sees memory expansion as the more valuable capability. “If AI continues down its current path, it looks like the servers that do AI are going to need to have just huge, huge, huge memories on them,” he said. “CXL will be the way to serve that up to them.”
Micron Technology is another early CXL mover, and its CXL CZ120 memory expansion module speaks to the trend toward adding more memory into a server to meet the demands of AI workloads rather than overprovision GPUs.
In a briefing with EE Times, Vijay Nain, senior director of CXL product management at Micron, said the company first introduced its CXL CZ120 memory expansion modules in August 2023, and now the module has hit a key qualification sample milestone.
He said the CZ120 has undergone substantial hardware testing for reliability, quality and performance across CPU providers and OEMs, as well as software testing for compatibility and compliance with operating system and hypervisor vendors.
Micron’s CZ120 modules come in 128GB and 256GB capacities in the E3.S 2T form factor, which uses a PCIe Gen5 x8 interface. “If you have eight different slots, you can get up to two terabytes of capacity expansion,” Nain said.
He added that Micron’s testing has also shown a 20% to 25% expansion in server memory bandwidth. “The big play here obviously is capacity, but also bandwidth.”
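The capacity and link arithmetic behind those figures is simple to check. The sketch below uses the module sizes and the eight-slot example from the article, with the same Gen5 rate assumptions as the earlier calculation.

```python
# Capacity and per-module link math for the CZ120 figures quoted above.
# Module sizes and the eight-slot example are from the article; the
# PCIe Gen5 rate assumptions mirror the earlier back-of-the-envelope.

SLOTS = 8
for module_gb in (128, 256):
    print(f"{SLOTS} x {module_gb}GB modules: {SLOTS * module_gb / 1024:.0f} TB added")

# Each E3.S module rides a Gen5 x8 link: 8 lanes * 32 GT/s * 128/130, in bytes
per_module_gb_s = 8 * 32e9 * (128 / 130) / 8 / 1e9
print(f"per-module link ceiling: ~{per_module_gb_s:.0f} GB/s")
```

How much of that roughly 32 GB/s per-module ceiling shows up as the quoted 20% to 25% system-level gain depends on the host's baseline DDR5 channel count, which the article does not specify.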
The qualification milestone means customers can take Micron’s samples, run a full test suite, and ship their own CXL solutions that leverage the CZ120 module. Nain said Micron is working with both server and switch vendors. “We’ve seen customers try out solutions where they just need such a massive memory footprint.” If they cannot get that footprint with direct-attached memory, they are happy to have a switch through which they can access more memory via the CXL modules, he said.
Being able to add memory with a CXL-enabled module has an appealing total cost of ownership story, Nain said, especially when trying to expand capacity and bandwidth to address AI and ML workloads.
“Everybody’s talking about the latest and greatest GPUs these days,” he said. “There’s an entire deployment base out there which is running on older, not so capable GPUs.” He added that Micron is trying to showcase that, regardless of the GPU, there is value in adding CXL memory to boost GPU utilization and reduce the need for costly high-bandwidth memory.
Striving toward a composable memory architecture
Nain added that GPUs are often underutilized because of a memory bottleneck that can be addressed by CXL memory expansion—a feature of the protocol that seems to be getting the most interest as a stepping stone to realizing CXL’s full potential.
“The promise of CXL is really disaggregated memory or composable memory,” he said. “To get there, you have a few different building blocks that need to fit into place.”
Nain sees the current CXL 2.0 activity around memory expansion as critical for vetting and validating the technology while working toward fully exploiting other capabilities, such as memory pooling.
“We still believe that the Holy Grail is getting to that composable memory architecture,” he said.