Meta Uses CXL for Memory Expansion to "Replace" DDR4

As DRAM prices spike and delivery lead times stretch across the industry, Meta has disclosed a memory-reuse strategy that repurposes DDR4 modules pulled from decommissioned servers rather than sending them to disposal. The approach, detailed by Meta researchers, lets the company expand server memory capacity without buying new DRAM, sidestepping what industry observers have dubbed the “RAM tax,” the elevated cost burden hyperscalers face amid tightening memory supply.

The expansion is enabled through CXL technology, which connects older DDR4 modules alongside newer DDR5 memory pools within the same machine. Rather than retiring DDR4 DIMMs when servers are decommissioned, Meta pools that capacity and makes it addressable as expanded memory attached via CXL to newer server fleets. Meta describes the result as delivering near-zero-cost memory expansion, while also cutting electronic waste and reducing the emissions footprint of its infrastructure. The timing lines up with ongoing memory supply constraints that continue to affect server deployment schedules across cloud environments globally.

Meta’s researchers were explicit about the technical hurdles that made prior CXL memory-expansion approaches impractical at hyperscale. Existing CXL implementations delivered roughly one-tenth the bandwidth of local, directly attached memory. Latency on expanded memory ran about 60% higher than on memory sitting directly next to the processor socket. Commercially available CXL products typically bundled the controller together with the DRAM module itself, which blocked any practical way to reuse existing DDR4 inventory at scale: you couldn’t just plug old, controller-less DIMMs into someone else’s CXL expansion card.

Those three constraints (bandwidth, latency, and the DRAM/controller coupling in commodity CXL products) are effectively why “just CXL-attach your old RAM” hadn’t become standard practice before Meta’s work.

To address this, Meta designed a custom ASIC and paired it with a software scheduler. The first component is Vistara, an in-house CXL ASIC engineered specifically around low latency, power efficiency, and reuse of recycled memory. It decouples the controller from the DIMMs so retired DDR4 modules can be attached without being tied to a specific vendor’s paired memory. The second component is a software layer built on TPP (Transparent Page Placement), which automatically determines the right ratio of local to expanded memory for each individual workload and automates per-workload configuration, including turning off expanded memory entirely for workloads that can’t tolerate the added latency.
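To make the per-workload idea concrete, here is a minimal sketch of how a latency budget could translate into a share of expanded memory. It is a toy model, not Meta's scheduler or TPP itself: the `Workload` class, its `max_latency_factor` field, and the linear averaging are all assumptions made for illustration. The only figure taken from the article is the roughly 60% latency penalty on expanded memory.

```python
from dataclasses import dataclass

# From the article: expanded (CXL) memory latency is about 60% higher than local.
CXL_LATENCY_FACTOR = 1.6


@dataclass
class Workload:
    name: str
    # Hypothetical knob: tolerated average memory latency relative to
    # all-local memory (1.0 means no slowdown is acceptable).
    max_latency_factor: float


def expanded_fraction(workload: Workload,
                      latency_factor: float = CXL_LATENCY_FACTOR) -> float:
    """Largest fraction of accesses that may be served from expanded memory.

    Assumes average latency is a simple weighted mean:
        (1 - f) * 1.0 + f * latency_factor <= max_latency_factor
    """
    if latency_factor <= 1.0:
        return 1.0  # expanded memory is no slower than local memory
    budget = workload.max_latency_factor - 1.0
    if budget <= 0.0:
        return 0.0  # workload cannot tolerate added latency: expansion off
    return min(1.0, budget / (latency_factor - 1.0))


if __name__ == "__main__":
    for w in (Workload("latency-critical", 1.0),
              Workload("cache", 1.3),
              Workload("batch", 2.0)):
        print(f"{w.name}: {expanded_fraction(w):.2f}")
```

In this model a workload with no latency headroom gets no expanded memory at all, which mirrors the article's point that expansion can be switched off per workload. A real placement policy would also have to account for page hotness and the bandwidth gap, which this sketch ignores.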

In terms of workloads, the company deployed this model for disaggregated ML inference, with a significantly reduced number of servers, and for distributed caching, a long-standing focus at Meta, which is historically known for its use of Memcache. For Meta and these applications, total RAM capacity appears to matter more than memory speed. In other words, cache misses drive the trade-off: presumably, a miss that falls through to a slower backend costs far more than the extra latency of a hit served from expanded memory.
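The capacity-over-speed argument can be illustrated with a simple average-access-cost calculation. This is a back-of-the-envelope sketch, not data from Meta: the hit rates, the miss cost, and the even split between local and expanded hits are made-up numbers chosen only to show the shape of the trade-off. The 1.6x hit cost for expanded memory comes from the roughly 60% latency penalty quoted above.

```python
def avg_access_cost(hit_rate: float, hit_cost: float, miss_cost: float) -> float:
    """Expected cost per lookup, in units of one local-DRAM access."""
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost


LOCAL_HIT = 1.0       # baseline: one local DRAM access
EXPANDED_HIT = 1.6    # about 60% slower, per the article
MISS_COST = 1000.0    # hypothetical: fetching from a backend store

# Hypothetical scenario A: smaller cache, local memory only.
small_fast = avg_access_cost(hit_rate=0.90, hit_cost=LOCAL_HIT, miss_cost=MISS_COST)

# Hypothetical scenario B: larger cache; half of the hits land in expanded memory.
mixed_hit = 0.5 * LOCAL_HIT + 0.5 * EXPANDED_HIT
large_slow = avg_access_cost(hit_rate=0.95, hit_cost=mixed_hit, miss_cost=MISS_COST)

if __name__ == "__main__":
    print(f"local only:    {small_fast:.1f}")
    print(f"with expansion: {large_slow:.1f}")
```

Under these assumed numbers the larger, slower cache comes out well ahead, because the miss penalty dominates the per-hit latency difference. If misses were cheap or hit rates barely improved with capacity, the conclusion would flip, which is why a per-workload decision matters.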

Leveraging internal engineering forces