Micron has introduced a high-density 256GB SOCAMM2 memory module designed specifically for AI servers. The module combines 64 monolithic 32Gb LPDDR5X dies, delivering exceptional capacity, bandwidth, and power efficiency for modern AI workloads.
Boosting AI Server Memory Capacity
Large language models and inference pipelines require vast memory pools, shifting performance bottlenecks to system memory. The new SOCAMM2 module addresses this by enabling up to 2TB of LPDDR5X capacity on an eight-channel server CPU, with one 256GB module per channel. That is a one-third increase over the previous 192GB modules, supporting larger context windows and memory-intensive inference.
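As a quick sanity check, the capacity figures above can be reproduced with simple arithmetic; the sketch below uses only numbers stated in this article.

```python
# Capacity arithmetic from the article's figures.
DIES_PER_MODULE = 64        # monolithic LPDDR5X dies per SOCAMM2 module
DIE_DENSITY_GBIT = 32       # 32 Gb (gigabits) per die
MODULES_PER_SYSTEM = 8      # one module per channel on an eight-channel CPU

module_gb = DIES_PER_MODULE * DIE_DENSITY_GBIT // 8   # 2048 Gb -> 256 GB
system_tb = module_gb * MODULES_PER_SYSTEM / 1024     # 2.0 TB
uplift = (module_gb - 192) / 192                      # vs. prior 192 GB modules

print(f"Module: {module_gb} GB, system: {system_tb} TB, uplift: {uplift:.0%}")
```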
“Micron’s 256GB SOCAMM2 offering enables the most power-efficient CPU-attached memory solution for both AI and HPC,” stated Raj Narasimhan, senior vice president and general manager of Micron’s Cloud Memory Business Unit. “Our continued leadership in low-power memory solutions for data center applications has uniquely positioned us to be the first to deliver a 32Gb monolithic LPDRAM die, helping drive industry adoption of more power-efficient, high-capacity system architectures.”
Superior Efficiency and Design
The module consumes about one-third the power of comparable RDIMMs and occupies one-third the physical space. This efficiency supports higher rack density in data centers, reduces thermal loads, and lowers infrastructure costs. Its modular SOCAMM2 design simplifies maintenance, works with liquid-cooled systems, and scales with growing AI model complexity.
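To make the efficiency claim concrete, here is an illustrative comparison. The one-third power ratio comes from the article; the absolute per-RDIMM wattage is a hypothetical placeholder, not a Micron specification.

```python
# Illustrative memory-power comparison for a 2 TB (eight-module) node.
# ASSUMPTION: the per-RDIMM wattage is a hypothetical placeholder, not a Micron spec.
RDIMM_POWER_W = 15.0                  # hypothetical per-module figure
SOCAMM2_POWER_W = RDIMM_POWER_W / 3   # article: ~one-third the power
MODULES = 8

savings_w = (RDIMM_POWER_W - SOCAMM2_POWER_W) * MODULES
print(f"Estimated memory power saved per 2 TB node: {savings_w:.0f} W")
```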
Performance Gains in AI Workloads
In unified memory architectures, the module achieves over 2.3x faster time-to-first-token for long-context inference via key-value cache offloading. Standalone CPU workloads show more than 3x better performance per watt compared to traditional server memory.
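The reported multipliers can be applied to a hypothetical baseline to see their practical effect; the baseline time-to-first-token below is illustrative, not a figure from Micron's testing.

```python
# Applying the reported speedups to a hypothetical baseline.
# ASSUMPTION: the baseline TTFT is illustrative, not from Micron's testing.
BASELINE_TTFT_S = 4.6       # hypothetical long-context time-to-first-token
TTFT_SPEEDUP = 2.3          # article: >2.3x via KV-cache offloading
PERF_PER_WATT_GAIN = 3.0    # article: >3x vs. traditional server memory

print(f"TTFT: {BASELINE_TTFT_S} s -> {BASELINE_TTFT_S / TTFT_SPEEDUP:.1f} s; "
      f"perf/W: {PERF_PER_WATT_GAIN:.0f}x baseline")
```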
Micron’s LPDDR5X portfolio includes components from 8GB to 64GB and SOCAMM2 modules from 48GB to 256GB. Customer samples of the 256GB module are now shipping.