Fully Buffered DIMM


Fully Buffered DIMM (FB-DIMM) is a memory technology that can be used to increase the reliability and density of memory systems. Conventionally, the data lines from the memory controller have to be connected to the data lines of every DRAM module, i.e. via multidrop buses. As memory width and access speed increase, the signal degrades at the interface between the bus and the device. This limits speed and memory density, so FB-DIMMs take a different approach to the problem.
240-pin DDR2 FB-DIMMs are neither mechanically nor electrically compatible with conventional 240-pin DDR2 DIMMs. As a result, those two DIMM types are notched differently to prevent using the wrong one.
As with nearly all RAM specifications, the FB-DIMM specification was published by JEDEC.

Technology

The fully buffered DIMM architecture introduces an advanced memory buffer (AMB) between the memory controller and the memory module. Unlike the parallel bus architecture of traditional DRAMs, an FB-DIMM has a serial interface between the memory controller and the AMB. This allows the memory width to be increased without raising the pin count of the memory controller beyond a feasible level. With this architecture, the memory controller does not write to the memory module directly; rather, writes go through the AMB. The AMB can thus compensate for signal deterioration by buffering and resending the signal.
The AMB can also offer error correction, without imposing any additional overhead on the processor or the system's memory controller. It can also use the Bit Lane Failover Correction feature to identify bad data paths and remove them from operation, which dramatically reduces command/address errors. Also, since reads and writes are buffered, they can be done in parallel by the memory controller. This allows simpler interconnects, and hardware-agnostic memory controller chips that can be used interchangeably.
The downsides to this approach are that it adds latency to each memory request, it requires additional power for the buffer chips, and current implementations create a memory write bus significantly narrower than the memory read bus. This means workloads that perform many writes will be significantly slowed. However, this slowdown is far less severe than the penalty of lacking enough memory capacity and having to rely heavily on virtual memory, so workloads that use very large amounts of memory in irregular patterns might still benefit from fully buffered DIMMs.

Protocol

The JEDEC standard defines the protocol, and JESD82-20 defines the AMB interface to DDR2 memory. The protocol is more generally described in many other places.
The FB-DIMM channel consists of 14 "northbound" bit lanes carrying data from memory to the processor and 10 "southbound" bit lanes carrying commands and data from the processor to memory. Each bit is carried over a differential pair, clocked at 12 times the basic memory clock rate, that is, 6 times the double-pumped data rate. For example, with DDR2-667 DRAM chips, the channel operates at 4000 MHz. Every 12 cycles constitute one frame: 168 bits northbound and 120 bits southbound.
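The arithmetic in the paragraph above can be checked with a short sketch. The function and constant names are illustrative only, and the DDR2-667 memory clock is taken as its nominal value of 1000/3 MHz.

```python
from fractions import Fraction

NORTHBOUND_LANES = 14      # memory -> processor
SOUTHBOUND_LANES = 10      # processor -> memory
BIT_TIMES_PER_FRAME = 12   # channel bit times per memory clock cycle


def lane_rate_mhz(memory_clock_mhz):
    """Per-lane bit rate: 12x the memory clock (6x the DDR data rate)."""
    return BIT_TIMES_PER_FRAME * memory_clock_mhz


def frame_bits(lanes):
    """Bits carried by one frame across the given number of lanes."""
    return lanes * BIT_TIMES_PER_FRAME


# Assumption: DDR2-667 has a nominal memory clock of 1000/3 MHz (about 333 MHz).
ddr2_667_clock = Fraction(1000, 3)
print(lane_rate_mhz(ddr2_667_clock))   # 4000
print(frame_bits(NORTHBOUND_LANES))    # 168
print(frame_bits(SOUTHBOUND_LANES))    # 120
```

Exact fractions are used so that the nominal clock of one third of a gigahertz does not introduce rounding noise into the result.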
One northbound frame carries 144 data bits, the amount of data produced by a 72-bit wide DDR SDRAM array in that time, and 24 bits of CRC for error detection. There is no header information, although unused frames include a deliberately invalid CRC.
One southbound frame carries 98 payload bits and 22 CRC bits. Two payload bits are a frame type, and 24 bits are a command. The remaining 72 bits may be 72 bits of write data, two more 24-bit commands, or one more command plus 36 bits of data to be written to an AMB control register.
The commands correspond to standard DRAM access cycles, such as row select, precharge, and refresh commands. Read and write commands include only column addresses. All commands include a 3-bit FB-DIMM address, allowing up to 8 FB-DIMM modules on a channel.
Because write data is supplied more slowly than DDR memory expects it, writes are buffered in the AMB until they can be written in a burst. Write commands are not directly linked to the write data; instead, each AMB has a write data FIFO that is filled by four consecutive write data frames, and is emptied by a write command.
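The decoupling of write data from write commands can be modelled with a toy FIFO. This is a minimal sketch, not the AMB's actual logic: the class and method names are hypothetical, and the choice to reject a write command when fewer than four frames are buffered is an assumption made for illustration.

```python
from collections import deque

FRAMES_PER_BURST = 4    # write data frames consumed by one write command
WRITE_DATA_BITS = 72    # write data bits carried by one southbound frame


class AmbWriteFifo:
    """Toy model of the per-AMB write data FIFO (hypothetical interface)."""

    def __init__(self):
        self._frames = deque()

    def push_frame(self, data):
        """Buffer the 72 write data bits of one southbound frame."""
        if not 0 <= data < (1 << WRITE_DATA_BITS):
            raise ValueError("write data must fit in 72 bits")
        self._frames.append(data)

    def write_command(self):
        """Drain one burst of buffered data, oldest frame first.

        Assumption: a command arriving before four frames are buffered
        is treated as an error in this model.
        """
        if len(self._frames) < FRAMES_PER_BURST:
            raise RuntimeError("not enough buffered write data")
        return [self._frames.popleft() for _ in range(FRAMES_PER_BURST)]


fifo = AmbWriteFifo()
for word in (0x11, 0x22, 0x33, 0x44):
    fifo.push_frame(word)
burst = fifo.write_command()
print([hex(w) for w in burst])  # ['0x11', '0x22', '0x33', '0x44']
```

Four frames of 72 bits give 288 bits, which is the amount a 72-bit wide DDR array accepts in a burst of four transfers.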
Both northbound and southbound links can operate at full speed with one bit lane disabled, by discarding 12 bits of CRC information per frame.
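As a quick consistency check, the sketch below works out how many CRC bits remain per frame when one lane is dropped, using only the lane counts and frame layouts given above. The dictionary layout and function name are illustrative.

```python
BIT_TIMES_PER_FRAME = 12

# Lane counts and frame layouts as described in the text.
LINKS = {
    "northbound": {"lanes": 14, "payload_bits": 144, "crc_bits": 24},
    "southbound": {"lanes": 10, "payload_bits": 98, "crc_bits": 22},
}


def failover_crc_bits(link):
    """CRC bits left per frame when one bit lane is disabled."""
    spec = LINKS[link]
    remaining_frame_bits = (spec["lanes"] - 1) * BIT_TIMES_PER_FRAME
    return remaining_frame_bits - spec["payload_bits"]


print(failover_crc_bits("northbound"))  # 12
print(failover_crc_bits("southbound"))  # 10
```

In both directions the payload still fits, and the loss of 12 bits per frame falls entirely on the CRC.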
Note that the bandwidth of an FB-DIMM channel is equal to the peak read bandwidth of a DDR memory channel, plus half of the peak write bandwidth of a DDR memory channel. The only overhead is the need for a channel sync frame every 32 to 42 frames.
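The bandwidth statement can be illustrated for DDR2-667. This is a sketch under stated assumptions: rates count all 72 bits of the array width, the memory clock is taken as its nominal 1000/3 MHz, and the sync overhead is modelled as one frame out of every n being a sync frame, which is one reading of the sentence above.

```python
from fractions import Fraction

READ_BITS_PER_FRAME = 144    # northbound data bits per frame
WRITE_BITS_PER_FRAME = 72    # southbound write data bits per frame


def peak_rates_mbit(memory_clock_mhz):
    """Peak (read, write) rates in Mbit/s; one frame per memory clock."""
    read = READ_BITS_PER_FRAME * memory_clock_mhz
    write = WRITE_BITS_PER_FRAME * memory_clock_mhz
    return read, write


def usable_fraction(sync_interval):
    """Share of frames left for traffic if 1 in sync_interval is a sync frame."""
    return Fraction(sync_interval - 1, sync_interval)


read, write = peak_rates_mbit(Fraction(1000, 3))
print(read, write)                              # 48000 24000
print(usable_fraction(32), usable_fraction(42)) # 31/32 41/42
```

The write rate comes out at exactly half the read rate, which matches the "plus half of the peak write bandwidth" figure, since a DDR channel of the same width could accept 144 bits per memory clock.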

Implementations

Intel adopted the technology, which it considers "a long-term strategic direction for servers", for its Xeon 5000/5100 series and beyond.
Sun Microsystems used FB-DIMMs for the Niagara II server processor.
Intel's enthusiast system platform Skulltrail uses FB-DIMMs for its dual CPU socket, multi-GPU system.
FB-DIMMs have 240 pins and are the same total length as other DDR DIMMs but differ by having indents on both ends within the slot.
The cost of FB-DIMM memory was initially much higher than registered DIMM, which may be one of the factors behind its current level of acceptance. Also, the AMB chip dissipates considerable heat, leading to additional cooling problems. Although strenuous efforts were made to minimize delay in the AMB, there is some noticeable cost in memory access latency.

History

As of September 2006, AMD had taken FB-DIMM off its roadmap. In December 2006, an AMD presentation slide revealed that microprocessors based on the new K10 microarchitecture would support FB-DIMM "when appropriate". In addition, AMD developed the Socket G3 Memory Extender, which uses a single buffer for every 4 modules instead of one for each, to be used by Opteron-based systems in 2009.
At the 2007 Intel Developer Forum, it was revealed that major memory manufacturers had no plans to extend FB-DIMM to support DDR3 SDRAM. Instead, only registered DIMMs for DDR3 SDRAM had been demonstrated.
In 2007, Intel demonstrated FB-DIMMs with shorter latencies, CL5 and CL3.
On August 5, 2008, Elpida Memory announced that it would mass-produce the world's first 16-gigabyte FB-DIMM starting in Q4 2008. However, the product has not appeared, and the press release has been deleted from Elpida's site.