What Is InfiniBand?
InfiniBand is a high-bandwidth, low-latency network fabric used for GPU-to-GPU communication across multiple racks in AI training clusters. NDR (400 Gb/s) and XDR (800 Gb/s) are the current generations. The fabric relies on specialized cables (copper DAC for short runs, AOC or fiber with transceivers for longer distances) and requires dedicated switch infrastructure.
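
To make the generation names concrete, here is a minimal Python sketch that maps each one to its per-port rate and scales it to a node. The NDR and XDR rates come from the description above (plus the well-known 200 Gb/s HDR rate for context); the eight-ports-per-node count is purely an illustrative assumption, not a statement about any specific system.

```python
# Minimal sketch: per-port data rates for the InfiniBand generations named
# above (plus the well-known 200 Gb/s HDR rate for context). The
# eight-ports-per-node figure is an illustrative assumption, not a spec.

IB_GENERATION_GBPS = {
    "HDR": 200,
    "NDR": 400,
    "XDR": 800,
}

def node_fabric_bandwidth_gbps(generation: str, ports_per_node: int = 8) -> int:
    """Aggregate unidirectional fabric bandwidth for one node, in Gb/s."""
    return IB_GENERATION_GBPS[generation] * ports_per_node

if __name__ == "__main__":
    for gen in ("NDR", "XDR"):
        print(f"{gen}: {node_fabric_bandwidth_gbps(gen)} Gb/s per node (8 ports assumed)")
```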
Technical Details
InfiniBand provides RDMA (Remote Direct Memory Access), which is essential for distributed AI training. RDMA allows GPUs to read from and write to each other's memory directly across the network without CPU involvement, dramatically reducing latency. InfiniBand networks typically use a fat-tree topology with leaf and spine switches to provide full bisection bandwidth.

NDR InfiniBand operates at 400 Gb/s per port, while the newer XDR generation reaches 800 Gb/s. Cable selection depends on distance: passive DAC up to 3 m, active DAC up to 5 m, and AOC or fiber with transceivers for longer runs. Cable quality and termination precision directly affect network performance.
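
The distance thresholds and the full-bisection point above reduce to simple rules, sketched below in Python under stated assumptions: the cable cutoffs mirror the paragraph, and the nonblocking check assumes uplinks and downlinks run at the same port speed.

```python
# Minimal sketch of two rules from the paragraph above: the distance-based
# cable choice (passive DAC up to 3 m, active DAC up to 5 m, AOC/fiber
# beyond) and the fat-tree condition for full bisection bandwidth. Real
# designs also depend on vendor reach specs and signal-integrity margins.

def select_cable(distance_m: float) -> str:
    """Pick a cable class for a single InfiniBand link of the given length."""
    if distance_m <= 3.0:
        return "passive DAC"
    if distance_m <= 5.0:
        return "active DAC"
    return "AOC or fiber with transceivers"

def leaf_is_nonblocking(downlink_ports: int, uplink_ports: int) -> bool:
    """In a two-tier leaf/spine fat-tree with equal port speeds, full bisection
    bandwidth requires each leaf to dedicate at least as many ports upward
    (to spines) as downward (to hosts)."""
    return uplink_ports >= downlink_ports

if __name__ == "__main__":
    for d in (1.0, 4.0, 30.0):
        print(f"{d:>5.1f} m -> {select_cable(d)}")
    print("32-down/32-up leaf nonblocking:", leaf_is_nonblocking(32, 32))
```

In practice, cable choice also weighs vendor reach specifications, power draw, and cost, which the sketch ignores.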
How Leviathan Systems Works with InfiniBand
Leviathan Systems installs and tests InfiniBand network infrastructure for GPU clusters, including cable routing, switch placement, and performance validation with insertion loss and return loss testing.
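
For readers curious about what insertion-loss and return-loss testing measures, the sketch below shows the standard dB calculations in Python; the formulas are textbook, but the pass/fail thresholds are illustrative assumptions rather than Leviathan Systems' actual acceptance limits.

```python
# Minimal sketch of the dB math behind insertion-loss and return-loss
# checks. The formulas are standard; the pass/fail budget values are
# illustrative assumptions, not actual test limits.

import math

def insertion_loss_db(p_in_mw: float, p_out_mw: float) -> float:
    """Insertion loss: power lost passing through the link, in dB."""
    return 10.0 * math.log10(p_in_mw / p_out_mw)

def return_loss_db(p_incident_mw: float, p_reflected_mw: float) -> float:
    """Return loss: ratio of incident to reflected power, in dB (higher is better)."""
    return 10.0 * math.log10(p_incident_mw / p_reflected_mw)

def link_passes(il_db: float, rl_db: float,
                max_il_db: float = 1.5, min_rl_db: float = 20.0) -> bool:
    """Check a measured link against an assumed budget (example values only)."""
    return il_db <= max_il_db and rl_db >= min_rl_db

if __name__ == "__main__":
    il = insertion_loss_db(p_in_mw=1.00, p_out_mw=0.79)            # ~1.0 dB
    rl = return_loss_db(p_incident_mw=1.00, p_reflected_mw=0.005)  # ~23 dB
    print(f"IL = {il:.2f} dB, RL = {rl:.2f} dB, pass = {link_passes(il, rl)}")
```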