Scientific, engineering, and research facilities rely on InfiniBand fabrics because they offer the highest available bandwidth and the lowest available latency. But depending on the design of the InfiniBand host channel adapters (HCAs), this advantage can be squandered as the number of compute nodes scales into the hundreds or thousands. One of the main challenges in scaling efficiently is how and where the InfiniBand protocol is processed.

Adapter-based vs. host-based processing
There are two basic ways to handle protocol processing, and the choice can make a huge difference in overall fabric performance, particularly as a cluster scales. Some vendors rely heavily on adapter-based ('offload') processing techniques, in which each InfiniBand host channel adapter (HCA) includes an embedded microprocessor that processes the communications protocols. Other vendors primarily use host-based ('onload') processing, in which the server's own processors handle the communications protocols. In the early days of InfiniBand clusters, a typical server might have had just one or two single- or dual-core processors. Able to issue only one instruction per clock cycle at a relatively low clock rate, these servers benefited from having communications processing offloaded to the host channel adapter.

(Full version of this article can be obtained from HPCwire's web pages)