Connecting tens, hundreds or even thousands of compute server nodes into clusters for high performance computing (HPC) has always had the interconnect between nodes as one of its biggest challenges.
Providing maximum bandwidth between all nodes - both point to point and all-to-all at the same time - is already a big enough challenge, requiring very high speed switches or exotic topologies such as 3D or 4D torus and hypercube arrangements. Then, equally importantly - and unlike in a typical commercial datacentre - the latency between nodes matters just as much, since many HPC applications have dependencies between the threads running on different nodes: a thread stalled waiting on a remote result leaves cycles idle that no amount of extra bandwidth can recover.
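As a rough illustration of why designers reach for such topologies - with the node count chosen purely as an assumption, not a figure from this article - here is a small C sketch comparing worst-case hop counts for a 3D torus and a hypercube of the same size; fewer hops generally means lower worst-case latency.

    /* A hedged sketch: worst-case hop counts for two classic HPC
     * topologies. The 512-node size is an illustrative assumption. */
    #include <stdio.h>

    /* k-ary n-dimensional torus: each dimension wraps around, so the
     * longest path per dimension is floor(k/2) hops. */
    static int torus_diameter(int k, int n) { return n * (k / 2); }

    /* n-dimensional hypercube (2^n nodes): one hop per differing
     * address bit, so the diameter is n. */
    static int hypercube_diameter(int n) { return n; }

    int main(void) {
        /* 512 nodes arranged as an 8x8x8 torus or a 9-D hypercube */
        printf("8-ary 3D torus, 512 nodes: diameter %d hops\n",
               torus_diameter(8, 3));   /* 3 * 4 = 12 hops */
        printf("9-D hypercube,  512 nodes: diameter %d hops\n",
               hypercube_diameter(9));  /* 9 hops */
        return 0;
    }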
And then each interconnect must carry as much of its own intelligence and protocol handling as possible, so as not to disturb the CPUs, which - no matter how many of them there are - are always expected to run close to 100 per cent busy in this kind of system. Finally, all of that has to be supported out of the box by the applications, since users often cannot recompile the code themselves, and it has to be affordable - none of which is easy to achieve.
There are quite a few proprietary interconnects with varying levels of performance. HPC users will be familiar with Myrinet from Myricom, the mainstay of cluster connections in the early part of the past decade; QsNet from Quadrics, right there in Bristol, which offered the highest bandwidth and lowest latency along with native shared memory capability, avoiding the need for message passing; and the switchless torus interconnect from Dolphin - now Numascale - with some protocol capabilities similar to QsNet's.
However, even though they may be faster or more feature-rich, these interconnects have been pushed aside by 10 Gigabit Ethernet - 10Gb Ethernet or 10GE for short - as well as InfiniBand, or IB. 10GE is simply a faster version of the existing Ethernet: it preserves critical full application compatibility through the TCP/IP stack and provides higher bandwidth, but does not really lower latency.
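To make the latency point concrete, here is a minimal, hedged C sketch of a TCP ping-pong microbenchmark of the kind commonly used to measure round-trip time over Ethernet; the port, message size and iteration count are illustrative assumptions, not values from any particular tool, and most error handling is trimmed for brevity.

    /* A hedged sketch, not a production tool: a minimal TCP ping-pong
     * for measuring round-trip latency between two hosts.
     * Usage: "./pingpong server" on one node,
     *        "./pingpong client <server-ip>" on the other. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <time.h>
    #include <unistd.h>

    #define PORT  5000
    #define ITERS 10000
    #define MSG   8                /* bytes per message */

    /* Send or receive exactly MSG bytes, looping over short transfers. */
    static void xfer(int fd, char *buf, int sending) {
        size_t done = 0;
        while (done < MSG) {
            ssize_t n = sending ? write(fd, buf + done, MSG - done)
                                : read(fd, buf + done, MSG - done);
            if (n <= 0) { perror("xfer"); exit(1); }
            done += (size_t)n;
        }
    }

    int main(int argc, char **argv) {
        int one = 1, fd;
        char buf[MSG] = {0};
        int server = (argc > 1 && strcmp(argv[1], "server") == 0);
        if (!server && argc < 3) {
            fprintf(stderr, "usage: %s server | client <ip>\n", argv[0]);
            return 1;
        }

        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_port = htons(PORT);

        int s = socket(AF_INET, SOCK_STREAM, 0);
        if (server) {
            addr.sin_addr.s_addr = INADDR_ANY;
            setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);
            bind(s, (struct sockaddr *)&addr, sizeof addr);
            listen(s, 1);
            fd = accept(s, NULL, NULL);
        } else {
            inet_pton(AF_INET, argv[2], &addr.sin_addr);
            connect(s, (struct sockaddr *)&addr, sizeof addr);
            fd = s;
        }
        /* Disable Nagle so small messages are sent immediately. */
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < ITERS; i++) {
            if (server) { xfer(fd, buf, 0); xfer(fd, buf, 1); } /* echo */
            else        { xfer(fd, buf, 1); xfer(fd, buf, 0); } /* ping */
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double us = (t1.tv_sec - t0.tv_sec) * 1e6
                  + (t1.tv_nsec - t0.tv_nsec) / 1e3;
        printf("%.2f us per round trip\n", us / ITERS);
        return 0;
    }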
InfiniBand - born of the merger of the Intel-backed Next Generation I/O effort with the rival Future I/O project - has become a quasi standard maintained by a group of network product vendors, and provides very high bandwidth: up to 40Gbps in each direction in the QDR version. Its latency is also much lower than Ethernet's, dipping below 2 microseconds for a remote message send on the fastest adapter and switch combinations - some 10 times better than typical 10GE before the latter's 'acceleration'. IB enjoys very decent application support in high performance computing these days; however, its protocol stack is fattened by its envisioned role as a common fabric for everything from storage access to networking and clustering, which naturally increases CPU load and latency.
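For comparison, here is the MPI analogue of the socket ping-pong above, sketched as it might be run over an IB fabric through an MPI library such as Open MPI or MVAPICH; again, the message size and iteration count are illustrative assumptions. Halving the measured round-trip gives the one-way latency figure usually quoted for adapters.

    /* A hedged sketch: an MPI ping-pong between two ranks. Build with
     * mpicc and launch across two nodes, for example:
     * "mpirun -np 2 --host nodeA,nodeB ./pingpong". */
    #include <mpi.h>
    #include <stdio.h>

    #define ITERS 10000
    #define MSG   8

    int main(int argc, char **argv) {
        int rank;
        char buf[MSG] = {0};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        int peer = 1 - rank;            /* assumes exactly two ranks */

        MPI_Barrier(MPI_COMM_WORLD);    /* start both sides together */
        double t0 = MPI_Wtime();

        for (int i = 0; i < ITERS; i++) {
            if (rank == 0) {            /* ping: send, then await echo */
                MPI_Send(buf, MSG, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else {                    /* pong: await, then echo back */
                MPI_Recv(buf, MSG, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
            }
        }

        double us = (MPI_Wtime() - t0) * 1e6 / ITERS;
        if (rank == 0)
            printf("round trip: %.2f us, one-way roughly %.2f us\n",
                   us, us / 2);

        MPI_Finalize();
        return 0;
    }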