Liquid Computing Starts Shipping LiquidIQ Servers
Published: November 1, 2006
by Timothy Prickett Morgan
Canadian server maker--there are three words you don't normally see next to each other--Liquid Computing has begun shipping its innovative Opteron-based LiquidIQ servers. And the third release of the LiquidIQ system, which is the first commercially available version of the platform, is a lot more scalable than the prior two generations were, weighing in at 960 dual-core sockets. However, the LiquidIQ machines are still based on the Rev E Opterons, not the current "Santa Rosa" Rev F Opterons that Advanced Micro Devices launched during the summer.
Liquid Computing gave a sneak peek at the LiquidIQ box, which it bills as the world's first interconnect-driven server, last November, when it divulged the details of its alpha machine. The interesting thing about the LiquidIQ design is that it borrows ideas from the telecom industry, and that is no coincidence, since Liquid Computing's founders worked for Nortel, the Canadian telecom giant, as well as for the Defense Advanced Research Projects Agency (DARPA), which gave the world the Internet, among other things.
The heart of the LiquidIQ system is IQInterconnect, an extremely low-latency, very high-bandwidth switching infrastructure and the intellectual property in which Liquid Computing has invested heavily. This interconnect can be used to take a bunch of Opteron-based cell boards and configure them as a supercomputer cluster that uses the Message Passing Interface (MPI) protocol commonly used in Linux-X64 clusters. This is interesting. But what is really interesting is that with a few commands, some or all of the LiquidIQ machine can be converted into what we would call under normal circumstances a symmetric multiprocessing (SMP) server--a standalone, shared-memory machine that can run a single copy of an operating system. And, just for kicks, you could actually mix and match MPI and SMP nodes in the same machine. This is the "liquid" part of the LiquidIQ name.
"When I tell my wife what we do at Liquid Computing," explains Andrew Church, who is the head of marketing at the Ottawa, Ontario, startup, "I tell her simply that we drive a truck right between Cisco Systems and IBM."
The architecture of the LiquidIQ box can support thousands of processors, but the first generation of production boxes will support 960 processor sockets--that is 12 chassis lashed together into a single image or carved up into a parallel supercomputer cluster. According to Church, the first production LiquidIQ boxes use the dual-core Rev E Opterons and their DDR1 main memory, but boards based on the Rev F Opterons and their faster DDR2 main memory are being fabbed now and should be in production by the end of the year. Church says that companies that invest in Rev E boxes today will be able to build out chassis with Rev F boards and interoperate the two, thereby protecting their investments.
The LiquidIQ R3 server uses Opteron 800 series processors, and delivers 41.6 gigaflops of peak performance per compute module (what some people call a cell board) and 832 gigaflops per chassis. With a dozen chassis connected--which is the standard top-end configuration but not the ultimate limit of scalability of the box--a LiquidIQ R3 machine could deliver nearly 10 teraflops of number-crunching power. Each compute module has four processor sockets and can support up to 64 GB of main memory, and 240 of these modules can be plugged together to get that 960-socket count. A single chassis can have up to 20 compute modules. The modules have native Gigabit Ethernet copper and 10 Gigabit Ethernet fiber links to outside networks, and can be equipped with optional 2 Gbps and 4 Gbps Fibre Channel host bus adapters for storage area network connectivity. Each compute module has a peak power consumption of 700 watts.
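The figures in the paragraph above can be cross-checked with a little arithmetic. The following Python sketch is only an illustration: the constant and function names are made up for this example, the per-module and per-chassis numbers are the ones quoted in the article, and the 2.6 GHz clock speed and two floating point operations per core per cycle are assumptions (the article does not state them) that happen to reproduce the 41.6 gigaflops per module.

```python
# Figures quoted in the article for the LiquidIQ R3.
SOCKETS_PER_MODULE = 4
MODULES_PER_CHASSIS = 20
CHASSIS_PER_SYSTEM = 12        # "a dozen" in the standard top-end configuration
GFLOPS_PER_MODULE = 41.6       # peak
WATTS_PER_MODULE = 700         # peak, compute modules only

# Assumptions, NOT stated in the article: clock speed and flops per cycle.
ASSUMED_CLOCK_GHZ = 2.6
ASSUMED_FLOPS_PER_CYCLE = 2
CORES_PER_SOCKET = 2           # dual-core Rev E Opterons, per the article


def module_peak_gflops():
    """Peak gigaflops per compute module under the assumptions above."""
    cores = SOCKETS_PER_MODULE * CORES_PER_SOCKET
    return cores * ASSUMED_CLOCK_GHZ * ASSUMED_FLOPS_PER_CYCLE


def system_totals(chassis=CHASSIS_PER_SYSTEM):
    """Aggregate module, socket, gigaflops, and power figures for a system."""
    modules = chassis * MODULES_PER_CHASSIS
    return {
        "modules": modules,
        "sockets": modules * SOCKETS_PER_MODULE,
        "peak_gflops": round(modules * GFLOPS_PER_MODULE, 1),
        "peak_compute_kw": modules * WATTS_PER_MODULE / 1000,
    }


if __name__ == "__main__":
    print(system_totals(1))   # one chassis
    print(system_totals())    # twelve chassis
```

Run for a single chassis and for a dozen, the sketch gives 832 gigaflops and 9,984 gigaflops respectively, which matches the "nearly 10 teraflops" claim, and it puts the peak draw of 240 compute modules at 168 kilowatts before switch and I/O modules are counted.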
Each chassis also has a lot of system interconnection electronics. The IQInterconnect is implemented in intra-chassis switch modules, which plug into the Opterons' HyperTransport links and deliver up to 16 GB/sec of bi-directional bandwidth between the compute modules in the chassis. The interconnect moves data between the compute modules with a latency of around 2.5 microseconds, according to Church. This is very low, and is one of the reasons why the supercomputer labs, telecom companies, and service providers are interested in testing out the box. Liquid Computing has also created multi-chassis switch modules for linking racks together, and I/O modules for talking to peripherals. These I/O modules can deliver 2,000 Gb/sec of aggregate bandwidth per chassis--which is a lot.
On the software front, the LiquidIQ R3 server runs Red Hat Enterprise Linux AS 4, and comes with NFS, GFS, and Lustre file systems. Any file system supported by Red Hat will work on the storage. The machines support common middleware and databases, and are especially suited to the cluster-enabled versions of Oracle 10g and IBM DB2 databases. Way down under the operating system is the IQInterconnect firmware, which virtualizes all of the resources in the system--processors, memory, I/O--and which allows the machine to be sliced and diced on the fly. You can even use Xen hypervisors to further partition processors, if you want.
Why is the LiquidIQ box different? "The national labs buy the lowest-cost flops they can get--and then they spend a fortune each year tweaking their software," explains Church. "We are far more important to the enterprises and service providers who are facing the same scalability and reliability issues as the supercomputer labs, but who do not have that kind of software budget."
That is not to say that the supercomputer labs are not interested. In fact, Lawrence Livermore National Laboratory is testing a LiquidIQ box right now, and in early tests, the machine is showing six times the sustained bandwidth of other Opteron-based machines.
The one thing that the LiquidIQ box does not have--and which it needs--is a published list price. Church is cagey about pricing, but says that any company or lab that is looking at buying 16 or more processors and linking them with 10 Gigabit Ethernet should give Liquid Computing a call.
As for future roadmaps, Church says that Liquid Computing has a port to Intel processors on its roadmap, but that this will not happen before 2008 and that it could be later. (It is logical to assume that Liquid Computing is waiting for Intel to ditch the front side bus architecture of its Xeon and Itanium processors and move to the HyperTransport-like Common System Interface, which was expected next year but which has been delayed.) And, Liquid Computing could look to other architectures beyond X64, too, such as Power and PowerPC. "If there was a business need, we could certainly do it," boasts Church. A LiquidIQ server using a mix of Opteron and Cell processors might be a very interesting thing, indeed.
By the end of the year, Liquid Computing will support Novell's SUSE Linux Enterprise Server 10. Microsoft's Windows Server 2003 is running in the lab now on the boxes and is on the roadmap for delivery later this year. Church says that there is a "considerable amount" of driver work that Liquid Computing has to do to get Windows working.
FreeBSD, an open source Unix variant that is popular with some labs and large service providers, could also be ported to the box, and if enough people ask for it, Sun Microsystems' Solaris Unix could also make it over. Service providers are pretty adamant about Linux, but among the 15 early trials the company has for the LiquidIQ R3 iron, two are asking for Solaris.
RELATED STORY
Liquid Computing Jumps into the Servers with a Big Splash