Cray Buys Opteron-Linux HPC Upstart OctigaBay for $115 Million
by Timothy Prickett Morgan
Cray announced last week that it would pony up just under $15 million in cash and another $100 million in stock to acquire a little-known but impressive maker of high performance computing (HPC) systems called OctigaBay. That upstart, which was founded two years ago and has just started talking publicly about its Opteron-based departmental and divisional supercomputers, is in the finishing stages of development of the eponymous OctigaBay 12K systems.
With Cray already spearheading the development of very powerful Opteron-based systems thanks to its "Red Storm" Opteron-Linux supercomputer and its commercial "Strider" spinoffs due this year, you might think that the company already had enough expertise with Opterons and high-speed interconnects to simply crush OctigaBay before it even got a toehold in the HPC market. However, the OctigaBay design is very clever, and scales from a single twelve-processor machine to one with thousands of Opterons, spanning from about 48 gigaflops to 1.5 teraflops. The Strider variants of the Red Storm architecture can only scale down to a teraflops or so, which left a big opening for OctigaBay to chase and Cray to miss. With OctigaBay, Cray now says it can address the entire $5 billion market for HPC equipment, from small OctigaBay systems, up to bigger boxes, then to smaller Strider boxes, and then on up to its Cray X1 supers and their home-grown multistreaming parallel vector processors, which can scale to over 50 teraflops.
Red Storm, which is supposed to be operational in mid-2004 (a bit earlier than expected), is being developed under a $90 million contract with the DOE's ASCI nuclear weapons design and monitoring program, and will result in a first phase machine that should hit either 20 teraflops or 40 teraflops of peak performance. (This machine will replace the ASCI Red massively parallel supercomputer built by Intel for Sandia, which has 9,632 333 MHz Pentium II processors and is rated at 2.4 teraflops.) In a talk with Wall Street analysts early this year, Jim Rottsolk, Cray's CEO, said that a key custom component of the Red Storm architecture (undoubtedly relating to system and memory interconnects) had been taped out and delivered to IBM Microelectronics for manufacturing in late December. He also said that as AMD delivers faster speed bumps for the Opteron processors (moving from 2.2 GHz to 2.6 GHz to 3 GHz and higher), Cray would launch kickers to the Striders that make use of these faster processors. He added that in general Cray was "processor agnostic," but that obviously with the Red Storm architecture, it was heavily dependent on the AMD Opteron and HyperTransport interconnect. The OctigaBay deal makes it an even more enthusiastic supporter of Opteron.
While OctigaBay was boasting that the 12K system can be extended to as many as 12,000 Opteron processors in a single system, the odds now favor Cray scaling back the OctigaBay machines to give the presumably higher-end Striders a place between the OctigaBay machines in the entry and midrange HPC markets and the Cray X1 at the high-end. We say presumably since this was clearly the impression that Cray was trying to give Wall Street as it announced the deal, but the technical specifics of Red Storm and its Strider variants are unknown. But with each 1.8 GHz Opteron processor yielding 4 gigaflops, an OctigaBay system with 12,000 processors would have an aggregate raw performance of about 48 teraflops. Such a monster machine using current 2.2 GHz Opterons could hit 58 teraflops, and with 3 GHz Opterons could hit 80 teraflops. A full-blown OctigaBay would have 1,000 shelves, 1 Pbit/sec of aggregate bandwidth, and 96 TB of memory; it would probably cost between $50 million and $75 million, depending on how deeply OctigaBay wanted to discount based on the pricing it loosely set last November.
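The performance figures above follow from simple scaling, and a short sketch makes the arithmetic easy to check. It assumes that peak performance scales linearly with clock speed from the 4 gigaflops quoted for a 1.8 GHz Opteron, and that a full system is 1,000 shelves of 12 Opterons and 96 GB each; the function and constant names are ours, chosen for illustration.

```python
# Back-of-the-envelope peak performance for a large OctigaBay 12K.
# Assumption: peak flops scale linearly with clock speed, starting from
# the 4 gigaflops per 1.8 GHz Opteron quoted in the article.
BASE_CLOCK_GHZ = 1.8
BASE_GFLOPS_PER_CPU = 4.0
OPTERONS_PER_SHELF = 12
MEMORY_PER_SHELF_GB = 96


def peak_teraflops(clock_ghz, processors):
    """Aggregate raw peak in teraflops for a given clock and CPU count."""
    gflops_per_cpu = BASE_GFLOPS_PER_CPU * clock_ghz / BASE_CLOCK_GHZ
    return gflops_per_cpu * processors / 1000.0


def shelves_needed(processors):
    """Number of 12-Opteron shelves required, rounded up."""
    return -(-processors // OPTERONS_PER_SHELF)


def total_memory_tb(shelves):
    """Total main memory in (decimal) terabytes."""
    return shelves * MEMORY_PER_SHELF_GB / 1000.0


if __name__ == "__main__":
    cpus = 12_000
    shelves = shelves_needed(cpus)
    print(f"{cpus} Opterons -> {shelves} shelves, "
          f"{total_memory_tb(shelves):.0f} TB of memory")
    for clock in (1.8, 2.2, 3.0):
        print(f"{clock} GHz: about {peak_teraflops(clock, cpus):.1f} teraflops peak")
```

These are raw peak numbers, not sustained performance on real workloads, which would depend heavily on the interconnect and the application.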
So what is clear is that the OctigaBay 12K is not just an entry or midrange HPC box. What OctigaBay is, then, is a credible threat that cannot be allowed to fall into the hands of Cray's rivals, who were apparently sniffing at the company but not willing to buy just yet. So Cray snapped at the opportunity, and the 66 people at OctigaBay can now avoid the hard work of building a sales organization and competing against Cray's Strider line and what will probably be a slew of Opteron-based HPC systems that come onto the market.
Like all Cray machines since the 1970s, the Red Storm and X1 supers are all about memory and I/O bandwidth and designing specific computers to solve specific kinds of problems. Like Cray designs, the OctigaBay systems are not architected like normal servers, even though they use Opteron chips and other standard components. There is a lot of engineering in all three of these machines.
The base OctigaBay component is a shelf, which is a 3.5U rack-mounted chassis that has a total of 31 processors crammed inside. A dozen of the processors are AMD Opterons, which are organized as six two-way servers that can deliver 58 gigaflops of aggregate, raw computing power using 2.2 GHz chips. There are another dozen communications processors that link to the HyperTransport buses on the Opterons to provide a high-speed link to the switching fabric that connects all of the shelves in a massively parallel machine to each other. Each motherboard in the shelf has six FPGA processors that can be configured on the fly as either a compute co-processor or as a switch fabric processor to accelerate the running of jobs on the system, including vector math if necessary. The final processor is an AMD AV1000 embedded processor, which is used to run the Active Management System programs that control the machine. The base shelf has 96 GB of main memory and 8 GB/sec of system I/O bandwidth.
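The shelf inventory can be tallied in a few lines. This is only a sanity check on the numbers above, and it assumes the six FPGAs are counted once per shelf, since that is the only reading that adds up to 31 processors; the dictionary keys are our own labels.

```python
# Per-shelf processor inventory for an OctigaBay 12K shelf, as described above.
shelf_processors = {
    "Opteron compute processors": 12,
    "communications processors": 12,
    "FPGA processors": 6,  # assumption: six per shelf in total
    "management processor": 1,
}
SHELF_MEMORY_GB = 96

opterons = shelf_processors["Opteron compute processors"]
total_processors = sum(shelf_processors.values())
two_way_servers = opterons // 2
memory_per_opteron_gb = SHELF_MEMORY_GB / opterons

print(f"Processors per shelf: {total_processors}")
print(f"Two-way Opteron servers per shelf: {two_way_servers}")
print(f"Main memory per Opteron: {memory_per_opteron_gb:.0f} GB")
```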
Like Red Storm, the OctigaBay machine uses the Opteron's HyperTransport interconnect as the basis for creating a fast, wide interconnect. Specifically, the system has what OctigaBay calls the Rapid Array Interconnect switch fabric, which can provide 1 Tbit/sec of bandwidth and very low latencies. The state of the art for a parallel supercomputer using a switched architecture is a latency of somewhere around 5 to 8 microseconds. The OctigaBay 12K designers have promised a latency of around 1 microsecond in the first iteration of the machine. This is stunning, and would be very dangerous in the hands of any rival to Cray.
By the way, the Strider and OctigaBay machines have one more thing in common: They both run SuSE Linux Enterprise Server 8.0. In the case of the OctigaBay machine, it runs on a modified version of the Linux 2.4.19 kernel, which has had its CPU scheduler changed to provide a 100 nanosecond heartbeat to keep all of the Opterons in sync with each other. This scheduler, in essence, helps make the cluster behave more like a big SMP box than a bunch of servers clustered together. The combination of the new CPU scheduler for Linux and the low latencies makes the box easier to manage and lets it perform work more efficiently.
Rottsolk said last week that the OctigaBay systems would be in beta by June 2004, with early shipments in the second half of 2004 and general availability in early 2005. Cray is taking another look at pricing for the machines, and says that for the configurations it will deliver, prices will range from under $100,000 to around $2 million. Going forward, Rottsolk said that the OctigaBay development and pre-marketing efforts would consume around $2 million a quarter, with two-thirds going for development and one-third going for marketing.