Sun Sells 2 Teraflops Cluster to Department of Energy
by Timothy Prickett Morgan
The Department of Energy announced last week that it has chosen Sun Microsystems to build a 2 teraflops clustered supercomputer for the Idaho National Engineering and Environmental Laboratory (INEEL), in Idaho Falls. Although it does not have the status of Lawrence Livermore National Laboratory, Sandia National Laboratories, Lawrence Berkeley National Laboratory, or Argonne National Laboratory, the INEEL now has ten times the computing performance it used to have, and can compete with these bigger labs to do interesting research.
The Sun cluster will comprise 230 of Sun's two-way Sun Fire V20z Opteron-based servers, according to Eric Greenwade, chief architect of the new system. The INEEL is already acquainted with Sun gear and currently uses a mix of big Sun SMP boxes, some SGI servers, three Cray SV1s, and four smaller Linux clusters to run its supercomputing workloads. Those workloads consist mostly of Fortran code, with a smattering of C and C++ code, plus some Java used to glue applications together. The supercomputer will be used for designing next-generation nuclear reactors as well as for proteomics and other biosciences research.
The INEEL was one of the early adopters of the Grid Engine grid software, and in fact was using the product years before Sun bought the company that created it. Grid Engine is what allows researchers to dispatch their jobs to the mixed cluster that the INEEL currently runs, and it is Grid Engine that will bring the new Opteron cluster into the grid when it is fired up next month.
Initially, the INEEL will run Linux on the cluster, but Greenwade says that once 64-bit capability and virtual partition containers arrive with the future Solaris 10 release, due in a few months, the cluster will probably be switched over to a mix of Linux and Solaris. The INEEL expects to use Grid Engine software to allow the nodes on the Opteron cluster to be configured on the fly to run either Linux or Solaris, depending on what specific applications call for.
For now, the INEEL is going to use the dual Gigabit Ethernet ports in the V20z servers to create two different network backbones: one for private use inside the facility, carrying MPI message-passing traffic, and another for shared public use. Greenwade would like to be able to use a faster interconnect, such as InfiniBand, to reduce the latencies and increase the I/O bandwidth between the nodes in the cluster. Greenwade said that the INEEL's goal was to boost its capacity significantly in three to five years, but for now the improvement that the new Sun cluster provides (jobs that used to take a year to run now take only days) is enough.
In addition to the servers, the INEEL is getting 12 TB of StorEdge 6320 disk arrays, plus the complete Sun software stack (including the Performance Suite QFS file system) and 600 hours of technical support thrown into the mix. The INEEL is leasing the cluster for 36 months for a combined cost of $1.97 million, or about a buck a megaflop. This is very inexpensive, and explains why Lintel clusters--and maybe now Solopteron clusters--are all the rage.
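The "buck a megaflop" figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the numbers quoted above. It assumes the 2 teraflops rating is the aggregate for all 230 two-way servers, and it charges the whole $1.97 million lease (storage, software, and support included) against compute. The function and variable names are illustrative.

```python
# Figures quoted in the article.
LEASE_COST_USD = 1.97e6      # combined 36-month lease cost
CLUSTER_TERAFLOPS = 2.0      # rated performance (assumed to be the aggregate)
SERVERS = 230                # Sun Fire V20z nodes
CPUS_PER_SERVER = 2          # two-way Opteron servers

MEGAFLOPS_PER_TERAFLOP = 1_000_000
GIGAFLOPS_PER_TERAFLOP = 1_000


def cost_per_megaflop(cost_usd: float, teraflops: float) -> float:
    """Dollars paid per megaflop of rated performance."""
    return cost_usd / (teraflops * MEGAFLOPS_PER_TERAFLOP)


def gigaflops_per_cpu(teraflops: float, servers: int, cpus_per_server: int) -> float:
    """Rated gigaflops implied for each processor in the cluster."""
    return teraflops * GIGAFLOPS_PER_TERAFLOP / (servers * cpus_per_server)


usd_per_mflop = cost_per_megaflop(LEASE_COST_USD, CLUSTER_TERAFLOPS)
gflops_cpu = gigaflops_per_cpu(CLUSTER_TERAFLOPS, SERVERS, CPUS_PER_SERVER)

print(f"Cost per megaflop: ${usd_per_mflop:.3f}")
print(f"Implied rating per CPU: {gflops_cpu:.2f} gigaflops")
```

Under those assumptions the lease works out to just under a dollar per megaflop, and the rating implies a little over 4 gigaflops for each of the 460 processors.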