IBM Shows Sustained 207 Teraflops Performance on Blue Gene/L with Qbox App
Published: July 25, 2006
by Timothy Prickett Morgan
Twice a year, the supercomputer market and the IT industry at large get a sense of the massive amount of performance in supercomputers thanks to the Top 500 rankings. These rankings of supers, which are based on the Linpack Fortran benchmark test, are interesting, and it is only a slight exaggeration to say that companies and careers are affected dramatically (if not made or broken) by how many systems of a particular brand, using certain components, are on the list.
However, Linpack is a fairly old and unsophisticated test, and one may argue that it is outdated: it doesn't really show the substantial differences among supercomputer architectures or what real-world sustained performance might be on the behemoths in the Top 500 list. The people who buy supercomputers are interested in tuning up their workloads to squeeze the most performance possible out of their clusters and other styles of machines (if they have them), and vendors are, of course, keen to help them do it. To that end, IBM has been working with Lawrence Livermore National Laboratory to get the Blue Gene/L massively parallel Linux-Power supercomputer humming at greater efficiency on a workload called Qbox.
Tuning is wickedly important on such machines. Late last year, the Blue Gene/L supercomputer, which has 131,072 PowerPC processor cores, was rated at a theoretical peak throughput of 360 teraflops, but delivered only around 70 teraflops of performance on the Qbox application, which simulates quantum-scale physical interactions of sub-atomic particles and which is a key piece of software used by the United States Department of Energy. Specifically, Qbox simulates the behavior of metals under extreme temperature and pressure, such as those inside a nuclear bomb. This particular simulation modeled the behavior of 1,000 molybdenum atoms; prior simulations could only do 50 atoms.
According to Jim Sexton, a research staff member at Lawrence Livermore, after IBM's and the lab's experts worked together, they got Qbox to run a lot more efficiently on Blue Gene/L. In fact, the efficiency was boosted from 19 percent of peak capacity up to 58 percent, with the application hitting a sustained performance of 207.3 teraflops. "Getting over 50 percent of peak performance as sustained performance is really an accomplishment," explains Sexton. IBM and LLNL rejiggered the Qbox code to run better on the dual-core PowerPC processors (which have two floating-point units) inside Blue Gene/L, and they made other optimizations to improve the performance of the memory subsystems used in the super.
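The efficiency figures quoted above follow directly from the teraflops numbers in the article. The short Python sketch below redoes that arithmetic; the function and variable names are illustrative only, and the inputs are the rounded figures reported here (360 teraflops peak, roughly 70 teraflops before tuning, 207.3 teraflops after), so the results are approximate.

```python
def efficiency(sustained_tflops: float, peak_tflops: float) -> float:
    """Return sustained performance as a fraction of theoretical peak."""
    return sustained_tflops / peak_tflops


# Figures as reported in the article (rounded, so results are approximate).
PEAK_TFLOPS = 360.0      # theoretical peak of Blue Gene/L
BEFORE_TFLOPS = 70.0     # Qbox before tuning ("around 70 teraflops")
AFTER_TFLOPS = 207.3     # Qbox after tuning

before = efficiency(BEFORE_TFLOPS, PEAK_TFLOPS)
after = efficiency(AFTER_TFLOPS, PEAK_TFLOPS)
speedup = AFTER_TFLOPS / BEFORE_TFLOPS

print(f"Efficiency before tuning: {before:.1%}")   # 19.4%
print(f"Efficiency after tuning:  {after:.1%}")    # 57.6%
print(f"Sustained speedup:        {speedup:.2f}x")  # 2.96x
```

Rounded to whole percentages, these match the 19 percent and 58 percent cited, and they show that the tuning work roughly tripled sustained Qbox throughput on the same hardware.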