IBM Launches Souped Up GPFS File System
Published: October 4, 2007
by Timothy Prickett Morgan
Big server clusters require big storage clusters to keep their processors fed with data, which is why IBM created the General Parallel File System (GPFS) at its Almaden research center and introduced it many years ago as a commercial product. This week, a new version of GPFS is coming to market, and Big Blue is trying to extend the reach of GPFS from supercomputer centers to plain vanilla data centers with big jobs.
GPFS is the parallel file system that IBM uses to support the ASCI Purple supercomputer at Lawrence Livermore National Laboratory. ASCI Purple is a massive machine, with 12,208 of IBM's Power5+ cores running at 1.9 GHz and delivering a peak 92.8 teraflops of performance. On that machine, which runs AIX, IBM has been able to demonstrate that GPFS can house over 2 petabytes of data, hold billions of files, and deliver an I/O rate of more than 130 GB/sec to a single file or to multiple files in the system. GPFS has also been extended to support the 367-teraflops, Linux-based Blue Gene/L massively parallel supercomputer at the lab. GPFS itself has been supported on AIX on Power and Linux on Power, as well as on X86 and X64 systems, for many years.
This week, with GPFS 3.2, IBM has added policy-driven automation to the file system that allows the performance of file delivery to be scaled for different workloads. So, for instance, a parallel file system running on various kinds of disk arrays, each with different levels of performance and economics, can deliver data at high speed for certain workloads and at slower speeds where the workload is not as time-critical or the servers do not require high bandwidth to disk drives. The idea is to eventually get those who use the most performance to pay the most for data access. And because GPFS can be used across tape and disk, it can act like a hierarchical file system, eliminating the need to install hierarchical storage systems to stage data across various disk arrays and onto tape. (IBM calls this feature Enterprise File Management in Version 3.2.)
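To make the idea of policy-driven placement concrete, here is a minimal sketch in Python of how a rule set might map files to storage tiers. It is purely illustrative and is not GPFS's own policy syntax: the `FileInfo` record, the tier names, and the 90-day threshold are all assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical description of a file, for illustration only."""
    name: str
    days_since_access: int
    time_critical: bool


def choose_tier(f: FileInfo) -> str:
    """Pick a storage tier for a file using simple, made-up rules."""
    # Time-critical workloads get the fastest (and most expensive) disk.
    if f.time_critical:
        return "fast_disk"
    # Files left untouched past an assumed 90-day cutoff are staged to tape.
    if f.days_since_access > 90:
        return "tape"
    # Everything else lands on cheaper, slower disk.
    return "slow_disk"


files = [
    FileInfo("simulation.dat", 0, True),
    FileInfo("last_quarter.csv", 30, False),
    FileInfo("archive_2005.tar", 400, False),
]
placement = {f.name: choose_tier(f) for f in files}
```

In a scheme like this, the chargeback idea follows naturally: whoever owns the files that land on the fast tier pays the most.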
GPFS also supports Clustered NFS, a management layer that allows administrators to more easily take care of clustered file servers. In commercial environments, GPFS is aimed at supporting server clusters with hundreds or thousands of server nodes that need to have parallel access to disk storage on the servers or on separate disk arrays to boost I/O throughput. Retail and financial services companies with giant data warehouses are being encouraged to consider using GPFS, which looks like a normal Unix-style NFS file system to servers, to house their data warehouses and data marts.
GPFS Version 3.2 runs on IBM's System p servers running AIX 5.3, the most current release. Red Hat Enterprise Linux and Novell SUSE Linux Enterprise Server are also supported.