Volume 2, Number 21 -- June 2, 2005

NonStop Fault Tolerant Servers Jump to Itanium


by Timothy Prickett Morgan


The NonStop line of fault tolerant servers announced this week at the HP Enterprise Forum in Copenhagen, Denmark, completes the transformation of five disparate server lines from HP and Compaq to the Itanium architecture. Hewlett-Packard has, for the most part, done what it said it was going to do, if you ignore the way it summarily flushed HP-UX v3 and its integration of the TruCluster clustering software and file system last year. Now, HP can get down to business and push Itanium.

The NonStop server line is a bit different from other members of the Integrity Itanium-based server family, however. You cannot just buy something called the NonStop operating system and the NonStop database and plunk it down on an Integrity server. The NonStop operating system and database have specific needs, and unlike Windows, Linux, HP-UX, and OpenVMS, the machines they run on have to be tweaked in different ways, with both hardware and software. HP's roadmaps from the Compaq merger show that it had hoped to have the NonStop software ported to Itanium and in the new systems by the end of 2004, so the delivery of the product has slipped a bit. The delay caused at least one major customer--the Nasdaq stock exchange--to buy 500 MIPS engines in March because it could not wait for Itanium-based NonStop nodes.

In a way, the NonStop machines are the original Integrity servers. Tandem was founded in 1974 by some ex-Hewlett-Packers (notably Jim Treybig), who had an idea of how they could build a fault tolerant system with no single point of failure that could take on mainframes in terms of throughput and reliability. Basically, instead of doubling up on server hardware (that's the 2n approach) and using high availability clustering software to keep systems in synch, Tandem created the n+1 approach, which says create a clustered system that spreads the database over many nodes that share nothing and throw in an extra one that can take over if one fails. The Tandem approach makes for complex interconnection, but it also does not require companies to double up on their processing capacity. The benefit of the Tandem approach since it introduced its own SQL database is that by its very nature data is spread around the system, which means parallel queries against that data can be processed very efficiently. You don't have to parallelize a query to make it run faster--you have to do it for all queries on such a machine.
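The difference between the 2n and n+1 approaches comes down to simple counting. The sketch below is only an illustration of that arithmetic as the paragraph above describes it; the function names are hypothetical, and it assumes a single spare is enough to cover one node failure.

```python
def nodes_required(active_nodes: int, scheme: str) -> int:
    """Total nodes needed to carry `active_nodes` worth of work
    and still survive the loss of one node.

    "2n"  : every working node has its own standby twin.
    "n+1" : all working nodes share a single spare.
    """
    if active_nodes < 1:
        raise ValueError("active_nodes must be at least 1")
    if scheme == "2n":
        return 2 * active_nodes
    if scheme == "n+1":
        return active_nodes + 1
    raise ValueError(f"unknown scheme: {scheme!r}")


def redundancy_overhead(active_nodes: int, scheme: str) -> float:
    """Extra capacity bought purely for redundancy, as a fraction of useful capacity."""
    extra = nodes_required(active_nodes, scheme) - active_nodes
    return extra / active_nodes


if __name__ == "__main__":
    for scheme in ("2n", "n+1"):
        total = nodes_required(16, scheme)
        pct = redundancy_overhead(16, scheme) * 100
        print(f"{scheme}: {total} nodes for 16 nodes of work ({pct:.2f}% overhead)")
```

For a 16-node workload, the 2n approach needs 32 nodes while the n+1 approach needs 17, which is why the latter does not force companies to double up on processing capacity.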

The first Tandem cluster was launched in 1976, and Citibank bought the first one. It had 16 processor nodes that were linked together by a special bus called the Dynabus. The secret sauce was not just this hardware, but the Guardian operating system that could pass messages between nodes running the clustered database to decide which nodes would do work. Randy Meyer, director of strategy and technology for the NonStop line at HP, explains that Tandem would have chosen a Unix variant as its operating system, but that Unix had no concept of passing messages and was rather geared for uniprocessor and then SMP machines that see and share a common (rather than a distributed and clustered) memory architecture. (This is also why parallel Unix and now Linux clusters in the supercomputing area have to be equipped with the Message Passing Interface (MPI) or similar protocols so they can share information and work in concert.)

The Tandem machines progressed through several generations in two decades, including faster processors (and a move to the MIPS chips in 1991), larger memories, increasing node count, and faster interconnections (including fiber optics in 1981). The NonStop SQL database--a variant of the Ingres database that is an offshoot of open source work done at the University of California at Berkeley--that was launched in 1986 made the machine more usable by businesses.

Just before Compaq acquired Tandem in 1997, Tandem was working on a new interconnection called ServerNet, which was a peer-to-peer internetworking scheme that replaced the ring structure of the Dynabus architecture. (Interestingly, ServerNet is the basis of the InfiniBand switching architecture, so if someone tells you InfiniBand is not viable for connecting servers and storage together into a fabric, Tandem proves that idea wrong.) The Tandem Integrity line of machines was launched in 1994. It had a Unix kernel, used MIPS processors, first commercialized the ServerNet interconnect, and added the idea of double redundancy for all system components and the transactions they support. The Tandem machines were truly innovative, they gave mainframe vendors like IBM a huge amount of grief in the late 1980s and early 1990s, and they eventually became the dominant platform for supporting stock exchanges, online financial networks, and key databases at telecommunications companies and retailers.


According to Martin Fink, general manager of HP's NonStop business, these fault tolerant machines continue to evolve in ways that do not just include moving from the relatively under-powered 800 MHz MIPS R16000 to the much more powerful 1.5 GHz/4 MB cache Itanium 2 chips. While this performance boost is interesting--compared to the current NonStop S series machines, customers are seeing a factor of two improvement in performance--the really interesting thing to customers is that the bang for the buck improves by a factor of 2.5 with the new NonStop NS 16000s. Meyer says that companies were expecting the 2X performance improvement, but were expecting something more like a 1.5X price/performance improvement.
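To see what those ratios imply about pricing, here is a small back-of-the-envelope sketch. It assumes "bang for the buck" means performance divided by price, which the article does not define precisely, and the function name is hypothetical.

```python
def implied_price_ratio(performance_gain: float, price_performance_gain: float) -> float:
    """Relative price of the new system versus the old one.

    Assumes price/performance = performance / price, so
    new_price / old_price = performance_gain / price_performance_gain.
    """
    if performance_gain <= 0 or price_performance_gain <= 0:
        raise ValueError("gains must be positive")
    return performance_gain / price_performance_gain


if __name__ == "__main__":
    delivered = implied_price_ratio(2.0, 2.5)  # what HP says it delivered
    expected = implied_price_ratio(2.0, 1.5)   # what customers were expecting
    print(f"delivered: new price is {delivered:.2f}x the old price")
    print(f"expected:  new price is {expected:.2f}x the old price")
```

Under that assumption, a 2X performance gain with a 2.5X price/performance gain works out to a comparable configuration costing about 0.8 times the old price, whereas the 1.5X improvement customers expected would have meant paying about 1.33 times as much.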

With the NS 16000s, HP has also introduced the concept of triple redundant components to take availability from five to seven nines (that's 99.99999% availability, or three seconds of downtime a year). This triple redundancy is not going to be backcast into the MIPS versions of the NonStop machines, which is another carrot aside from performance and scalability that HP wants to use to try to get NonStop customers to upgrade. Fink brags that the new Integrity machines are not only more scalable than the venerable IBM mainframe since the NonStops can support 4,080 Itanium processors in a single system image, but that for an equivalent workload, a cluster of the new NonStop engines with double redundancy and offering five nines of availability costs about $6 million over five years compared to the $14 million price tag of a cluster of zSeries 990 mainframes glued together with IBM's Parallel Sysplex interconnect.
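The "nines" figures translate into yearly downtime with a one-line calculation. The sketch below is just that arithmetic, assuming a 365-day year; the names are illustrative and not taken from any HP tool.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years


def downtime_seconds_per_year(nines: int) -> float:
    """Expected downtime per year for an availability of `nines` nines.

    Five nines is 99.999% availability, so the unavailable fraction is 10**-5.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailable_fraction = 10.0 ** -nines
    return SECONDS_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for n in (5, 7):
        print(f"{n} nines: {downtime_seconds_per_year(n):.2f} seconds of downtime per year")
```

Five nines works out to roughly 315 seconds (a little over five minutes) of downtime a year, and seven nines to roughly 3.15 seconds, which matches the three seconds cited above.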

The NS 16000 is a node in a NonStop cluster. It can have from two to 16 Itanium processors and from 4 GB to 32 GB of main memory. Each node has a minimum of 10 ServerNet I/O connections, expandable to 60, and supports Fibre Channel and Gigabit Ethernet links out to storage and other peripherals. The processor boards used in the new NonStops are based on the entry rx Series of Integrity servers that HP has been selling using its zx1 chipset. These boards have had their cache beefed up and have had special circuitry added that implements the voting logic behind the triple redundancy. In a triply redundant system, like the one used in the Space Shuttle, all of the redundant parts run the same instruction code against the same data in exactly the same sequence. To finish a transaction, the three redundant computers vote on each transaction they process. If all three machines agree, the transaction continues; if two out of three agree, the one that does not agree is taken offline and the transaction progresses until a spare is brought online. If none of the nodes agree, several people have lost their jobs.
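The two-out-of-three voting rule described above can be sketched in a few lines. This is only a software illustration of the idea; in the NS 16000 the voting is done by special circuitry on the processor boards, and the function name and return convention here are hypothetical.

```python
from collections import Counter
from typing import Hashable, List, Optional, Sequence, Tuple


def vote(results: Sequence[Hashable]) -> Tuple[Optional[Hashable], List[int]]:
    """Majority vote over the results of three redundant computers.

    Returns (agreed_value, dissenters), where `dissenters` lists the
    positions of the computers that disagreed with the majority and
    would be taken offline. If no two results agree, agreed_value is
    None and all three positions are reported.
    """
    if len(results) != 3:
        raise ValueError("expected exactly three results")
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        return None, [0, 1, 2]
    dissenters = [i for i, r in enumerate(results) if r != value]
    return value, dissenters


if __name__ == "__main__":
    print(vote([42, 42, 42]))  # unanimous: transaction continues
    print(vote([42, 7, 42]))   # two of three agree: computer 1 is the odd one out
    print(vote([1, 2, 3]))     # no agreement at all
```

The sketch returns the dissenting positions instead of acting on them, so the caller decides what taking a computer offline and bringing a spare online actually means.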

A base NonStop 16000 with two processors, disk drives and basic interconnections costs $400,000 with the NonStop operating system and NonStop SQL database installed. A typical configuration of the prior generation of NonStops was for between 8 and 16 processors in a node, and Meyer expects that a typical Itanium node will have between 4 and 8 processors; such a node will cost around $1 million.

Meyer says HP is being realistic about how customers will migrate to the new Itanium nodes from the MIPS nodes. "Some customers are at the factory door right now because they need more performance," he explains. "Others will build out their NonStops with Itanium nodes over time as they test and verify. Still others who have floor space and heating issues will want to cut their processor counts in half by moving to Itanium." He says 90 percent of the vendors who supply third-party applications for the NonStop platform have already tested and verified those applications, and the remaining 10 percent are in the process of being qualified. There may be a few modules in a handful of applications that ran closer to the iron than Tandem, Compaq, or HP wanted, and this code could be a problem--and always is for all server makers. Basically, ISVs just had to recompile their applications on the Itanium machines and they were done. Customers can, by the way, mix and match MIPS and Itanium nodes in a cluster, but they cannot mix MIPS and Itanium boards in a NonStop cabinet.



Editor: Timothy Prickett Morgan
Contributing Editors: Dan Burger, Joe Hertvik, Kevin Vandever,
Shannon O'Donnell, Victor Rozek, Hesh Wiener, Alex Woodie
Publisher and Advertising Director: Jenny Thomas
Advertising Sales Representative: Kim Reed


The Unix Guardian


Copyright © 1996-2008 Guild Companies, Inc. All Rights Reserved.
Guild Companies, Inc. (formerly Midrange Server), 50 Park Terrace East, Suite 8F, New York, NY 10034