NEC Launches New Fault-Tolerant Server
Published: April 19, 2006
by Alex Woodie
With another terrifying tornado season in full swing, and an active hurricane season forecast to start in 42 days, it's not a bad time to start thinking about implementing high availability and fault-tolerant redundancy for your critical Windows applications. One of the vendors offering fault-tolerant Windows servers, NEC Solutions America, this month introduced a two-way server dubbed the Express5800/320Ma that's aimed at the small to midsize business (SMB) market. It also unveiled its new Active Upgrade technology.
NEC Solutions America is one of a handful of vendors selling fault-tolerant X86 solutions for use in Windows and Linux environments. Others include Marathon Technologies, which supplies virtualization software for a customer's choice of X86 Windows servers, and Stratus, which, like NEC, sells fully integrated solutions, and which supports Linux, HP-UX, and its own VOS (virtual operating system), in addition to Windows.
Dick Csaplar, the product manager for NEC's FT server division, says NEC's fault-tolerant gear provides the highest level of availability for some of this country's most mission-critical applications, such as the baggage handling systems at JFK and Miami International airports, e-mail for the Office of Homeland Security, building security for Cape Canaveral and the General Services Administration in New York City, and other applications for the U.S. Centers for Disease Control and Prevention and Amtrak.
Fault-tolerant gear provides a number of benefits over clustering, which remains the most popular high availability solution for Windows servers, Csaplar says. First, applications need to be modified to run in a cluster--they need to be what's called "cluster aware"--whereas Windows apps can run unchanged on fault-tolerant servers running in "lockstep." Failovers are also more cumbersome with clustering, Csaplar says, adding, "It's a difficult process to set up a cluster and get it to perform correctly."
Clustering can also be more expensive, although, in all fairness, a properly configured cluster will make more efficient use of the available processing power than fault-tolerant gear and its resiliency-through-redundancy approach. But for the most critical applications, the ones with the least tolerance for downtime and which require something near the holy grail of high availability (99.999 percent uptime, or about five minutes of downtime a year), fault-tolerant computing remains the best solution for Windows shops.
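The "five nines" arithmetic is easy to check. The short Python sketch below is an illustration only, not anything supplied by NEC; the function name and the 365.25-day year are assumptions made for the example.

```python
# Illustrative arithmetic only: converts an availability percentage
# into the yearly downtime it allows. The 365.25-day year is an assumption.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year allowed at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime a year")
```

At 99.999 percent the budget works out to a little over five minutes a year, which matches the figure usually quoted for five-nines availability.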
NEC's fault-tolerant servers can practically eliminate downtime for industry standard X86 servers running Windows or Linux, Csaplar says. "The two servers run in lockstep, and each node has all the modules of the server," he says. "The two sets of hardware are doing the exact same instruction set at the same time. Fault-detection chips on the motherboards compare the output against each other for every clock cycle, and ensure that both sets of CPUs came to the same conclusion."
If hardware goes flaky, such as when a memory stick fails or a disk crashes, then processing is suspended on the failed node--or Customer Replaceable Unit (CRU) in NEC's book--and processing continues on the good CRU. "That failover is so fast, it's completely transparent to the end user," Csaplar says. "Then the server goes offline, reboots, and enters an error diagnostic routine. If it's fixed, it's brought back into lockstep with the other node." And if the hardware failure is terminal on that CRU, it can be fixed by plugging in a new component. Disks and other components in NEC's line are "hot swappable," which means they don't require special technical expertise to replace. The software that lets this little feat of fault-tolerant magic occur is called NEC ExpressCluster Self Recovery Edition (SRE), which monitors hardware and application conditions, and can also run checks on Windows itself.
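For readers who think in code, the compare-and-fail-over idea described above can be sketched as a toy Python simulation. This is not NEC's implementation, which lives in hardware and in ExpressCluster SRE. The class, the function names, and especially the known-answer diagnostic used to decide which side failed are hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Module:
    """Toy stand-in for one redundant hardware module (a CRU in NEC's terms)."""
    name: str
    in_service: bool = True
    faulty: bool = False  # simulated hardware fault

    def execute(self, op: Callable[[int], int], value: int) -> int:
        result = op(value)
        # A faulty module flips the low bit to mimic a corrupted result.
        return result ^ 1 if self.faulty else result


def known_answer_test(module: Module) -> bool:
    """Hypothetical diagnostic: run an operation whose result is known."""
    return module.execute(lambda x: x + 1, 1) == 2


def lockstep_step(a: Module, b: Module, op: Callable[[int], int], value: int) -> int:
    """Run one operation on both modules, compare, and fail over on a mismatch."""
    if a.in_service and b.in_service:
        result_a, result_b = a.execute(op, value), b.execute(op, value)
        if result_a == result_b:
            return result_a
        # Outputs disagree: take any module that fails the diagnostic out of service.
        for module in (a, b):
            if not known_answer_test(module):
                module.in_service = False
    survivors = [m for m in (a, b) if m.in_service]
    if not survivors:
        raise RuntimeError("no healthy module left")
    return survivors[0].execute(op, value)


cru0, cru1 = Module("CRU-0"), Module("CRU-1")
print(lockstep_step(cru0, cru1, lambda x: x * 2, 21))  # both modules agree
cru1.faulty = True  # inject a simulated fault
print(lockstep_step(cru0, cru1, lambda x: x * 2, 21))  # served by the healthy module
print(cru1.in_service)  # the faulty module has been taken out of service
```

The caller gets the same answer before and after the injected fault, which is the sense in which the failover is transparent. A comparison alone only shows that the two sides disagree, so the sketch needs the made-up diagnostic to pick the failed module; how NEC's hardware makes that determination is not described in the article.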
The new two-way Express5800/320Ma is the first entry-level fault-tolerant system for small and midsize businesses, and is the little brother to the previously available four-way Express5800/320Lc server. In terms of hardware, the 320Ma comes with two Intel Xeon processors, and customers get a choice of the single-core 3.2 GHz or 3.6 GHz processors, or the new dual-core 2.8 GHz processor. The 3.2 GHz version supports up to 8 GB of memory, while the 3.6 GHz system and 2.8 GHz dual-core version can handle 16 GB of memory.
NEC is now using 400 GB Serial ATA (SATA) disk drives in its fault-tolerant line, and the 320Ma--the first to use these drives--can support up to three of them, with RAID striping for protection, giving it a maximum capacity of 1.2 TB. Users can also connect their 320Ma to external disk arrays from NEC or other vendors. This new server also introduces Gigabit Ethernet to NEC's fault-tolerant line.
The 320Ma, which comes with a preloaded copy of Windows Server 2003 Enterprise Edition, is the first and only NEC fault-tolerant server to get the new Active Upgrade software. With Active Upgrade, a user can take one module of the system offline for maintenance or upgrades, while running the application on the other CRU. If the administrator is happy with the fix, he pushes a button and the change is automatically applied to the other CRU, and both sides are resynchronized within 30 to 100 seconds, Csaplar says. If something funky occurs during the upgrade and the admin doesn't want to make the change permanent, he can roll it back just as easily.
The 320Ma, like other members of NEC's fault-tolerant family, can use the NEC ExpressCluster SRE software for local resiliency, as well as ExpressCluster WAN, which is a completely different product than the SRE software, although it shares a similar name. ExpressCluster WAN is used to connect geographically separated NEC fault-tolerant servers for the purposes of recovering from the total loss of a data center. The software can be used to connect up to 16 Windows nodes, or 32 Linux nodes, in a shared storage cluster, and costs $18,000 above and beyond the cost of the hardware. NEC will set up ExpressCluster WAN for about $12,000, Csaplar says, "so for $30,000 you can have a disaster recovery solution."
An entry-level Express5800/320Ma outfitted with 3.2 GHz Xeon processors and minimum memory costs about $30,000, while a fully outfitted version equipped with dual-core 2.8 GHz processors would run about $50,000. The server starts shipping this month, and is available through NEC's channel, including Avnet and Team 1 Systems.