High Availability Calms Distributor’s Fears
November 2, 2015 Dan Burger
Behind this distribution company that spans Europe and the United States is an IBM Power8 system running IBM i. The production box is a single-socket, six-core, 3.02 GHz processor configuration capable of 59,500 CPW, but able to get the job done while operating on two cores. It sits in the home offices of LF SpA headquartered in Cesena, Italy, about a four-hour drive north of Rome. The company takes its business continuity very seriously, which is why it recently implemented high availability software from Maxava.
Maxava has been in the high availability and disaster recovery business for 15-plus years. It has installations around the globe (more than 2,000, according to the company) and provides round-the-clock support in more than 40 countries. The company offers three versions of its high availability software: Data Stream, SMB, and Enterprise.
LF SpA chose the SMB variant, which replicates data, objects (more than 50 types), spool files, and the IBM i integrated file system. Although the software is available with a graphical user interface, LF embraced the traditional green-screen interface.
For the backup box, LF uses a Power 720, a six-core machine with a 34,900 CPW capacity and a single core activated. LF keeps it in a second on-premises data center. Both the production and backup machines are running IBM i 7.2.
Online B2B purchasing is a vital part of LF’s business, but offline purchasing remains an important sales channel as well. The company uses a highly customized ERP system from a third-party vendor that is integrated with an automated warehouse management system.
To tackle the high availability project, LF teamed with WSS Italia, one of Maxava’s certified partners in Europe and a company with skills and experience in planning, implementing, and monitoring HA projects.
In the early going, discussions between WSS Italia and LF homed in on determining which applications and libraries would be managed and monitored in the HA environment, along with system performance analyses, automatic object recovery, journal maintenance, network requirements, HA software installation, and backup strategies.
LF considered several HA options in addition to Maxava’s SMB HA but, according to WSS Italia, ultimately chose Maxava because of its technical features, ease of use, and minimal impact on system performance. These are the usual requirements regardless of the software’s sophistication or cost, and HA products certainly cover a lot of territory on both counts.
Not Always Easy
For dramatic effect, compare modern HA for IBM i with the HA of 10 years ago, when dedicated staffs watched over extremely complex implementations at large enterprise customers, the only companies that could afford HA. Ease of use didn’t exist in those days. There were no graphical user interfaces and no GUI-based consoles that, on a single screen, monitored key IBM i server performance metrics and displayed the status of HA replication and failover readiness.
“As soon as you utilize a GUI-based set up rather than green screen, it is far easier to have a single pane of glass view for multiple systems and multiple environments,” says Peter Kania, Maxava’s technical services and development director. “We’ve spent a lot of time developing our interfaces–GUI, green screen and the maxView line. The feedback we get is that our software allows users to spend less time looking for information, while managing and monitoring the system.
“But there are still many people who like the green screen (LF is one example); however, it is not as convenient when viewing multiple systems and it does not provide the same level of functionality as a GUI or browser-based interface. From there, we can pull in multiple feeds from multiple systems. We can easily display information from more systems.”
In 2012 Maxava introduced maxView, a set of managing and monitoring capabilities available in three levels. It includes a GUI-based console that monitors key IBM i server performance metrics, displays the status of HA replication and failover readiness, and lets users actively manage Maxava HA environments.
maxView also has the capability to alert customers via browser on a smartphone, tablet or PC, color-code alert categories, set specific alert thresholds and exclusions, and set the auto refresh timing interval to user-defined settings. It also includes drill down capabilities for reviewing detailed information around replication, Maxava configurations, and other system statistics.
Ten years ago, all these features and functionality were either unavailable or incredibly complex.
Another ease-of-use feature available now is the capability to remotely monitor, and even remotely role swap, from any location that has internet connectivity. No longer being tied to a workstation is a big thing.
The lighter footprint, with a smaller impact on system performance, is another differentiator. Maxava has done an admirable job of adding more functionality around replication and more granular auditing while limiting the performance drag on the system. To be sure, bigger and more powerful servers have offset some of the performance-draining workloads.
HA has traditionally had an impact on system performance. That impact used to be associated with journaling and was related to the amount of I/O on a system, the number of disk arms, and the amount of data in storage. The performance enhancements in the new iron have eliminated much of it. Faster drives and journal cache are both performance enhancers.
Additional performance considerations relate to the amount of work it takes to sort out what needs to be replicated, how it will be replicated, and how it travels from the source system to the target system.
Luca Lagattolla, product manager of WSS Italia, says, “What’s actually changed is that LF now can leverage the DR system and the journaling of OS/400 without involving the production system while they are working. The performance of the production system is stable and this is possible thanks to Maxava.”
The Role Swap
Testing an HA system is critical to making sure it can deliver in case of a disaster. As Kania points out, each user is going to be different. The range of complexities is almost infinite. Those with simple HA requirements can get away with five or 10 minutes of monitoring HA each day. Some will require more. Kania says “on average” most Maxava customers only spend five to 10 minutes monitoring HA each day.
“You have to have good auditing capabilities–once a week or once a month or even once a day. Run full audits across your environment to verify the system is ready for a role swap,” Kania says. “Doing regular failover role swap testing is important because what you tested six months ago may not be the same today. You may have new programs or changes in how some part of the business works.”
What should be done and what is actually done often are not within a handshake of one another, though. Role swap tests take time and can affect production if not performed correctly. Users also fear something may go wrong and take down the production box. To make role swap testing easier, Maxava created a simulated role swap (SRS) for the IBM i server that allows a role swap test without affecting the production box.
During a simulated role swap not all interfaces or all the users will switch from the primary to the backup server. The majority of users will continue to access the production system. A subset of users do the simulated swap. It’s not the real thing, but it’s a good simulation that adds to the trust factor.
“Our role swap commands, failover commands, and SRS commands give users a choice of what works best for them when they want to test or when they have to do the real thing,” Kania says.
“There are different levels of trust and the only absolute trust comes from doing successful role swaps on a regular basis,” Kania says. “Regular basis,” of course, is a relative term. Companies tend to find their comfort zones.
“For LF, the testing was successful. When we tested the DR system, all data was replicated and all applications ran correctly. We had no issues related to program licenses and the DR system was accessible to all users,” says Lagattolla.
“We recommend to role swap multiple times each year,” Kania says, “but that’s what we call a ‘customer configurable option.’”
LF performs one role swap test each year and is confident that it is prepared for disaster. It also claims that managing Maxava HA requires less than five minutes each day.