Disaster Tolerance (DT) is a concept that extends beyond disaster recovery (DR). Traditional DR focuses on minimizing downtime, then picking up the pieces and reconstructing any lost data afterward.
DT, on the other hand, aims to keep operating despite a disaster so bad that it destroys an entire datacenter. This is made possible by placing servers and storage at each of two (or more) sites that are separated geographically by a safe distance. Essentially, you have to keep the contents of that storage identical at every site at all times.
Expensive? Yes. But if an hour of downtime costs you millions of dollars, or could result in loss of life, the price is worth paying.
"You will find OpenVMS in any environment that is serious about high availability, disaster tolerance, security, performance and scalability, especially when running real time or near real time applications," said Colin Butcher, an analyst at UK-based research firm XDelta Limited.
An early example of the effectiveness of OpenVMS in DT came in the mid-nineties in Paris, when Credit Lyonnais survived a fire at its headquarters. Its multi-site OpenVMS Cluster had safely mirrored its data at a second site, while the UNIX folks reportedly had to run into the burning building to pull the most recent backup tape cartridges containing their data from the tape drives.
The big test of disaster-tolerant OpenVMS clusters, however, came on 9/11. At least seven big financial services companies (including Cantor Fitzgerald and Commerzbank) avoided an IT collapse by using the OS for DT.
Take the case of Commerzbank's datacenter, located 100 yards from the World Trade Center. When the attack came, power was lost in the immediate area. The datacenter had its own UPS and generator, so operations were able to continue. Eventually, however, dust and debris clogged the air conditioning, and the temperature in the computer room rose to 104 degrees Fahrenheit.
Result: all the disks and most of the servers failed. Fortunately, Commerzbank had a disaster-tolerant OpenVMS cluster with another copy of its data, kept synchronously up to date, 30 miles away in Rye, NY. The cluster continued to operate. One OpenVMS AlphaServer system even kept running in the 104-degree heat, processing Treasury transactions using the disks at the Rye site.
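To make "kept synchronously up to date" concrete, here is a minimal Python sketch of a volume mirrored across two sites. It is a toy model under stated assumptions, not OpenVMS code: the class `MirroredVolume`, its methods, and the site labels are hypothetical, and the "disks" are in-memory dictionaries. The point is only that a write completes when every surviving member holds it, so losing one site loses no acknowledged data.

```python
class SiteFailure(Exception):
    """Raised when no mirror member is left to serve I/O."""


class MirroredVolume:
    """Toy model of a volume mirrored synchronously across sites (hypothetical)."""

    def __init__(self, site_names):
        # One in-memory "disk" per site; real systems use shared storage.
        self.members = {name: {} for name in site_names}
        self.online = set(site_names)

    def write(self, block, data):
        # Synchronous mirroring: the write returns only after every
        # surviving member has stored the block.
        if not self.online:
            raise SiteFailure("no surviving mirror members")
        for name in self.online:
            self.members[name][block] = data
        return sorted(self.online)

    def read(self, block):
        # Any surviving member can serve the read, since all hold the same data.
        for name in sorted(self.online):
            return self.members[name][block]
        raise SiteFailure("no surviving mirror members")

    def fail_site(self, name):
        # Simulate the loss of every disk at one site.
        self.online.discard(name)


# Site labels are illustrative only.
volume = MirroredVolume(["manhattan", "rye"])
volume.write(0, "trade-1")        # stored at both sites
volume.fail_site("manhattan")     # disks at the first site are lost
volume.write(1, "trade-2")        # work continues on the surviving site
print(volume.read(0), volume.read(1))  # prints "trade-1 trade-2"
```

The cost of this design is that every write waits for the slowest site, which is one reason the distance between sites matters.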
But two datacenters may not be enough. What if both get taken out? If the stakes are high enough, three datacenters may be called for. One European lottery system, for example, transacts so much cash that it has implemented an OpenVMS disaster-tolerant cluster configuration designed to survive the simultaneous loss of two of its three datacenters and still continue operating uninterrupted with zero data loss. This is done by keeping an identical copy of the data at each of three separate datacenters, and by using a server at a fourth site to provide a tie-breaking vote for the quorum scheme, which allows automatic, unattended failover.
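The role of a tie-breaking vote can be illustrated with a short Python sketch of a generic majority-quorum rule. This is an assumption-laden illustration, not the lottery's configuration: the article does not say how votes are assigned, so the site names and both vote tables below are hypothetical, and the rule used is a plain strict majority of all configured votes.

```python
def quorum(total_votes):
    """Smallest number of votes that is a strict majority of the total."""
    return total_votes // 2 + 1


def has_quorum(votes, surviving):
    """Return True if the surviving sites hold a strict majority of all votes."""
    total = sum(votes.values())
    alive = sum(votes[site] for site in surviving)
    return alive >= quorum(total)


# Hypothetical assignment 1: every site, including the tie-breaker, has one vote.
equal_votes = {"dc1": 1, "dc2": 1, "dc3": 1, "tiebreaker": 1}

# Hypothetical assignment 2: the tie-breaker site carries two votes.
weighted_votes = {"dc1": 1, "dc2": 1, "dc3": 1, "tiebreaker": 2}

# One datacenter lost: the rest still hold a majority under either assignment.
print(has_quorum(equal_votes, {"dc1", "dc2", "tiebreaker"}))

# Two datacenters lost at once: the outcome depends on the vote weights.
print(has_quorum(equal_votes, {"dc1", "tiebreaker"}))
print(has_quorum(weighted_votes, {"dc1", "tiebreaker"}))
```

Under these assumptions, how votes are weighted decides whether the survivors can carry on automatically. A group that cannot reach quorum must stop rather than risk two halves of the cluster updating the data independently.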
According to Ken Farmer of OpenVMS.org, the OS now has 10 million users and hundreds of thousands of installations worldwide. Granted, many of those users may be holdouts from the old days, refusing to give up the tried and trusted system in favor of the newest and most hyped "high availability" system. But according to a source at HP, the company is doing a brisk $2 billion a year in VMS-related hardware and software sales. Although exact numbers are unavailable, Alpha hardware alone accounts for several hundred million dollars a year, and that number is increasing.
Why? Like a veteran Michael Jordan in his last few years with the Chicago Bulls, OpenVMS can still outperform the young bucks. The stats are impressive: 3,000 simultaneous active users; almost 2 million database transactions per minute (with Oracle); up to 96 cluster nodes (over 3,000 processors); and full cluster capability at distances of up to 800 kilometers.
So it's no wonder that HP is investing heavily in OpenVMS (HP inherited VMS from DEC via Compaq). HP is now porting the OS to Itanium (due late this year or early next year) and seems to be positioning it within the storage landscape as a high-end DT solution.
And with numbers like those, it won't be long before some of the marketing and PR types among OpenVMS' competitors are suffering from DT, alcohol-induced or otherwise.