An Overview of Clustering

Abstract

Over the last several decades, businesses have become critically dependent on their information technology resources. As a result, they demand that these resources always be available. Any outage has serious business implications. At the extreme, an extended system outage can force a business to close permanently. One hour of system downtime can cost from tens of thousands of dollars to several million dollars, depending on the nature of the business. Users therefore require that their system services be continuously available, 24 hours a day, 365 days a year. Technology that supports increased computer system availability is critical.

The key technology that enables continuous availability is clustering. A cluster is a collection of one or more complete systems that work together to provide a single, unified computing capability. From the end user's perspective, the cluster operates as though it were a single system. In a cluster, work is distributed across multiple systems. Any single outage (planned or unplanned) in the cluster will not disrupt the services provided to the end user. End user services can be relocated from system to system within the cluster in a transparent fashion.

What is Clustering?

The Problem

The world of commerce depends upon on-line information systems to run core business operations. Likewise, the world of science and engineering depends on high-performance computing. However fast or reliable computers are at any point in time, applications invariably demand more computing resources.

Mainframes, supercomputers, and fault-tolerant systems have historically provided the highest levels of service. Unfortunately, their specialized construction and proprietary components contribute to long design cycles and high costs -- attributes unsuited for today's client/server world. Today's challenge involves not just providing the utmost performance and reliability, but doing so flexibly and inexpensively. Clustering meets this challenge.

Clusters: The Solution

A cluster is a group of systems that works collectively as a single system to provide fast, uninterrupted computing service. You can find more technical definitions, but they all say essentially the same thing: close cooperation can maximize performance and minimize downtime. These goals are unremarkable. On the other hand, how clustering achieves these goals is remarkable.

The speed and reliability of traditional monolithic systems come from intensive engineering. Each and every part is meticulously designed from scratch. Long development cycles and high costs are the price paid for close tolerances and maximal optimization. As the saying goes, "there must be a better way."

Clustering is the result of a fundamental re-thinking of the best way of delivering high performance and high availability. Focusing on the desired results, rather than a specific implementation, freed designers to innovate. They realized that individual systems and their components don't have to match the characteristics of mainframes, supercomputers, or fault-tolerant systems -- as long as a team of systems can cooperate to achieve the same results.

Instead of custom components, clustering combines the best off-the-shelf components into cooperative teams. Should one team member fail, another stands ready to pick up its workload. Should a player be insufficient to quickly accomplish a task, many others can pitch in. Working together, teams of off-the-shelf components approach, and in some cases surpass, the capabilities of mainframes, supercomputers, and fault-tolerant systems. Moreover, they do so at much lower costs.

Imagine several hundred clerical workers, such as reservation agents of a hotel chain. A cluster supporting them might contain several multiprocessors, each running a call-and-reservation management package. Should one system fail, only the users connected to that node will feel any serious impact. Any service interruption will be brief -- a few seconds to a few minutes -- while cluster coordination software restarts their application on a surviving system, and the users once again have access to their application. Once the underlying problem is corrected, users can be smoothly shifted back to their original system.

In this scenario, clustering has helped in several ways. Long before any problem occurred, the cluster enabled horizontal scalability - the combination of several modest-cost servers to handle a large load. Once a component failed, cluster software multiplied its contribution by first limiting the effect of a system failure, and then by accomplishing a quick recovery. The cluster, as a team, was resistant to failure conditions. Although the failure of an individual system was not avoided, its impact was minimized.
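The recovery step in the reservation scenario can be sketched in a few lines of Python. This is only an illustrative model, not how any particular cluster product works: the `Node` class, the `fail_over` function, the node names, and the "least-loaded survivor" placement rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical cluster member and the users currently served by it."""
    name: str
    healthy: bool = True
    users: list = field(default_factory=list)


def fail_over(nodes, failed_name):
    """Mark one node as failed and move its users to the surviving nodes.

    Returns the list of users that had to be relocated. Only users of the
    failed node are touched; everyone else keeps working undisturbed.
    """
    failed = next(n for n in nodes if n.name == failed_name)
    failed.healthy = False
    survivors = [n for n in nodes if n.healthy]
    if not survivors:
        raise RuntimeError("no surviving node to take over the workload")
    displaced, failed.users = failed.users, []
    for user in displaced:
        # Assumed placement rule: send each user to the least-loaded survivor.
        target = min(survivors, key=lambda n: len(n.users))
        target.users.append(user)
    return displaced


def build_cluster():
    """Build a small three-node example cluster (names are made up)."""
    return [
        Node("node-a", users=["agent1", "agent2"]),
        Node("node-b", users=["agent3"]),
        Node("node-c", users=["agent4"]),
    ]


if __name__ == "__main__":
    cluster = build_cluster()
    moved = fail_over(cluster, "node-a")
    print("relocated:", moved)
    for node in cluster:
        print(node.name, node.healthy, node.users)
```

The sketch shows both effects described above: the failure touches only the users of the failed node, and those users are quickly given a new home on the surviving systems.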

Clustering Solutions

Clustering is not just an intriguing idea; it is a workaday reality. Since Compaq pioneered the concept in the early Eighties, over 100,000 clusters have been installed. These contain over 400,000 systems solving myriad problems for millions of users. Customers' long-term acceptance proves the effectiveness of well-implemented clustering technology.

Although other vendors have finally begun to realize the value of clustering, Compaq remains the undisputed leader. Independent market analysts such as the Gartner Group use OpenVMS Clusters' capabilities as the benchmark for the industry. That leadership continues on UNIX with TruCluster Available Server and TruCluster Production Server.

In 1993, Compaq announced a roadmap for bringing its clustering expertise to Compaq UNIX. Since then Compaq has delivered the DECsafe Available Server, TruCluster, and its high-performance cluster interconnect Memory Channel.

Putting Clusters in Perspective

Evaluating systems technologies can be difficult. Symmetric multiprocessing (SMP), fault tolerance, massively parallel processing (MPP), and clustering all compete for market position based on their respective capabilities.

While each technical approach has a role, SMP and clustering stand out as mature, broadly effective technologies to which users should pay particular attention. Working together, they address virtually the entire range of customer requirements. Clustering can, and often does, include systems from each technology.

Availability

Availability is the proportion of time that a system can be used for productive work. Typically expressed as a percentage, 100% is the best possible rating. Availability is a better term than reliability because it stresses the real goal -- keeping resources, services, and applications running and available to users. Reliability, in contrast, focuses more on the attributes of individual components.

Typical stand-alone systems can achieve about 99% availability. This may sound great, but it loses some of its luster once you realize that the missing 1% represents roughly 90 hours of downtime per year -- over three and a half days. 99% availability is sufficient only for forgiving organizations and casual applications. Systems upon which a business depends must do much better.

On the other end of the spectrum, critical applications such as emergency call centers, telecommunications hubs, air traffic control, and medical equipment must be up and running 24 hours a day, every day of the year. Any downtime at all may risk lives, money and reputations. These situations are the province of "fault-tolerant" or "continuous processing" systems that use extensive redundancy and specialized construction in a heroic attempt to prevent service interruptions. These systems can achieve 99.999% availability or better -- that is, about five minutes downtime in an average year.

So why aren't fault-tolerant systems used everywhere? Doesn't everyone want zero downtime? Sure - but there's a catch. While everyone wants to completely eliminate downtime, few applications can justify the expense. Fault-tolerant systems' specialized construction and extensive redundancy make them cost several times as much as conventional systems. Even more important, once a system has been made 99.95% or 99.99% available, nearly all of the remaining likely failures will be software or environmental failures, not hardware breakdowns. Spending money to make the hardware even more reliable is not very cost-effective. It should only be considered in the most availability-sensitive situations.

The systems of greatest interest to most users experience between three days and three minutes of downtime per year. If attentively managed with a supportive set of systems management tools, conventional stand-alone systems can achieve between 99.5% and 99.8% availability -- or 18 to 44 hours of downtime per year.
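The downtime figures quoted in this section follow from simple arithmetic on the availability percentage. The short Python sketch below reproduces them; the function name and the choice of an 8,760-hour year (leap years ignored) are assumptions of this example rather than anything specified in the text.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored


def downtime_hours_per_year(availability_percent):
    """Return the annual downtime, in hours, implied by an availability figure."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.0, 99.5, 99.8, 99.999):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% available: {hours:.2f} hours "
              f"({hours * 60:.1f} minutes) of downtime per year")
```

Under these assumptions, 99% availability works out to about 88 hours of downtime per year, 99.5% to about 44 hours, 99.8% to about 18 hours, and 99.999% to a little over five minutes.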

Going beyond this level, into the realm of "high availability" or "fault resilience," requires clustering. TruCluster can eliminate all but a few hours of downtime per year. What downtime it cannot eliminate, it ameliorates. Unplanned downtime is converted from a serious problem into a brief service hiccup. Most of whatever downtime remains is made into planned downtime.

Highly-available environments suit customers for whom money is an object, and who can tolerate a brief delay while service is being restored. So while an air-traffic control system may require fault-tolerance, a reservations system based on a highly available or fault-resilient cluster is more than adequate to keep agents selling tickets to satisfied customers.

Clusters Deliver Scalable Performance

In addition to high availability, clustering helps achieve high performance. The term scalability (really, an abbreviation of performance scalability) is often used to stress the goal of high overall performance. It also hints at the incremental performance growth clusters enable.

Working in parallel is one of the most direct paths to higher performance. Uniprocessors, multiprocessors, clusters, parallel processors, and distributed computing all use parallelism at a variety of levels. Emphatically, they are not all the same.

Each flavor of parallelism has advantages and limitations. Digital has no bias regarding these approaches - just a simple, practical goal of getting maximum value from each technique.

The real issue is how well application programs can make use of the parallelism each approach offers. Some programs can easily be partitioned into pieces, each of which can run on a separate processor. Such multi-threaded programs can achieve significant performance gains. The more pieces, the better the utilization of parallel components, and the bigger the performance gains. The perfect case is "linear scalability"; for N processors, the application runs N times faster than on one processor.

True linearity is rare, particularly as the number of processors grows. Many important programs cannot be extensively multi-threaded. They are somewhat partitionable, but only into a limited number of pieces, and decomposition requires significant effort - maybe even a rearchitecting.

This inherent difficulty is compounded by the need for the pieces to coordinate among themselves. Even if a program can be decomposed, communications can become a limiting factor. The overhead that coordination imposes can quickly overwhelm whatever inter-processor communications facility system designers offer.
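The gap between linear scalability and what real programs achieve can be illustrated with a small model. The sketch below uses Amdahl's law, which the text does not name but which is the standard way to express the effect of a non-parallelizable portion, and adds a deliberately crude coordination term that grows with the number of processors. The function name, the parameter names, and the linear overhead term are assumptions made for illustration only.

```python
def speedup(processors, parallel_fraction, overhead_per_extra_processor=0.0):
    """Estimate speedup over one processor.

    parallel_fraction: share of the single-processor run time that can be
        split evenly across processors (Amdahl's law).
    overhead_per_extra_processor: assumed coordination cost added by each
        additional processor, as a fraction of the single-processor run time.
    """
    if processors < 1:
        raise ValueError("need at least one processor")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_time = 1.0 - parallel_fraction
    parallel_time = parallel_fraction / processors
    overhead_time = overhead_per_extra_processor * (processors - 1)
    return 1.0 / (serial_time + parallel_time + overhead_time)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n,
              round(speedup(n, 1.0), 2),         # perfectly partitionable
              round(speedup(n, 0.9), 2),         # 10% of the work is serial
              round(speedup(n, 0.9, 0.01), 2))   # plus coordination overhead
```

With a fully partitionable program the model gives the linear case, N times faster on N processors. With 10% serial work the gains flatten, and once a coordination cost is included, adding processors can eventually make the program slower rather than faster.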

SMP has become popular because it provides a particularly effective way of scaling performance. It offers a few parallel processors, connected by a high-speed system bus, and coordinated by an attentive operating system. When well implemented, as it is in DEC UNIX, this modest level of parallelism can be handled fairly easily by applications. Often, developers do not bother to parallelize individual applications; users just run many jobs on an SMP system, and the OS takes responsibility for distributing the total workload, one job to a processor. Programs with a high payoff for optimization, such as database managers, are explicitly parallelized by sophisticated developers.

SMP is effective with a few processors, but as more processors are added, demand for the intimately shared resources grows. These demands are satisfiable at first, but they become less so. How fast the demand grows -- and thus the number of CPUs that can be effectively used -- depends on the level of inter-thread communications required by the application workload. Most workloads benefit from between two and six processors.

At some point -- generally around 12 CPUs -- the law of diminishing returns takes full effect. Contention for shared resources becomes a bottleneck. As more processors are added, total performance rises only slowly, if at all. Extending the capabilities of the system bus is not an answer because it can no longer be done cost-effectively.

Another limitation is that while multiprocessors can be quite reliable, they are not highly available. Should a processor fail, the system must be rebooted. Should some other component fail -- say, a SCSI disk controller or network adaptor -- the system cannot reboot its problems away. Multiprocessors must also be taken off-line for maintenance and upgrades.

MPP, in contrast to SMP, has not become broadly popular. It uses vast multitudes of processors (often with relatively weak individual capabilities), linked by intricate proprietary interconnects, and coordinated explicitly by application structuring. This architecture suits problems that can be decomposed into hundreds - or better yet, thousands - of pieces. Though some important tasks qualify -- forecasting the weather and large-scale text retrieval, for instance -- these are not the workaday tasks of most organizations. Restructuring programs for large-scale parallelism is just too hard and the performance gains too haphazard for general use. MPP also does little more than SMP to ensure system availability.

In terms of performance, clustering fits between SMP and MPP. Depending on the configuration, a cluster can appear more like an SMP or more like an MPP. The number of processors, the inter-processor interconnect, and the software used to coordinate operation are the primary differentiators. "Loosely" coupled clusters offer a large number of processors - potentially hundreds - linked by networking technology. "Firmly" coupled clusters use a smaller number of processors -- a few to some small multiple of ten -- linked by network, storage channel, or specialized cluster interconnect. ("Tight" coupling is often used to describe shared-memory SMP systems.)

Because clusters have looser processor-to-processor communications than SMP systems, more care must be taken to structure their workloads for scalable performance. On the other hand, the logical "distance" between processors has advantages. Because processors are isolated from one another, there is less contention for shared resources. Should one system fail, other nodes are generally unaffected. An alternate elsewhere in the cluster can generally stand in. When well implemented, as it is in TruCluster, clustering yields substantially greater availability than either SMP or MPP technologies.
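A rough sense of why isolated, redundant nodes raise availability comes from a textbook probability model. The sketch below is an idealization, not a description of TruCluster: it assumes that nodes fail independently, that any single surviving node can carry the service, and that failover is instantaneous. The function name is hypothetical, and real clusters will fall short of these figures because of shared components, software faults, and recovery time.

```python
def cluster_availability(node_availability, nodes):
    """Probability that at least one of `nodes` independent nodes is up.

    node_availability is a fraction between 0 and 1 (for example 0.99).
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("need at least one node")
    all_down = (1.0 - node_availability) ** nodes
    return 1.0 - all_down


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, f"{cluster_availability(0.99, n):.6f}")
```

Under these optimistic assumptions, two nodes that are each 99% available would together be unavailable only when both are down at once, about 0.01% of the time.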

While it is instructive to compare SMP and clustering, the approaches are not mutually exclusive. Clusters often incorporate SMP systems. Clusters can also incorporate MPP or fault-tolerant systems for specialized requirements.