Availability A to Z

by Mark Rowe January 2, 2018

Does it really matter if the software applications supporting your security systems are available only 99pc of the time? It probably doesn’t if you have installed a video surveillance system primarily to deter shoplifters, but what equates to about 100 minutes of unplanned downtime per week will be significant if you have invested in an integrated, mission-critical security solution, writes Duncan Cooke, Business Development Manager, UK and Europe for Stratus Technologies.
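As a rough illustration of the arithmetic behind that figure, the short sketch below converts an availability percentage into expected downtime per week. The function name and the extra availability levels in the loop are illustrative assumptions, not figures from the article.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes in a week


def downtime_minutes_per_week(availability_pc: float) -> float:
    """Return the minutes per week a system is unavailable at a given availability."""
    if not 0.0 <= availability_pc <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_WEEK * (100.0 - availability_pc) / 100.0


# Compare 99pc with some higher availability levels (chosen for illustration).
for pc in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_week(pc)
    print(f"{pc}pc available: {minutes:.2f} minutes of downtime per week")
```

At 99pc the result is 100.8 minutes per week; each additional "nine" cuts that figure by a factor of ten.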

There is no shortage of solutions available to ensure minimal disruption if a server fails or you have to recover from a cyber attack. Here is a jargon-busting overview of the best of them.

Back-up and Restores

A standard x86-based server typically stores data on RAID (Redundant Array of Independent Disks) storage devices. The capabilities of x86 servers vary from vendor to vendor, and they support a variety of operating systems and processors. However, a standard x86 server may have only basic backup, data-replication, and failover procedures in place, which leaves it susceptible to catastrophic server failures. A standard server is not designed to prevent downtime or data loss. In the event of a crash, the server stops all processing and users lose access to their applications and information, so data loss is likely. Standard servers do not provide protection for data in transit, which means that if the server goes down, this data is also lost. Though a standard x86 server does not come from its vendor as highly available, there is always the option to add availability software following initial deployment and installation.

High Availability

Traditional high-availability solutions which can bring a system back up quickly are typically based on server clustering: two or more servers that are running with the same configuration and are connected with cluster software to keep the application data updated on both/all servers. Servers (nodes) in a high-availability cluster communicate with each other by continually checking for a heartbeat which confirms other servers in the cluster are up and running. If a server fails, another server in the cluster, designated as the failover server, will automatically take over, ideally with minimal disruption to users.
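The heartbeat check described above can be sketched in a few lines. This is a minimal, simulated two-node example and not how any particular cluster product works: the class names, the three-second timeout, and the rule that the standby must itself be alive before taking over are all assumptions made for illustration. Timestamps are passed in explicitly so the example runs without a network or real clock.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    last_heartbeat: float = 0.0  # time (seconds) the last heartbeat was received


@dataclass
class Cluster:
    primary: Node
    standby: Node  # the designated failover server
    timeout_s: float = 3.0  # assumed threshold for declaring a node down
    active: Node = field(init=False)

    def __post_init__(self) -> None:
        self.active = self.primary

    def record_heartbeat(self, node: Node, now: float) -> None:
        node.last_heartbeat = now

    def _alive(self, node: Node, now: float) -> bool:
        return now - node.last_heartbeat <= self.timeout_s

    def check(self, now: float) -> Node:
        """Fail over to the standby if the primary's heartbeat has gone quiet."""
        if (
            self.active is self.primary
            and not self._alive(self.primary, now)
            and self._alive(self.standby, now)
        ):
            self.active = self.standby
        return self.active


cluster = Cluster(Node("server-a"), Node("server-b"))
cluster.record_heartbeat(cluster.primary, now=0.0)
cluster.record_heartbeat(cluster.standby, now=0.0)
print(cluster.check(now=2.0).name)  # primary heartbeat is recent, so no change

cluster.record_heartbeat(cluster.standby, now=4.0)  # only the standby reports in
print(cluster.check(now=5.0).name)  # primary has been silent for too long
```

Anything the failed server held only in memory is not carried across by a scheme like this, which is why clustering reduces downtime without eliminating it.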

Computers in a cluster are connected by a local area network (LAN) or a wide area network (WAN) and are managed by cluster software. Failover clusters require a storage area network (SAN) to provide the shared access to data required to enable failover capabilities. This means that dedicated shared storage, or redundant connections to the corporate SAN, is also necessary.

While high-availability clusters improve availability, their effectiveness is highly dependent on the skills of specialised IT personnel. Clusters can be complex and time-consuming to deploy and they require programming, testing, and continuous administrative oversight. As a result, the total cost of ownership is often high.

It is also important to note that downtime is not eliminated with high-availability clusters. In the event of a server failure, all users who are currently connected to that server lose their connections, and any data not yet written to the database is lost.

Fault Tolerance

Fault-tolerant solutions are also referred to as continuous availability solutions. A fault-tolerant server provides the highest availability because it has system component redundancy with no single point of failure. This means that end users never experience an interruption in server availability because downtime is pre-empted.

Some 67pc of best-in-class organisations use fault-tolerant servers to provide high availability to at least some of their most critical applications. Fault tolerance is achieved in a server by having a second set of completely redundant hardware components in the system architecture. The server’s software automatically synchronises the replicated components, executing all processing in lockstep so that “in flight” data is always protected. The two sets of CPUs, RAM, motherboards, and power supplies are all processing the same information at the same time. Therefore, if one component fails, its companion component is already there and running, and the system keeps functioning.
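The idea of lockstep processing can be shown with a loose software analogy. Real fault-tolerant servers do this in hardware, so the sketch below is only an illustration under simplifying assumptions: the `Replica` and `LockstepPair` names are hypothetical, and a running total stands in for "in flight" data held in memory.

```python
class Replica:
    """One of two redundant processing units (a hypothetical software stand-in)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.healthy = True
        self.total = 0  # in-memory state, standing in for "in flight" data

    def apply(self, amount: int) -> int:
        self.total += amount
        return self.total


class LockstepPair:
    """Send every operation to both replicas so either can carry on alone."""

    def __init__(self) -> None:
        self.replicas = [Replica("A"), Replica("B")]

    def apply(self, amount: int) -> int:
        # Every healthy replica processes the same input at the same step.
        results = [r.apply(amount) for r in self.replicas if r.healthy]
        if not results:
            raise RuntimeError("both replicas have failed")
        if len(set(results)) != 1:
            raise RuntimeError("replicas have diverged")
        return results[0]


pair = LockstepPair()
pair.apply(10)                    # both replicas now hold a total of 10
pair.replicas[0].healthy = False  # simulate a component failure
print(pair.apply(5))              # the surviving replica continues from the same state
```

Because the second replica has processed every operation already, the final call returns 15 with nothing to recover or replay, which is the contrast with failover in a cluster.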

Fault-tolerant servers also have built-in, fail-safe software technology that detects, isolates, and corrects system problems before they cause downtime. This means that the operating system, middleware, and application software are protected from errors. In-memory data is also constantly protected and maintained.

A fault-tolerant server is managed exactly like a standard server, making the system easy to install, use, and maintain. No software modifications or special configurations are necessary and the sophisticated back-end technology runs in the background, invisible to anyone administering the system.

In business environments where downtime must be kept to an absolute minimum, fault-tolerant systems will give you peace of mind that crucial data is not lost.
