Skip to content

Microsoft Failover Clustering / Cluster Service and SQL Server 2005

June 11, 2009

Wait…failover cluster?  I’m a DBA; not a server guy!

That’s what I said when I started seeing DBA positions open up with a requirement of clustering experience.  I didn’t think too much more about it until I started reading about it in Microsoft’s 70-445 training kit on SQL 2005 Business Intelligence.  I had a few simple questions about how it worked, namely does the cluster use native disks on the server boxes or a SAN?  I asked two of my server buddies, but they didn’t really know for sure, as they have never worked with the clustering service.  Looks like another job for Supergoogleman!

Taking a step back – Tiered Distribution

Being a guy who asks ‘why’ or ‘how does it work?’ almost to the point of annoyance (Hey I’m an analyst!  Sue me!), I dug really far back into how modern infrastructure got to the point of failover clusters, etc.  Sure I have a 4 year Information Systems BS, but it really doesn’t teach you a whole lot about the ‘why’.  Additionally since the ‘why’ is not clearly stated, in my experience it takes some real world experience to actually figure out the whole ‘why’ part, and when that happens it’s like a giant lightbulb goes on and you’re saying to yourself, “It’s so simple!” I posted the background work I did on Tiered Architectures in this post.

How many different server cluster types are there?

I found at least four types, but the types we will be looking at in more detail are Load-Balanced Clusters and High-Availability Clusters (or Failover Clusters).

Load-Balanced Clusters

Load-Balanced Cluster

Load-Balanced Cluster on Application Tier

Load-Balanced Clusters are usually implemented on web and/or application tiers. This is because typically a web site or portal’s web pages rarely change during a session and most often application servers are working with static or very similar files. Any of the servers in the cluster can handle a client request, or they can all take a piece of the work.

Load-balancing is a bit outside the scope of this post, which you will see when we talk about Failover Clustering. If you would like to read more on load-balancing, check out this Microsoft best practices page.

Failover Clusters

Failover Cluster

Failover Cluster

Failover clustering is what SQL Server 2005 and up uses to greatly increase availability. Basically how a failover cluster works is a primary server handles all the database requests, while the failover server(s) sits idle. If the primary server fails, the cluster failsover to one of the failover servers.

How does it know to failover? Well in a nutshell if the primary server goes down for whatever reason, the failover server will notice a lapse in connectivity with the primary server. The failover server then checks with the witness (a witness acts as a third party watchdog for the system) to see if the witness has also lost connectivity with the primary server (hey maybe the failover is the one who’s lost its connectivity with the cluster!). If the witness too has lost connectivity with the primary server, the failover server then takes over processing requests from the application tier.

Why not just use a load-balancer, so failover servers aren’t sitting there idle?

Good question. The reason is if two instances of SQL Server are working with the same data files, you would quickly find your data corrupted and deadlocked as each instance tries to work with the physical database simultaneously.

Now we can answer the original question

Does the cluster use disks native to the server or a SAN?

The answer is either, actually.

Failover Cluster using SAN

Failover Cluster using SAN

The most common setup for SQL Server on a failover cluster is to install it using a SAN. In a 2 node cluster, you would install SQL Server on two duplicate servers, install the witness on one LUN on the SAN and the database files on another LUN. The primary, failover and witness LUN make up the cluster quorum. The quorum desides by vote which is the active server (as described above). Usually this will be the primary, so long as it is working properly and has working connections.

Failover Cluster with Witness

Failover Cluster with Witness

Another option is to use a witness server instead of a SAN. In this setup, you install SQL Server on two duplicate servers just as you would with the SAN setup, except instead of using a LUN to set up the witness, you setup the witness on a third server which is also running the same version of SQL Server. The data files are kept on each server, mirrored using log shipping. The witness does not keep any data or handle any requests by the application tier. It is there only to be a part of the quorum. In this set up, if the primary or failover server go down, the sibling server will begin to save up its logs to ship to the downed server once it comes back up.

The witness server solution is a great solution for businesses which don’t have dedicated SANs or in some cases, for whatever reason, the servers need to be located in two different facilities. You can do the same thing with 2 SANs in different locations, although this is a much more complicated setup.

Why use a failover cluster for SQL Server?

There are many benefits to installing SQL Server on a failover cluster, some of which are:

  • High-Availability – If the primary server fails, no downtime will be experienced by the users
  • Maintenance – A node can be taken down for maintenance without a lapse in service – This can be extremely helpful in a solution which must be up 24/7
  • The failover server can also be your testing environment – This has it serving dual purposes – You can test on the failover server as you normally would in any system, but the test server also now acts as a backup should the production server fail

Conclusion

I’ve learned a lot about using failover clustering with SQL Server. It sounds like a great way to increase availability while offering many other benefits as well. And now I know you can use either a SAN or the native disks to host data in a failover cluster. Whew! Quite a bit of research to answer that question!

I hope this information has been helpful.

Additional Information

Microsoft TechNet: Configuring a two-node failover cluster
Overview of Failover Clustering with Windows Server 2008
Why you should use Microsoft Cluster Service (MSCS)

Advertisements

From → Administration

Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: