MSDTC changes in WS2008, Part 2: Cluster requirements

As a preamble, I have to point out that most customers have little to no trouble with MSDTC in a cluster.  It has been a great solution for those that need transactional consistency and high availability.  Nothing in this description should suggest otherwise.

 

However, one fact of life is that Windows shows the power of large numbers.  We have vastly more users than I've encountered at any previous company.  Because of that, we can see trends and identify opportunities for improvement much more easily than I could before.

 

The clustering changes came about as a response to a number of issues that we'd been seeing with MSDTC in a cluster.  They basically fell into three categories:

 

  • Complexity of setting up a cluster.  Setting up a cluster with MSDTC was complex -- it involved a number of steps, those steps were fragile in how they interacted with setting up other cluster resources, such as SQL Server, and they required that the system actually degrade in capability as an initial step.

    To put it another way, setting up a cluster first disabled the MSDTC instances you did have, then you had to know to create a cluster MSDTC instance, and then you could add in new transaction-aware cluster resources (e.g. SQL Server).  The fewer steps and the fewer ordering dependencies that we could get to, the better.
     
  • Asymmetric performance patterns.  Once the cluster was correctly set up, there was a second surprise.  As the cluster lived through node failures, and depending on the application design, the performance of e.g. SQL Server or MSMQ could change significantly from one failover point to the next.

    We all know that full distributed transactions have overhead, in some cases quite considerable overhead.  That's why so much of the industry's work is, and has been, on optimizations.  In the case of MSDTC, performance is higher when MSDTC is on the same node as the application and / or resource managers than when it is on a different node.

    Given that there is one MSDTC cluster instance, any reasonably complex configuration will have SQL or MSMQ instances that move from being remote to that MSDTC instance to being on the same system as MSDTC, and then back.  As that happens, the performance profile for their interactions with MSDTC changes significantly.

    This means that you should do your performance planning against the case when MSDTC is located on another node from the application and resource managers.
     
  • Migration of nodes into and out of a cluster.  Note above that the local MSDTC is disabled upon entry into a cluster, and the MSDTC cluster instance replaces the role of that local MSDTC.  This has a direct implication that when a node joins a cluster it needs to be fully recovered, and the same is true when it permanently leaves.

    Frequently this isn't a problem when joining, if you assume that nodes that are put into a cluster are pretty much idle nodes anyway.  It is only in more dynamic load management cases that this one could be a concern.
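The asymmetric performance point above can be made concrete with a small model.  This is only a sketch: the node names, resource names, and failover moves are hypothetical, and it tracks nothing but whether each resource manager is on the same node as the single MSDTC cluster instance.

```python
# Illustrative model only: node names, resource names, and the failover
# moves below are hypothetical, not taken from a real cluster.

def colocation(placement, dtc="MSDTC"):
    """Label each non-MSDTC resource 'local' or 'remote' relative to MSDTC."""
    dtc_node = placement[dtc]
    return {
        name: ("local" if node == dtc_node else "remote")
        for name, node in placement.items()
        if name != dtc
    }

def move(placement, moves):
    """Return a new placement with the given resources moved to new nodes."""
    updated = dict(placement)
    updated.update(moves)
    return updated

# One clustered MSDTC instance shared by two SQL Server instances.
initial = {"MSDTC": "node1", "SQL-A": "node1", "SQL-B": "node2"}

# Assume node1 fails and its two groups land on different surviving nodes.
after_failover = move(initial, {"MSDTC": "node2", "SQL-A": "node3"})

print("before:", colocation(initial))
print("after: ", colocation(after_failover))
```

In this made-up sequence SQL-A goes from local to remote and SQL-B goes from remote to local, even though neither application changed.  That flip is the reason to plan for the remote case.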

 

 

At the root of each of these points was a longstanding design decision to have only one active MSDTC instance in a cluster -- and it had to be a cluster resource.  This was the cluster representation of the assumption that topologies were pretty much fixed: that a resource manager would be in a configuration where it always used the same transaction manager, and likewise for the applications.

 

And to be clear, this wasn't a bad assumption for how systems have actually worked.  It reflected how systems were configured, how COM+ was configured, and how the hardware tended to be deployed.

 

Beginning around .NET 2.0, though, we began to explore ways to remove or reduce this assumption.  This was driven by an expectation that we'd be seeing increased mobility of applications, and a desire to be able to isolate one application, or one resource, from another.

 

First, we had to deal with the assumption as present in the MSDTC proxy.  It would bind to the MSDTC instance specified by the first call in that process.  Until the last few years, you've not been able to find out which one that was.  Now you can for any version that supports the ITmNodeName interface.

 

Next, in Vista we changed the proxy further to bind to multiple MSDTC instances, one binding per unique MSDTC name specified.  With this in place, you could change MSDTC instances on a per-transaction basis, both in normal operation and during recovery.
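As a rough way to picture the two proxy behaviors, here is a toy model.  The class and method names are hypothetical and are not the MSDTC proxy API.  It also assumes that, in the older behavior, a later call naming a different instance simply gets the first binding; what actually happened in that case isn't covered here.

```python
class FirstCallProxy:
    """Older behavior: the process binds to the instance named by the first call.

    Assumption for this sketch: later calls reuse that binding regardless
    of the name they pass.
    """

    def __init__(self):
        self.bound_name = None

    def get_instance(self, name):
        if self.bound_name is None:
            self.bound_name = name  # first caller in the process wins
        return self.bound_name


class PerNameProxy:
    """Vista-era behavior: one binding per unique MSDTC name."""

    def __init__(self):
        self.bindings = {}

    def get_instance(self, name):
        # Bind once per unique name; later calls reuse that binding.
        return self.bindings.setdefault(name, name)


old = FirstCallProxy()
old_results = [old.get_instance("dtc-a"), old.get_instance("dtc-b")]

new = PerNameProxy()
new_results = [new.get_instance("dtc-a"), new.get_instance("dtc-b"),
               new.get_instance("dtc-a")]

print(old_results)
print(new_results, len(new.bindings))
```

With per-name bindings, the instance becomes a choice made per transaction rather than once per process, which is what the cluster work needed.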

 

This was a sufficient basis to then begin working through the various design issues to handle multiple MSDTC instances in a cluster.  Interestingly, most came from configuration choices or management operations.  On the other hand, the protocol had sufficiently abstracted the idea of an MSDTC instance that I can't remember any protocol changes that were needed specifically for this, although I'll admit that I've not checked recently.

 

The design issues that I remember being at the top were:

 

  • What is a 'good' MSDTC instance to pick?  If I take an application or resource running today, it won't specify which MSDTC instance to use.  I don't want to just use a cluster-wide default, or I'll get largely what I get today -- one cluster instance handling everything.  So, how should a particular instance be picked?
     
  • What does the cluster default MSDTC instance mean?  Is it anything more than an instance that happens to have the same name as the cluster alias, or is it deeper than that?
     
  • What configuration needs to be consistent across all MSDTC instances, and what should be specific to a given instance?  Also, much more mundanely, where should all this configuration data go?
     
  • What are the management changes to support these configurations?  What are the command line changes, and what are the MMC plugin changes?
     

I'll pick up with how we approached these questions in part 3.

 

Jim.


Posted Mar 16 2008, 09:24 AM by jim-johnson

Comments

Vijay Srinivasan wrote re: MSDTC changes in WS2008, Part 2: Cluster requirements
on 03-18-2008 1:34 AM
Jim,
Very, very interesting and thought-provoking observations on how MS-DTC clustering works and how it scales well when the RM and the MSDTC cluster instance are hosted on the same node . . .

While I have not worked hands-on with MS-DTC clustering to get a first-hand "feel" for the scalability, I was wondering what your experience has been when an MS-DTC cluster instance had to handshake with another TM (like XA), with the root of the commit tree still being MS-DTC, either through Host Integration Server or by other native means. Waiting to hear and learn from your experiences. Thanks -
Jim Johnson wrote re: MSDTC changes in WS2008, Part 2: Cluster requirements
on 03-19-2008 8:44 AM
Vijay,

Thanks for your comment. I'm glad that you're finding the entries interesting. Fyi, I should be posting part 3 this weekend.

I've not personally used HIS directly, but I have worked with tests that either drive an XA resource, or are driven by an XA TM (there might be an interesting article or two in how those configurations work…)

While there is some additional logic around translating back and forth, the overall topology is pretty much the same as any other distributed transaction. I've not measured it, so I can't say that the performance is the same. All I can say is that it doesn't add any new remote connections, which is what I was trying to point out around the cluster behavior.

Yes, this would be a basis for an interesting article. I'll put it on the list :)

Thanks,
Jim.
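Freedom, Innovation, Research, Exploration…… wrote MSDTC improvements in Windows Server 2008
on 06-29-2008 7:15 AM

Transaction processing is essential infrastructure for enterprise development, and Windows Server 2008 improves on it considerably, in both development and configuration management. There are several articles about Windows Server 2008's...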
