Troubleshooting Transactional Replication In MS SQL Server


Overview

Replication in SQL Server can be defined as the technique of copying and distributing data from one database to another database and then synchronizing between databases to maintain consistency. It helps in improving performance, easy maintenance reducing conflicts when multiple users at different locations are involved and in improving availability. However, SQL Server replication has some weak points. In the article, we will be discussing about the causes of replication failure and its possible solutions that will help in troubleshooting SQL Server Replication.

SQL Server Replication

Replication is the process of copying data from one database to another database. Mirroring process involves two copies of a database at different locations while Replication maintains some part of database like table or views at user’s location and the modifications are synchronized to main server later. There are a number of advantages and disadvantages of replication in SQL Server. Replication involves some important entities for replication process that are

  • Publisher: Source database which needs to replicate the data
  • Distributor: Database for temporarily storing the replicated data
  • Subscriber: Destination database that consumes the replicated data
  • Article: Database objects involved in replication like table, views

Types of SQL Server Replication

  • Snapshot Replication: It copies data exactly as it appears at a current moment of time and does not monitor for data updates. The entire snapshot is generated when synchronization occurs and sent to subscribers. Used for less data changes, replication small data etc.
  • Transactional Replication: It is an incremental flow of data from Publisher to Subscriber. Any data changes at Publisher will be delivered to subscriber as they occur and are processed in the order they were made on Publisher. Used for server-server environments, incremental changes etc.
  • Merge Replication: It is the only bidirectional replication and subsequent data changes & schema modifications made at publisher & subscriber are tracked with triggers. Subscriber synchronizes Publisher when connected to network & exchanges all rows that have been modified since last synchronization took place.

SQL Server Replication Subscriptions

In SQL Server Replication, Subscriptions can be of two types- push and pull subscriptions.

In push subscriptions, the log reader agent on publisher scans the transaction log of database containing articles for replication to check which transactions are to be replicated and distribution agent send the transaction to distributor, which in turn will forward transactions to subscription database where they are applied.

In pull subscriptions, the subscriber through distribution agent will periodically check for any unapplied transactions and apply it to subscription database after gathering transactions. Distribution database will be updated to show transactions have been applied.

Troubleshooting SQL Server Replication issues

Most of the SQL Server Replication problems occur when the data in subscriber is not synchronized with data in publication base tables, data is not delivered to subscribers, and data in subscriber & publisher does not match etc. Some common causes of replication failure are latency, stalled agents, and failed jobs. Since the transactional replication is the most common model of replication, we will focus more on transactional replication, replication issues, and possible solutions to troubleshooting SQL Server replication problems.

The most common problem in replication is latency that means delay between publisher and subscriber to update the changes. Most of the time, it has been reported that there is high latency in replication. Latency is caused by resource conflicts, geographical distance, network traffic, and transactional load on publisher. It is necessary to monitor the latency and keep it within a certain threshold or alert if it exceeds the defined threshold. One replication tool to monitor the latency is ‘Replication Monitor’. Although it is useful as a guideline, it is not considered much effective as it affect the replication state due to its way of detecting resource conflicts. The Replication Monitor has drilldown dialogs that enable the DBA to check the number of unapplied transactions, gives the view of latency across all publications. The Replication monitor can be accessed by right-click on ‘Replication’ and select ‘Launch Replication Monitor’.

Troubleshooting SQL Server Replication

Alternatively, latency can be measured using T-SQL and by using tracer tokens. The DBA can find the delay between one part of replication process and another by inserting tokens into replication process. Using delay, DBA can create procedures that will monitor latency and auto-generate alert when problems are detected. In this method, tracer tokens are inserted & measured at publisher and it uses four stored procedures:

  • sp_posttracertoken: it posts a tracer token into replication flow.
  • sp_helptracertokens: it gives the list of active tracer tokens that are posted.
  • sp_helptracertokenhistory: it gives information about latency after token ID is given and publications as parameters.
  • sp_deletetracertokenhistory: it deletes tracer token after giving token ID and publications as parameters.

Conclusion

In the article, we have discussed the SQL Server Replication in depth with some of the common issues related to Replication. We came to know that Latency is one important cause for replication failure. It is important to keep the latency within a threshold and alert needs to be generated automatically when it exceeds the threshold. Some Replication tools for troubleshooting SQL Server Replication has been described like Replication Monitor, Log reader agent, tracer tokens that are used for calculating latency.