Pages

Thursday, March 23, 2017

Replication Commands That May Cause Blocking

Besides the replication snapshot agent, there are other replication commands that require a Schema Modification (SCH-M) lock for a very short duration on the article(s) of the publication, and they may cause blocking in very busy publishers.

To find out which commands use SCH-M, we can use the lock acquired extended event filtered by the object id of the article (table) in the publication database and the lock mode = SCH-M while executing the most common commands

CREATE EVENT SESSION [replication_sch_m_locks] ON SERVER 
ADD EVENT sqlserver.lock_acquired(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([mode]=(2) AND [package0].[equal_int64]([object_id],(222623836))))
ADD TARGET package0.event_file(SET filename=N'c:\temp\replication_sch_m_locks',max_file_size=(100),max_rollover_files=(10))
WITH (TRACK_CAUSALITY=ON)
GO

Below are the commands that use SCH-M lock on articles (tables)
  • Sp_addarticle @article='article'
  • Sp_changearticle @article='article '
  • Sp_dropsubscription @article='article' 
  • Sp_dropsubscription @article=’All’ 
  • Initialize with a snapshot the articles being added 
For example, to create a publication, we will use sp_addarticle, and it will require SCH-M lock on each article. If we initialize them using the snapshot agent, it will require SCH-M locks on all articles of the publication.

To add articles to an existing publication, we will use sp_addarticle, and it will require SCH-M lock only on the articles being added. If we initialize them using the snapshot agent, it will require SCH-M locks only on the articles being added.

Wednesday, March 22, 2017

Duplicate Key or Row Not Found Errors

For example, we will get this error in the distribution agent

Violation of PRIMARY KEY constraint 'PK_Sales_OrderLines'. Cannot insert duplicate key in object 'Sales.OrderLines'. The duplicate key value is (2). (Source: MSSQLServer, Error number: 2627)

Or

The row was not found at the Subscriber when applying the replicated DELETE command for Table '[Sales].[OrderLines]' with Primary Key(s): [OrderLineID] = 2 (Source: MSSQLServer, Error number: 20568)

The most common reasons for these errors I have found are: duplicate article subscription or another process modifying the data in the subscriber table.

Duplicate article subscription

To find out if it is due to duplicate article subscription, we can run in the publisher database (e.g. WideWorldImporters) the following stored procedure providing the subscriber instance name and the article name from the error message above (e.g. OrderLines)

exec sp_helpsubscription @subscriber = 'InstanceName', @article = 'OrderLines'


We will see that we are subscribing to the same article OrderLines from 2 publications BigTables and SmallTables

Thus a single insert in the table OrderLines in the publisher will be replicated twice to the subscriber generating the duplicate key error. In similar manner, a single delete in the table OrderLines in the publisher will be replicated twice to the subscriber generating the row not found error

To fix it



Another Process Modifying the Data

To find out if it is due to another process modifying the data we can create a SQL Audit on the subscriber table (e.g. OrderLines). We can filter by the account that have access or use public to show all accounts, as shown below.
USE [master]
GO

CREATE SERVER AUDIT [Audit-OrderLines]
TO FILE 
( FILEPATH = N'c:\temp\'
 ,MAXSIZE = 1024 MB
 ,MAX_FILES = 10
 ,RESERVE_DISK_SPACE = OFF
)
WITH
( QUEUE_DELAY = 1000
 ,ON_FAILURE = CONTINUE
 ,AUDIT_GUID = '146fab87-58c8-4521-b4b5-9ffb175f739a'
)
ALTER SERVER AUDIT [Audit-OrderLines] WITH (STATE = ON)
GO

USE [WideWorldImportersDSS]
GO

CREATE DATABASE AUDIT SPECIFICATION [DatabaseAuditSpecification-OrdeLines]
FOR SERVER AUDIT [Audit-OrderLines]
ADD (DELETE ON OBJECT::[Sales].[OrderLines] BY [public]),
ADD (INSERT ON OBJECT::[Sales].[OrderLines] BY [public]),
ADD (UPDATE ON OBJECT::[Sales].[OrderLines] BY [public])
WITH (STATE = ON)
GO

The SQL Audit file will show who, when, and what modified records from the table. In the example below, the first line was a trigger, the second was a stored procedure, the third was an ad-hoc SQL, and the rest of the lines were done by the replication account


To fix it, we need to stop the other processes are modifying data and then resynch the data