
Monday, January 15, 2018

Distribution Cleanup Job Blocking

The distribution cleanup job may block the log reader agent for several minutes when there is a large number of rows to keep and a "delete TOP N" statement finds only N-1 rows to delete.

The Delete TOP N Behavior

When there are N or more qualifying rows, delete TOP N stops scanning the range as soon as N rows are found.
When there are fewer than N qualifying rows (e.g. N-1), delete TOP N scans the entire range. If the range contains a very large number of rows, the scan can take a long time, and the locks are held until the scan finishes, which can cause blocking.
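The behavior can be reproduced outside of replication whenever the delete predicate is residual, i.e. evaluated during the scan rather than satisfied by an index seek. A minimal sketch (the table, column names, and row counts are illustrative, not part of the replication schema):

```sql
-- Illustrative repro: 'flag' is not indexed, so it is evaluated as a
-- residual predicate while the clustered index is scanned.
CREATE TABLE dbo.demo
(
    id      INT IDENTITY PRIMARY KEY,
    flag    BIT NOT NULL,
    payload CHAR(200) NOT NULL DEFAULT 'x'
);
GO
INSERT dbo.demo (flag)
SELECT TOP (1000000) 0
FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b;

UPDATE dbo.demo SET flag = 1 WHERE id <= 2000;   -- exactly 2,000 qualifying rows
GO
SET STATISTICS IO ON;

-- 2,000 qualifying rows exist: the scan stops as soon as TOP(2000) is satisfied.
DELETE TOP (2000) FROM dbo.demo WHERE flag = 1;

-- No qualifying rows remain: TOP(2000) is never satisfied, so the scan reads
-- the entire table before the statement completes and releases its locks.
DELETE TOP (2000) FROM dbo.demo WHERE flag = 1;
```

Comparing the SET STATISTICS IO output for the two deletes shows the second statement reading the whole table even though it deletes nothing.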

For example, the distribution cleanup job calls the delete statements in a "WHILE 1 = 1" loop from the following stored procedures:

-- sp_MSdelete_publisherdb_trans line 202:
delete TOP(2000) MSrepl_commands WITH (PAGLOCK) from MSrepl_commands with (INDEX(ucMSrepl_commands))

-- sp_MSdelete_dodelete line 38:
delete TOP(5000) MSrepl_transactions WITH (PAGLOCK) from MSrepl_transactions with (INDEX(ucMSrepl_transactions))

The following extended event session can be used to capture the query plans for lines 202 and 38 of the respective stored procedures:

CREATE EVENT SESSION [distribution_cleanup] ON SERVER 
ADD EVENT sqlserver.query_post_execution_showplan(
    ACTION(sqlserver.query_plan_hash,sqlserver.tsql_stack)
    WHERE (([object_name]=N'sp_MSdelete_publisherdb_trans' 
 AND [sqlserver].[like_i_sql_unicode_string]([sqlserver].[tsql_stack],N'%frame level=''1''%line=''202''%frame level=''2''%') 
 OR ([object_name]=N'sp_MSdelete_dodelete'
 AND [sqlserver].[like_i_sql_unicode_string]([sqlserver].[tsql_stack],N'%frame level=''1''%line=''38''%frame level=''2''%'))))),
ADD EVENT sqlserver.sp_statement_completed(SET collect_object_name=(1),collect_statement=(1)
    ACTION(sqlserver.query_plan_hash,sqlserver.tsql_stack)
    WHERE (([object_name]=N'sp_MSdelete_publisherdb_trans' 
 AND [sqlserver].[like_i_sql_unicode_string]([sqlserver].[tsql_stack],N'%frame level=''1''%line=''202''%frame level=''2''%') 
 OR ([object_name]=N'sp_MSdelete_dodelete'
 AND [sqlserver].[like_i_sql_unicode_string]([sqlserver].[tsql_stack],N'%frame level=''1''%line=''38''%frame level=''2''%')))))
ADD TARGET package0.event_file(SET filename=N'c:\temp\distribution_cleanup.xel')
WITH (TRACK_CAUSALITY=ON)
GO

In this particular case, the table MSrepl_commands had about 600 million rows. Each iteration deleted 2,000 rows in a few milliseconds, but the last iteration deleted 1,352 rows and took 19 minutes.


The following query can be used to extract the query plans from the .xel file, so we can use SSMS to visualize them:

SELECT event_data = CONVERT(XML, event_data)
INTO #tmp
FROM sys.fn_xe_file_target_read_file( N'C:\temp\distribution_cleanup_0_131593922293630000*.xel', NULL, NULL, NULL);

SELECT 
t.event_data.value('(/event/@timestamp)[1]', 'datetime2') AS event_time
,t.event_data.value('(/event/data[@name="duration"]/value)[1]', 'bigint') AS duration
,t.event_data.value('(/event/data[@name="object_name"]/value)[1]', 'varchar(255)') AS object_name
,actual_plan = x.xml_fragment.query('.')
FROM #tmp AS t
CROSS APPLY t.event_data.nodes(N'/event/data[@name="showplan_xml"]/value/*')
AS x(xml_fragment)
order by event_time 
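With the events loaded into #tmp by the query above, the sp_statement_completed rows can also be aggregated to find the slow iteration without opening each plan (the duration column is reported in microseconds):

```sql
-- Summarize delete statement durations per procedure.
-- XML methods cannot appear in GROUP BY, so shred first in a derived table.
SELECT s.object_name,
       executions      = COUNT(*),
       max_duration_us = MAX(s.duration),
       avg_duration_us = AVG(s.duration)
FROM (
    SELECT object_name = t.event_data.value('(/event/data[@name="object_name"]/value)[1]', 'varchar(255)'),
           duration    = t.event_data.value('(/event/data[@name="duration"]/value)[1]', 'bigint')
    FROM #tmp AS t
    WHERE t.event_data.value('(/event/@name)[1]', 'varchar(64)') = 'sp_statement_completed'
) AS s
GROUP BY s.object_name;
```

A single iteration with a max duration orders of magnitude above the average is the N-1 case described above.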


Comparing the query plan that deletes 2,000 rows with the plan that deletes 1,352 rows, the former reads 2,000 rows while the latter reads more than 200 million rows due to the residual predicate.


This explains why the lock is held for a long time. The blocking itself is explained by the clustered index definition shown below.

The Clustered Index

Both tables MSrepl_commands and MSrepl_transactions have a similar clustered index with publisher_database_id as the leading key column, so rows are stored in order by publisher_database_id. An example for MSrepl_transactions is shown below.
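The clustered index key columns can be verified with a catalog query along these lines against the distribution database:

```sql
-- List the key columns of the clustered index on MSrepl_transactions
USE distribution;
SELECT i.name AS index_name,
       ic.key_ordinal,
       c.name AS column_name
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID('dbo.MSrepl_transactions')
  AND i.type = 1            -- clustered index only
  AND ic.key_ordinal > 0    -- key columns only
ORDER BY ic.key_ordinal;
```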


If there is more than one publisher_database_id, then the last page of the first publisher_database_id range contains not only the newest rows for the first publisher_database_id but also the oldest rows for the second publisher_database_id. Consequently, the delete with (PAGLOCK) of the oldest rows for the second publisher_database_id will block the insert of new rows by the log reader for the first publisher_database_id.

Example

In my test environment, two publisher databases, pub1 and pub2, have publisher_database_id = 3 and publisher_database_id = 4 respectively.

-- create databases
use master
create database pub1
create database pub2
go

-- create tables
use pub1
create table test1(a int not null identity, b datetime, c varchar(255)
constraint test1_pk primary key (a)
)
use pub2
create table test2(a int not null identity, b datetime, c varchar(255)
constraint test2_pk primary key (a)
)
GO

-- Adding the transactional publication with immediate synch
use [pub1]
exec sp_replicationdboption @dbname = N'pub1', @optname = N'publish', @value = N'true'
exec sp_addpublication @publication = N'test_pub1', @allow_anonymous = N'true', @immediate_sync = N'true', @independent_agent = N'true', @status = N'active'
exec sp_addpublication_snapshot @publication = N'test_pub1', @frequency_type = 1
exec sp_addarticle @publication = N'test_pub1', @article = N'test1', @source_owner = N'dbo', @source_object = N'test1'
GO

-- Adding the transactional publication with immediate synch
use [pub2]
exec sp_replicationdboption @dbname = N'pub2', @optname = N'publish', @value = N'true'
exec sp_addpublication @publication = N'test_pub2', @allow_anonymous = N'true', @immediate_sync = N'true', @independent_agent = N'true', @status = N'active'
exec sp_addpublication_snapshot @publication = N'test_pub2', @frequency_type = 1
exec sp_addarticle @publication = N'test_pub2', @article = N'test2', @source_owner = N'dbo', @source_object = N'test2'
GO

-- disable distribution cleanup job
exec msdb..sp_update_job @job_name = 'Distribution clean up: distribution', @enabled = 0

-- add 5999 rows at 2:43 PM
use pub2
insert test2(b,c)
values (getdate(),replicate('x',200))
go 5999

-- add 21000 rows at 3:44 PM (one hour later)
use pub1
insert test1(b,c)
values (getdate(),replicate('x',200))
go 1000

use pub2
insert test2(b,c)
values (getdate(),replicate('x',200))
go 20000

-- verify there are 26,999 rows
use distribution
select count(*) from MSrepl_transactions
select count(*) from MSrepl_commands

-- get root page
select allocated_page_file_id, allocated_page_page_id, next_page_page_id, previous_page_page_id, page_level 
from sys.dm_db_database_page_allocations(db_id(),object_id('MSrepl_transactions'),1,null,'DETAILED') 
where page_type_desc is not null and page_type_desc = 'INDEX_PAGE'


DBCC TRACEON(3604)
DBCC PAGE (distribution, 1, 72305, 3);
DBCC TRACEOFF(3604)



The last page for the publisher_database_id = 3 range is 72702, and the first page for the publisher_database_id = 4 range is 72629.

To simulate a long-running delete, we can execute the distribution cleanup procedure inside an open transaction:

use distribution
begin tran
EXEC dbo.sp_MSdistribution_cleanup @min_distretention = 0, @max_distretention = 1

Removed 5999 replicated transactions consisting of 5999 statements in 1 seconds (11998 rows/sec).

The cleanup job removed the 5,999 rows we added one hour ago for publisher_database_id = 4 and held an X lock on page 72702, which contains the oldest rows for publisher_database_id = 4. We can see the locks using this query:

select * from sys.dm_tran_locks where request_session_id = 68 and request_mode = 'X'



In another session, we can execute the following script to add new rows to publisher_database_id = 3 (pub1)

use pub1
insert test1(b,c)
values (getdate(),replicate('x',200))
go 

The log reader agent for publisher_database_id = 3 is blocked by the delete for publisher_database_id = 4, as shown in the sp_WhoIsActive output, because page 72702 also contains the newest rows for publisher_database_id = 3.


Workaround

We can avoid having "delete TOP N" delete fewer than N rows, and thus prevent it from reading millions of rows, by modifying the delete loop in the sp_MSdelete_publisherdb_trans stored procedure to iterate only a whole multiple of N times.

Code snippet for the MSrepl_commands table:

select @max = count(*)/2000 from MSrepl_commands with (INDEX(ucMSrepl_commands), nolock) where
 publisher_database_id = @publisher_database_id and
 xact_seqno <= @max_xact_seqno and
 (type & ~@snapshot_bit) not in (@directory_type, @alt_directory_type) and
 (type & ~@replpost_bit) <> @scriptexec_type
 OPTION (MAXDOP 4)

set @i = 1  -- @i and @max are ints declared earlier in the procedure; the remainder (< 2,000 rows) is left for the next cleanup run

WHILE @i <= @max
BEGIN

 DELETE TOP(2000) MSrepl_commands WITH (PAGLOCK) from MSrepl_commands with (INDEX(ucMSrepl_commands)) where
  publisher_database_id = @publisher_database_id and
  xact_seqno <= @max_xact_seqno and
  (type & ~@snapshot_bit) not in (@directory_type, @alt_directory_type) and
  (type & ~@replpost_bit) <> @scriptexec_type
  OPTION (MAXDOP 1)
  
 select @row_count = @@rowcount
 -- Update output parameter
 select @num_commands = @num_commands + @row_count
    
 set @i=@i+1
END

Code snippet for the MSrepl_transactions table:

select @max = count(*)/5000 from MSrepl_transactions with (INDEX(ucMSrepl_transactions), nolock) where
 publisher_database_id = @publisher_database_id and
 xact_seqno <= @max_xact_seqno and
 xact_seqno <> @last_xact_seqno and
 xact_seqno <> @last_log_xact_seqno
 OPTION (MAXDOP 4)

set @i = 1  -- @i and @max are ints declared earlier in the procedure; the remainder (< 5,000 rows) is left for the next cleanup run

WHILE @i <= @max
BEGIN
 exec dbo.sp_MSdelete_dodelete @publisher_database_id, 
  @max_xact_seqno, 
  @last_xact_seqno, 
  @last_log_xact_seqno,
  @has_immediate_sync


 select @row_count = @@rowcount

 -- Update output parameter
 select @num_transactions = @num_transactions + @row_count
   
 set @i=@i+1
END

Note that Microsoft doesn't support modifying system stored procedures, so use this workaround at your own risk.