Pages

Wednesday, August 15, 2018

The Nulls in the Infamous Error "The Row Was Not Found ..."

While I was troubleshooting an issue, I wondered why there were nulls on the error:

"The row was not found at the Subscriber when applying the replicated (null) command for Table '(null)' with Primary Key(s): (null) (Source: MSSQLServer, Error number: 20598)"

After a little research I found out that since SQL 2012 SP4, sp_MSreplraiserror and sp_MSdel_dbo% have 3 additional parameters used in the raise error 20598 command, and they will show null depending on the SQL version of the subscriber or if the subscriber was upgraded in place

sp_MSreplraiserror

It now expects 3 parameters in the raise error 20598 command, as shown below. It gets called in the subscriber by the delete stored procedure for the respective table, e.g. sp_MSdel_dbotb1. Thus, if the publisher is SQL 2016 SP2 and the subscriber is SQL 2012 SP3, then the error message will show nulls

Before

create procedure sys.sp_MSreplraiserror @errorid int, @param1 sysname = null, @param2 sysname= null
as
    if @errorid = 20508 raiserror (20508, 11, 1)
    ...
    else if @errorid = 20598 raiserror (20598, 16, 1)

After

create procedure sys.sp_MSreplraiserror @errorid int, @param1 sysname = null, @param2 sysname= null, @param3 int = null
as
    if @errorid = 20508 raiserror (20508, 11, 1)
    ...
    else if @errorid = 20598 raiserror (20598, 16, 1, @param3, @param1, @param2)

Workaround

Since it is a system stored procedure there is no other option than upgrading the subscriber

sp_MSdel_dbo%

It now passes values to the 3 parameters in sp_MSreplraiserror, as shown below. Thus, if the subscriber was upgraded in place e.g from SQL 2012 SP3 to SQL 2016 SP2, then the error message with show nulls

Before

create procedure [dbo].[sp_MSdel_dbotb1] (@pkc1 int)
as
begin
delete [dbo].[tb1]
where [sno] = @pkc1
if @@rowcount = 0
    if @@microsoftversion>0x07320000
        exec sp_MSreplraiserror @errorid=20598
end

After

create procedure [dbo].[sp_MSdel_dbotb1] (@pkc1 int)
as
begin
declare @primarykey_text nvarchar(100) = ''
delete [dbo].[tb1]
where [sno] = @pkc1
if @@rowcount = 0
    if @@microsoftversion>0x07320000
    Begin
        if exists (Select * from sys.all_parameters where object_id = OBJECT_ID('sp_MSreplraiserror') and [name] = '@param3')
        Begin
        set @primarykey_text = @primarykey_text + '[sno] = ' + convert(nvarchar(100),@pkc1,1)
        exec sp_MSreplraiserror @errorid=20598, @param1=N'[dbo].[tb1]', @param2=@primarykey_text, @param3=13234
        End
        Else
        exec sp_MSreplraiserror @errorid=20598
    End
end

Workaround

Redeploy the replication stored procedures using sp_scriptpublicationcustomprocs

If the 20598 error message shows nulls, we can still use the transaction sequence number as outlined here to get more info. But it is nice to have the values in the error message to easily find the culprit, example:

"The row was not found at the Subscriber when applying the replicated DELETE command for Table '[dbo].[tb1]' with Primary Key(s): [sno] = 3 (Source: MSSQLServer, Error number: 20598)"