Pages

Friday, May 8, 2015

Data Collection (MDW) Errors after Mirror or Availability Group Failover

Symptoms

After mirror or availability group failover, we get the following errors on the data collection jobs on the mirror or secondary server

Sysssislog messages:

SSIS Error Code DTS_E_OLEDBERROR.  An OLE DB error has occurred. Error code: 0x80040E14.
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80040E14  Description: "The target database, 'xxxx', is participating in an availability group and is currently not accessible for queries. Either data movement is suspended or the availability replica is not enabled for read access. To allow read-only access to this and other databases in the availability group, enable read access to one or more secondary availability replicas in the group.  For more information, see the ALTER AVAILABILITY GROUP statement in SQL Server Books Online.".

OLE DB error occurred while fetching parameterized rowset. Check SQLCommand and SqlCommandParam properties.

SSIS Error Code DTS_E_PROCESSINPUTFAILED.  The ProcessInput method on component "LKUP - Look up query plans on target server" (2) failed with error code 0xC0208253 while processing input "Lookup Input" (16). The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.  There may be error messages posted before this with more information about the failure.

Root Cause 

When the collector finds an “interesting” query from sys.dm_exec_query_stats, it calls the stored procedure msdb.dbo.sp_syscollector_text_query_plan_lookup to get the query plan from cache using sys.dm_exec_text_query_plan. That function will error out if the plan references a database that has the role mirror or secondary. As explained in this blog, a failover does not flush the plan cache.

Workaround

We can add error handling to the procedure to continue when the respective error occurs

 
USE [msdb]
GO
ALTER PROCEDURE [dbo].[sp_syscollector_text_query_plan_lookpup]
    @plan_handle varbinary(64),
    @statement_start_offset int,
    @statement_end_offset int
AS
BEGIN
    SET NOCOUNT ON
 BEGIN TRY
  SELECT    
   @plan_handle AS plan_handle,
   @statement_start_offset AS statement_start_offset,
   @statement_end_offset AS statement_end_offset,
   [dbid] AS database_id,
   [objectid] AS object_id,
   OBJECT_NAME(objectid, dbid) AS object_name,
   [query_plan] AS query_plan
  FROM    
   [sys].[dm_exec_text_query_plan](@plan_handle, @statement_start_offset, @statement_end_offset) dm
 END TRY
 BEGIN CATCH
  DECLARE @ErrorNumber INT = ERROR_NUMBER();
  DECLARE @ErrorSeverity INT = ERROR_SEVERITY();
  DECLARE @ErrorState INT = ERROR_STATE();
  DECLARE @ErrorMessage NVARCHAR(4000) = ERROR_MESSAGE();
  --when a plan in cache refereces a db with role = mirror or role = secondary, dm_exec_text_query_plan will return this error:
  --Msg 954, Level 14, State 1, Procedure sp_syscollector_text_query_plan_lookpup, Line 8
  --The database "xxxx" cannot be opened. It is acting as a mirror database.
  --Msg 976, Level 14, State 1, Line 5
  --The target database, 'xxxx', is participating in an availability group and is currently not accessible for queries. Either data movement is suspended or the availability replica is not enabled for read access. To allow read-only access to this and other databases in the availability group, enable read access to one or more secondary availability replicas in the group.  For more information, see the ALTER AVAILABILITY GROUP statement in SQL Server Books Online.
  IF @ErrorNumber <> 954 and @ErrorNumber <> 976
   RAISERROR(@ErrorMessage, @ErrorSeverity, @ErrorState);
 END CATCH
END

Connect item link