Pages

Friday, May 1, 2015

Troubleshooting Data Collection (MDW) Errors

This is documented in books online

In a nutshell, data collector SSIS package errors are logged in the table sysssislog.

 
select * from msdb..sysssislog where event = 'OnError'

And dcexec.exe errors are logged in the table syscollector_execution_log_internal. We can use the view syscollector_execution_log_full but it is a bit slow to query.

 
select * from msdb.dbo.syscollector_execution_log_internal with(nolock) where failure_message is not null

The most common SSIS errors are shown below and they may cause the dcexec.exe to hang. Thus, those errors can be resolved by stopping the data collection jobs, killing any rogue dcexec.exe processes, deleting all cache files, and restarting the jobs. These tasks were scripted in the previous post.

Connection timeout error

Sysssislog messages:

SSIS Error Code DTS_E_OLEDBERROR.  An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80004005  Description: "Login timeout expired".
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80004005  Description: "Unable to complete login process due to delay in login response".
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80004005  Description: "Shared Memory Provider: Timeout error [258]. ".

Syscollector_execution_log_internal message:

Failed to create kernel event for collection set: {49268954-4FD4-4EB6-AA04-CD59D9BB5714}. Inner Error ------------------>
Cannot create a file when that file already exists.

Deadlock error

Sysssislog messages:

SSIS Error Code DTS_E_OLEDBERROR.  An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80004005  Description: "Unspecified error".
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80004005  Description: "Transaction (Process ID 106) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.".

Syscollector_execution_log_internal message:

Failed to create kernel event for collection set: {2DC02BD6-E230-4C05-8516-4E8C0EF21F95}. Inner Error ------------------>
Cannot create a file when that file already exists.

Corrupt cache file error

Sysssislog messages:

The binary field is too large. The adapter attempted to read a binary field that was 134220032 bytes long, but expected a field no longer than 64 bytes at offset 57711. This usually occurs when the input file is not valid. The file contains a string length that is too large for the buffer column.
SSIS Error Code DTS_E_PRIMEOUTPUTFAILED.  The PrimeOutput method on component "RFS - Read Current Upload Data" (1) returned error code 0x80004005.  The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.  There may be error messages posted before this with more information about the failure.

Syscollector_execution_log_internal message:

SSIS error. Component name: DFT - Upload collection snapshot, Code: -1073450952, Subcomponent: SSIS.Pipeline, Description: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on component "RFS - Read Current Upload Data" (1) returned error code 0x80004005. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code

What does the error “Cannot create a file when that file already exists” mean?

If we enable additional tracing, as shown in the previous post, we can see that there was an unhandled exception pointing to the line in the source code file that does not exist in our machine. However the root cause was the previous error.

DataCollectorController!766c!4318!2015/04/27!09:49:43:: e             ERROR: SSIS Error. Code: -1071636471, Subcomponent: Connection manager "ConfigConnection", Description: SSIS Error Code DTS_E_OLEDBERROR.  An OLE DB error has occurred. Error code: 0x80004005.
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80004005  Description: "Login timeout expired".
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80004005  Description: "Unable to complete login process due to delay in prelogin response".
An OLE DB record is available.  Source: "Microsoft SQL Server Native Client 11.0"  Hresult: 0x80004005  Description: "TCP Provider: Timeout error [258]. ".

DataCollectorController!766c!1208!2015/04/28!05:05:20:: e          ERROR: Generic Error: retCode=0x5, function CDataCollectorController::ExecuteMasterPackage, line 1164, file e:\sql11_main_t\sql\mpu\shared\dc\runtime\controller\src\datacollectorcontroller.cpp
DataCollectorController!766c!1208!2015/04/28!05:05:20:: e          ERROR: The master package exited with error, previous error messages should explain the cause.

DataCollectorController!766c!7e18!2015/04/28!05:05:30:: e        ERROR: Win32 Error: GetLastError=183, retCode=0x4, function CDataCollectorController::CreateControlEvents, line 476, file e:\sql11_main_t\sql\mpu\shared\dc\runtime\controller\src\datacollectorcontroller.cpp
DataCollectorController!766c!7e18!2015/04/28!05:05:30:: e        ERROR: Failed to create kernel event for collection set: {49268954-4FD4-4EB6-AA04-CD59D9BB5714}. Inner Error ------------------>
Cannot create a file when that file already exists.