Using .NET, we have to write more code, but it is faster. Using -OutputAs DataTable, we can write much less code, but it takes longer because the whole result set has to be loaded into memory first. Note that -OutputAs DataRows is the default, and it is slower still because it copies one row at a time.
Repro
On server1, create a view to easily generate 6.4 million rows
create view my_spt_values as select c1.* from spt_values c1 cross join spt_values c2
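A quick sanity check (optional, and assuming the view was created in master on server1) confirms the row count the cross join produces:

# Count the rows the view returns; expect roughly 6.4 million
Invoke-Sqlcmd -ServerInstance server1 -Database master -Query "select count(*) as row_count from my_spt_values"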
On server2, create a stage table
select * into test_stage from spt_values where 1=2
Create the scripts below to copy from server1 to server2
Sqlclient.ps1
$conTargetString = "Data Source=server2;Initial Catalog=mydb;Integrated Security=True"
$conTarget = New-Object System.Data.SqlClient.SqlConnection($conTargetString)
$conTarget.Open()

# Bulk copy into the destination with a table lock
$conBulk = New-Object System.Data.SqlClient.SqlBulkCopy($conTargetString, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock)
$conBulk.DestinationTableName = 'test_stage'
$conBulk.BatchSize = 10000000

# Empty the staging table first
$cmdTarget = New-Object System.Data.SqlClient.SqlCommand
$cmdTarget.Connection = $conTarget
$cmdTarget.CommandText = "truncate table test_stage"
$cmdTarget.ExecuteNonQuery()

# The view lives in master on server1 (see the Repro step above)
$conSourceString = "Data Source=server1;Initial Catalog=master;Integrated Security=True"
$conSource = New-Object System.Data.SqlClient.SqlConnection($conSourceString)
$conSource.Open()

# Stream the rows through a data reader; the client never buffers the full result set
$cmdSource = New-Object System.Data.SqlClient.SqlCommand
$cmdSource.CommandText = "select * from my_spt_values"
$cmdSource.Connection = $conSource
$tabSource = $cmdSource.ExecuteReader()
$conBulk.WriteToServer($tabSource)

$conSource.Close()
$conSource.Dispose()
$conTarget.Close()
$conTarget.Dispose()
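If the copy throws partway through, the connections above are never disposed. A more defensive variant (a sketch with the same logic) wraps the copy in try/finally:

try {
    $tabSource = $cmdSource.ExecuteReader()
    $conBulk.WriteToServer($tabSource)
}
finally {
    # Always release the reader, the bulk copy, and both connections, even on failure
    if ($tabSource) { $tabSource.Dispose() }
    $conBulk.Close()
    $conSource.Dispose()
    $conTarget.Dispose()
}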
Readcmd.ps1
Invoke-Sqlcmd -ServerInstance server2 -Database mydb -Query "truncate table test_stage"
Read-SqlViewData -ServerInstance server1 -Database master -SchemaName dbo -ViewName my_spt_values -OutputAs DataTable | Write-SqlTableData -ServerInstance server2 -Database mydb -SchemaName dbo -TableName test_stage -Passthru
Sqlcmd.ps1
Invoke-Sqlcmd -ServerInstance server2 -Database mydb -Query "truncate table test_stage"
Invoke-Sqlcmd -ServerInstance server1 -Database master -Query "select * from my_spt_values" -OutputAs DataTables | Write-SqlTableData -ServerInstance server2 -Database mydb -SchemaName dbo -TableName test_stage -Passthru
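For comparison with the note at the top, the default DataRows behavior would look like this (a hypothetical fourth script, not included in the measurements below):

# Omitting -OutputAs falls back to DataRows, which pipes one row at a time
Invoke-Sqlcmd -ServerInstance server1 -Database master -Query "select * from my_spt_values" | Write-SqlTableData -ServerInstance server2 -Database mydb -SchemaName dbo -TableName test_stage -Passthru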
Measure
Measure-Command -Expression {.\sqlclient.ps1} | select seconds
Measure-Command -Expression {.\readcmd.ps1} | select seconds
Measure-Command -Expression {.\sqlcmd.ps1} | select seconds

Seconds
-------
     12
     36
     45
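All three timings can also be collected in one pass with a small convenience loop (a sketch, assuming the three scripts sit in the current directory):

# Run each script once and report its elapsed seconds
foreach ($script in '.\sqlclient.ps1', '.\readcmd.ps1', '.\sqlcmd.ps1') {
    $t = Measure-Command -Expression { & $script }
    [pscustomobject]@{ Script = $script; Seconds = [int]$t.TotalSeconds }
}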
Reading the Data
While the scripts run, open Task Manager and watch the memory consumption of each one: the first method never loads the entire result set into memory, while the latter two do.
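The same observation can be made without Task Manager by polling the working set of the PowerShell session running the copy (a rough sketch to run from a second console; 1234 is a placeholder for that session's process id):

# Poll the working set of the copy session once a second
$p = Get-Process -Id 1234
while (-not $p.HasExited) {
    '{0:N0} MB' -f ($p.WorkingSet64 / 1MB)
    Start-Sleep -Seconds 1
    $p.Refresh()
}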
On server1, a SQL trace will show the duration of the select for the same number of rows. The first method is the fastest at 12 seconds. The last two methods take longer (18 and 24 seconds respectively) because they have to wait until the rows are loaded into client memory, so they also hold a shared lock on the table longer, which may cause blocking on the source if other processes are writing to it.
Also, sp_WhoIsActive will show a wait type of ASYNC_NETWORK_IO, which means SQL Server is waiting for the client (PowerShell) to consume the rows.
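To see the wait while one of the slower scripts is running (assuming sp_WhoIsActive is installed on server1):

# Look for ASYNC_NETWORK_IO on the session running the select
Invoke-Sqlcmd -ServerInstance server1 -Database master -Query "exec sp_WhoIsActive" | Select-Object session_id, wait_info, sql_text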
Writing the Data
On server2, a SQL trace will show the other benefit of using the .NET bulk copy: the table lock hint, which results in far fewer logical reads. In addition, SqlBulkCopy lets us control the batch size; each batch commits as its own transaction, so the transaction log does not fill up when there are many concurrent transactions.
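For example, lowering the batch size in sqlclient.ps1 makes every 10,000 rows commit on their own (a sketch; the trade-off is more commits in exchange for a smaller active log):

# Commit every 10,000 rows instead of sending all 6.4 million in one batch
$conBulk.BatchSize = 10000
$conBulk.BulkCopyTimeout = 0     # no timeout for a long-running load
$conBulk.NotifyAfter = 1000000   # raise SqlRowsCopied every million rows for progress reporting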
We can see that, to copy a large number of rows, the .NET data provider is much faster because it streams the result set instead of loading it entirely into memory, and it gives more control, such as the table lock option that saves I/O and a configurable batch size that protects the transaction log.