
Monday, September 30, 2019

Using the .Net Oracle Data Access Client

There is a cmdlet for SQL Server databases called Invoke-Sqlcmd that allows you to run T-SQL commands, but there is no equivalent cmdlet for Oracle. This is where the .Net Framework comes to the rescue.
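
For comparison, a minimal sketch of the SQL Server side (the server name is a placeholder, and the SqlServer module must be installed):

Invoke-Sqlcmd -ServerInstance 'server1' -Query 'select @@version'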

Microsoft no longer provides the Oracle library, so we need to download it from the Oracle website at the link below:

https://www.oracle.com/database/technologies/dotnet-odacmsi-vs2017-downloads.html

Then we can use Add-Type to load the .Net assembly into a PowerShell session and use basic .Net Framework syntax to execute database commands, as shown below.


# Load the managed ODP.NET assembly installed with Oracle Developer Tools
Add-Type -Path 'C:\Program Files (x86)\Oracle Developer Tools for VS2017\odp.net\managed\common\Oracle.ManagedDataAccess.dll'

# Open a connection to the Oracle database
$conString = "User Id=myuser;Password=mypassword;Data Source=mydatabase"
$con = New-Object Oracle.ManagedDataAccess.Client.OracleConnection
$con.ConnectionString = $conString
$con.Open()

# Run a query and stream the results with a data reader
$cmd = New-Object Oracle.ManagedDataAccess.Client.OracleCommand
$cmd.Connection = $con
$cmd.CommandText = 'select banner from v$version'
$rs = $cmd.ExecuteReader()

while ($rs.Read())
{
    Write-Host "$($rs.GetValue(0))"
}

$con.Close()
$con.Dispose()
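
To run DML instead of a query, the same connection can be used with ExecuteNonQuery before it is closed. A minimal sketch with an Oracle bind variable (the table and column names are made up for illustration):

# Run this before $con.Close() above; :status is an Oracle bind variable
$cmdUpd = New-Object Oracle.ManagedDataAccess.Client.OracleCommand
$cmdUpd.Connection = $con
$cmdUpd.CommandText = 'update my_table set status = :status'
[void]$cmdUpd.Parameters.Add((New-Object Oracle.ManagedDataAccess.Client.OracleParameter('status', 'DONE')))
$rowsAffected = $cmdUpd.ExecuteNonQuery()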


Friday, July 26, 2019

Azure DevOps Build Queue Wait Time

Azure DevOps is used to automate CI/CD pipelines. Each agent executes one job at a time, so if there are more jobs to execute than agents, some jobs have to wait in the queue. This post is about how to gather metrics on that wait time to decide when to add more agents.

Agents are organized into pools, and the data we need is available via the REST API. The URI to list the pools is

https://dev.azure.com/{your organization}/_apis/distributedtask/pools

and the URI to list the jobs running on each pool is

https://dev.azure.com/{your organization}/_apis/distributedtask/pools/{pool id}/jobrequests

To authenticate to Azure DevOps, you will need a personal access token; replace it in the PowerShell script below.

# Azure DevOps accepts a personal access token as the password of a Basic auth header, with an empty user name
$token = "{your access token}"
$bytes = [System.Text.Encoding]::UTF8.GetBytes(":$($token)")
$base64bytes = [System.Convert]::ToBase64String($bytes)
$headers = @{ "Authorization" = "Basic $base64bytes" }

$uri = 'https://dev.azure.com/{your organization}/_apis/distributedtask/pools/{pool id}/jobrequests'

$r = Invoke-RestMethod -Uri $uri -Headers $headers -Method Get -ContentType "application/json"
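
To find the {pool id} used in the URI, the pools endpoint can be queried first. A minimal sketch reusing the same $headers:

$poolsUri = 'https://dev.azure.com/{your organization}/_apis/distributedtask/pools'
$pools = Invoke-RestMethod -Uri $poolsUri -Headers $headers -Method Get -ContentType "application/json"
$pools.value | Select-Object id, name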


The jobrequests result in $r is a PSCustomObject with other nested PSCustomObjects, as Get-Member shows, so calculated properties (expressions) are needed to query the nested objects, for example reservedAgent. The time fields such as queueTime are strings, so expressions are also required to convert them to universal time.

$r.value | gm

   TypeName: System.Management.Automation.PSCustomObject

Name                   MemberType   Definition
----                   ----------   ----------
Equals                 Method       bool Equals(System.Object obj)                     
GetHashCode            Method       int GetHashCode()                                  
GetType                Method       type GetType()                                     
ToString               Method       string ToString()                                  
agentDelays            NoteProperty Object[] agentDelays=System.Object[]               
assignTime             NoteProperty string assignTime=2019-07-19T21:40:26.8666667Z     
data                   NoteProperty System.Management.Automation.PSCustomObject
definition             NoteProperty System.Management.Automation.PSCustomObject
demands                NoteProperty Object[] demands=System.Object[]                   
hostId                 NoteProperty string hostId=22399273-0444-4322-aea9-2e628bac6c60 
jobId                  NoteProperty string jobId=12f1170f-54f2-53f3-20dd-22fc7dff55f9  
lockedUntil            NoteProperty string lockedUntil=2019-07-19T22:05:33.4833333Z    
matchesAllAgentsInPool NoteProperty bool matchesAllAgentsInPool=True                   
orchestrationId        NoteProperty string orchestrationId=d476f3d0-c2b4-41d5-9ebb
owner                  NoteProperty System.Management.Automation.PSCustomObject
planId                 NoteProperty string planId=d476f3d0-c2b4-41d5-9ebb-789eb9704ca7 
planType               NoteProperty string planType=Build                              
poolId                 NoteProperty int poolId=9                                       
queueTime              NoteProperty string queueTime=2019-07-19T21:40:26.4633333Z      
receiveTime            NoteProperty string receiveTime=2019-07-19T21:40:29.379468Z     
requestId              NoteProperty int requestId=1024                                 
reservedAgent          NoteProperty System.Management.Automation.PSCustomObject 
scopeId                NoteProperty string scopeId=0020fc51-3569-4cdc-8c2b-2affa5401996
serviceOwner           NoteProperty string serviceOwner=00025394-6065-48ca-87d


The final script parses the required info and uses Where-Object to filter out Pool Maintenance jobs and data older than 3 hours.

$jobs = $r.value |
    Select-Object requestId, poolId, result,
        @{Label='name'; Expression={$_.definition.name}},
        @{Label='agent'; Expression={$_.reservedAgent.name}},
        @{Label='queued'; Expression={(Get-Date $_.queueTime).ToUniversalTime()}},
        @{Label='assigned'; Expression={(Get-Date $_.assignTime).ToUniversalTime()}},
        @{Label='received'; Expression={(Get-Date $_.receiveTime).ToUniversalTime()}},
        @{Label='finished'; Expression={(Get-Date $_.finishTime).ToUniversalTime()}} |
    Where-Object { $_.name -ne 'PoolMaintenance' -and $_.queued -gt (Get-Date).AddHours(-3).ToUniversalTime() } |
    Select-Object requestId, poolId, result, name, agent, queued,
        @{Label='wait'; Expression={$_.assigned - $_.queued}},
        @{Label='duration'; Expression={$_.finished - $_.received}}

$jobs | Format-Table

requestId poolId result    name              agent   queued                wait     duration
--------- ------ ------    ----              -----   ------                ----     --------
     1023      9 succeeded project-for-demo2 AGENT01 7/19/2019 8:00:31 PM  00:33:40 00:00:25
     1022      9 succeeded project-for-demo1 AGENT01 7/19/2019 7:55:04 PM  00:00:00 00:38:03


From the results above, request id 1023 waited 33 minutes for request id 1022 to complete. If a 33-minute wait is not acceptable, a second agent can be added to process the jobs in parallel. This data can also be collected and stored for trend analysis, as sketched below.
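
As a starting point for that trend analysis, the $jobs variable captured by the script above can be aggregated per pool. A minimal sketch (avgWait is my name for the computed average):

$jobs | Group-Object poolId | ForEach-Object {
    $avgTicks = ($_.Group.wait | Measure-Object -Property Ticks -Average).Average
    [pscustomobject]@{
        poolId  = $_.Name
        jobs    = $_.Count
        avgWait = [timespan]::FromTicks([long]$avgTicks)
    }
}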

Friday, June 28, 2019

SqlServerDsc Account

The PowerShell SqlServerDsc module uses the account [NT AUTHORITY\SYSTEM] to log in to SQL Server by default. However, I was puzzled about why it could execute admin commands when it was not in the sysadmin role. It turns out that [NT AUTHORITY\SYSTEM] behaves like a member of the group [NT SERVICE\Winmgmt] and uses that group's permissions.
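
We can check both logins' sysadmin membership directly; a quick sanity check (run it as a sysadmin):

select is_srvrolemember('sysadmin', 'NT AUTHORITY\SYSTEM') as system_is_sysadmin
, is_srvrolemember('sysadmin', 'NT SERVICE\Winmgmt') as winmgmt_is_sysadmin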

For example, to enable CLR on the named instance SERVER01\SQL01, we can use this script

[DSCLocalConfigurationManager()]
Configuration LcmPush 
{    
    Node SERVER01
    {
        Settings 
        {
            AllowModuleOverwrite = $True
            ConfigurationMode = 'ApplyOnly'
            RefreshMode = 'Push'
        }
    }
}

Configuration SqlServerConfig
{
    Import-DscResource -ModuleName PSDesiredStateConfiguration
    Import-DscResource -ModuleName SqlServerDsc

    node SERVER01
    {
        SqlServerConfiguration clr
        {
            ServerName     = 'SERVER01'
            InstanceName   = 'SQL01'
            OptionName     = 'clr enabled'
            OptionValue    = 1
        }
    }
}

$MofPath = 'C:\DSC\LCM'
LcmPush -OutputPath $MofPath
Set-DscLocalConfigurationManager -Path $MofPath -Force 

SqlServerConfig -OutputPath $MofPath
Start-DscConfiguration -ComputerName 'SERVER01' -Path $MofPath -Force -Wait -Verbose 

A SQL trace will show that the account used is [NT AUTHORITY\SYSTEM]



If we remove [NT AUTHORITY\SYSTEM] from the sysadmin server role and disable CLR

ALTER SERVER ROLE [sysadmin] DROP MEMBER [NT AUTHORITY\SYSTEM]
GO
sp_configure 'clr enabled', 0
reconfigure
GO

And execute the script again, it will still succeed

VERBOSE: [SERVER01]: LCM:  [ Start  Set      ]  [[SqlServerConfiguration]clr]
VERBOSE: [SERVER01]:                            [[SqlServerConfiguration]clr] Found PowerShell module SqlServer already imported in the session.
VERBOSE: [SERVER01]:                            [[SqlServerConfiguration]clr] Connected to SQL instance 'SERVER01\SQL01'.
VERBOSE: [SERVER01]:                            [[SqlServerConfiguration]clr] Configuration option 'clr enabled' has been updated to value '1'.
VERBOSE: [SERVER01]:                            [[SqlServerConfiguration]clr] The option was changed without the need to restart the SQL Server instance.
VERBOSE: [SERVER01]: LCM:  [ End    Set      ]  [[SqlServerConfiguration]clr]  in 0.3290 seconds.

If we remove [NT SERVICE\Winmgmt] from the sysadmin server role and disable CLR

ALTER SERVER ROLE [sysadmin] DROP MEMBER [NT SERVICE\Winmgmt]
GO
sp_configure 'clr enabled', 0
reconfigure
GO

And execute the script again, it will error out

Exception calling "Alter" with "0" argument(s): "Alter failed. "
    + CategoryInfo          : NotSpecified: (:) [], CimException
    + FullyQualifiedErrorId : FailedOperationException
    + PSComputerName        : SERVER01
 
The PowerShell DSC resource '[SqlServerConfiguration]clr' with SourceInfo '::22::9::SqlServerConfiguration' threw one or more non-terminating errors while running the Set-TargetResource 
functionality. These errors are logged to the ETW channel called Microsoft-Windows-DSC/Operational. Refer to this channel for more details.
    + CategoryInfo          : InvalidOperation: (:) [], CimException
    + FullyQualifiedErrorId : NonTerminatingErrorFromProvider
    + PSComputerName        : SERVER01
The SendConfigurationApply function did not succeed.
    + CategoryInfo          : NotSpecified: (root/Microsoft/...gurationManager:String) [], CimException
    + FullyQualifiedErrorId : MI RESULT 1

There is very little documentation on what [NT SERVICE\Winmgmt] is for, but the experiment above shows that [NT AUTHORITY\SYSTEM] behaves like a member of the group [NT SERVICE\Winmgmt], even though it is not actually a group.

Friday, May 24, 2019

Copy Data with PowerShell Benchmark

To copy a large number of rows from one SQL Server (server1) to another (server2), I compared the .Net data provider I used back in PowerShell 2.0 with the new SqlServer module's -OutputAs DataTable option. The results are shown below.

Using .Net, we need to write more code, but it is faster. Using -OutputAs DataTable, we can write much less code, but it takes longer since it needs to load the entire table in memory. Note that -OutputAs DataRows is the default, which is even slower since it copies one row at a time.

Repro

On server1, create a view to easily generate 6.4 million rows

create view my_spt_values
as
select c1.* from spt_values c1 cross join spt_values c2

On server2, create a stage table

select * into test_stage from spt_values where 1=2

Create the scripts below to copy from server1 to server2

Sqlclient.ps1

# Connection to the target server, used to truncate the stage table
$conTargetString = "Data Source=server2;Initial Catalog=mydb;Integrated Security=True"

$conTarget = New-Object System.Data.SqlClient.SqlConnection($conTargetString)
$conTarget.Open()

# Bulk copy into the stage table with a table lock and a large batch size
$conBulk = New-Object System.Data.SqlClient.SqlBulkCopy($conTargetString, [System.Data.SqlClient.SqlBulkCopyOptions]::TableLock)
$conBulk.DestinationTableName = 'test_stage'
$conBulk.BatchSize = 10000000

$cmdTarget = New-Object System.Data.SqlClient.SqlCommand
$cmdTarget.Connection = $conTarget

$cmdTarget.CommandText = "truncate table test_stage"
$cmdTarget.ExecuteNonQuery()

# Stream the rows from the source with a data reader; nothing is buffered in client memory
$conSourceString = "Data Source=server1;Initial Catalog=mydb;Integrated Security=True"
$conSource = New-Object System.Data.SqlClient.SqlConnection($conSourceString)
$conSource.Open()

$cmdSource = New-Object System.Data.SqlClient.SqlCommand
$cmdSource.CommandText = "select * from my_spt_values"
$cmdSource.Connection = $conSource

$rdrSource = $cmdSource.ExecuteReader()
$conBulk.WriteToServer($rdrSource)

$conSource.Close()
$conSource.Dispose()

$conTarget.Close()
$conTarget.Dispose()

Readcmd.ps1

Invoke-Sqlcmd -ServerInstance server2 -Database mydb -Query "truncate table test_stage"

Read-SqlViewData -ServerInstance server1 -Database master -SchemaName dbo -ViewName my_spt_values -OutputAs DataTable |
Write-SqlTableData -ServerInstance server2 -Database mydb -SchemaName dbo -TableName test_stage -Passthru 

Sqlcmd.ps1

Invoke-Sqlcmd -ServerInstance server2 -Database mydb -Query "truncate table test_stage"

Invoke-Sqlcmd -ServerInstance server1 -Database master -Query "select * from my_spt_values" -OutputAs DataTables |
Write-SqlTableData -ServerInstance server2 -Database mydb -SchemaName dbo -TableName test_stage -Passthru 

Measure

Measure-Command -Expression {.\sqlclient.ps1} | select seconds
Measure-Command -Expression {.\readcmd.ps1} | select seconds
Measure-Command -Expression {.\sqlcmd.ps1} | select seconds


Seconds
-------
     12
     36
     45

Reading the Data

While measuring, open Task Manager to see the memory consumption of each of the scripts; we will see that the first method does not load the entire result set into memory, while the latter two methods do.


On server1, a SQL trace will show the duration to select the same number of rows. The first method is the fastest at 12 seconds. The last two methods take longer (18 and 24 seconds respectively) since they have to wait until the rows are loaded into memory, so they also hold a shared lock on the table longer, which may cause blocking on the source if other processes are writing to it.



Also, sp_WhoIsActive will show the wait type ASYNC_NETWORK_IO, which means SQL Server is waiting for the client (PowerShell) to process the rows.


Writing the Data

On server2, a SQL trace will show the other benefit of using the .Net bulk copy: the table lock hint makes the logical reads much lower. In addition, bulk copy lets us control the batch size so we do not fill up the transaction log when there are many concurrent transactions.


We can see that to copy a large number of rows, the .Net data provider is much faster because it does not load the entire result set into memory, and it allows more control with hints such as the table lock that saves IO.

Friday, April 5, 2019

Analyzing SQL Error Logs with Kusto

We can leverage Azure Kusto to analyze SQL error logs. It is easy to learn if you know SQL querying; in fact, Microsoft publishes a reference on how SQL translates to Kusto. My two favorite operators for reducing a large number of errors to a small number of patterns are autocluster and reduce.

Autocluster

It finds a common pattern across the given columns. In the example below, 60% of the column values in host, source_name, and message come from replication errors on a particular host.
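
A minimal sketch of such a query, assuming a hypothetical SqlErrorLog table with those three columns:

SqlErrorLog
| project host, source_name, message
| evaluate autocluster()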



Reduce

It finds common patterns in the message and replaces the variable parts with "*" so they are easy to aggregate. In the example below, we can see the frequency of the most common errors.
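
A minimal sketch, again assuming the hypothetical SqlErrorLog table:

SqlErrorLog
| reduce by message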


Friday, March 15, 2019

Data Points Outside Time Range

I was using the cool Microsoft Azure plugin for Grafana to easily develop a dashboard for SQL error logs in Azure Kusto, and noticed the message "Data Points Outside Time Range" even though my data points were in range. After trial and error, I found that if I add "asc" to the "order by", the message disappears.

Example of the original Kusto query where the message "Data Points Outside Time Range" appears



By adding "asc" to the "order by", the message disappears.
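
A minimal sketch of the corrected query, with made-up table and column names:

SqlErrorLog
| summarize count() by bin(timestamp, 1h)
| order by timestamp asc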


The reason is quite silly. I was learning Kusto from a TSQL background. In Kusto the "order by" defaults to descending order while in TSQL it defaults to ascending order.

Tuesday, February 5, 2019

Monitoring Identity Column Values

Identity columns are often used for generating key values. Once created, they are often forgotten. However, once the max value allowed by the data type is reached, no more rows can be inserted.

To avoid an unexpected outage, we can use sys.identity_columns to monitor whether the values are approaching the max limit. The 6 data types supported are: tinyint, smallint, int, bigint, numeric, and decimal.

For example, we can create 3 tables with identity columns, assign a seed value, and insert some data

create table paul_test1 (a tinyint identity, b char(1))
create table paul_test2 (a int identity, b char(1))
create table paul_test3 (a numeric(6,0) identity, b char(1))

dbcc checkident('paul_test1', reseed, 200)
dbcc checkident('paul_test2', reseed, 2000000000)
dbcc checkident('paul_test3', reseed, 999999)

insert paul_test1 (b) values ('x')
insert paul_test2 (b) values ('x')
insert paul_test3 (b) values ('x')

If we insert one more row on the third table

insert paul_test3 (b) values ('x')

We will get this error:

Msg 8115, Level 16, State 1, Line 62 Arithmetic overflow error converting IDENTITY to data type numeric. Arithmetic overflow occurred.

We can query sys.identity_columns to find the last value of a table's identity column and calculate the percent used of the positive range. Then we can run the query on each database using sp_MSforeachdb, as shown in the script below (e.g. C:\TEMP\identity_values.sql)

create table #monitor_identity(
instance_name varchar(255)
,database_name varchar(255)
,schema_name varchar(255)
,table_name varchar(255)
,column_name varchar(255)
,current_value numeric(38)
,max_value numeric(38)
,pct_used numeric(38,2)
)
insert #monitor_identity
exec sp_MSforeachdb '
select 
@@servername
,''?''
,object_schema_name(object_id, db_id(''?''))
,object_name(object_id, db_id(''?''))
, name
, cast(last_value as numeric(38))
, max_value = case TYPE_NAME(system_type_id)
when ''tinyint'' then 255
when ''smallint'' then 32767
when ''int'' then 2147483647
when ''bigint'' then 9223372036854775807
when ''numeric'' then cast(replicate(''9'',precision) as numeric(38))
when ''decimal'' then cast(replicate(''9'',precision) as numeric(38))
end
, pct_used = cast(last_value as numeric(38)) / case TYPE_NAME(system_type_id)
when ''tinyint'' then 255
when ''smallint'' then 32767
when ''int'' then 2147483647
when ''bigint'' then 9223372036854775807
when ''numeric'' then cast(replicate(''9'',precision) as numeric(38))
when ''decimal'' then cast(replicate(''9'',precision) as numeric(38))
end * 100
from ?.sys.identity_columns with (nolock)
where last_value is not null
'
select
instance_name  
,database_name 
,schema_name 
,table_name 
,column_name 
,current_value 
,max_value 
,pct_used 
from #monitor_identity
where pct_used > 70
order by 
database_name 
,schema_name 
,table_name 
drop table #monitor_identity

Output


To run the script against multiple instances, we can keep a list of instances in a text file (e.g. C:\TEMP\instance_list.txt)

instance1
instance2

Then we can easily use PowerShell to pipe the instance list content to Invoke-Sqlcmd, and pipe the results to Out-GridView

Get-Content C:\TEMP\instance_list.txt | Invoke-Sqlcmd -InputFile C:\TEMP\identity_values.sql | Out-GridView
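
A variation for trending over time: write the results to a CSV file instead of viewing them interactively (the file path is just an example):

Get-Content C:\TEMP\instance_list.txt |
    Invoke-Sqlcmd -InputFile C:\TEMP\identity_values.sql |
    Export-Csv -Path C:\TEMP\identity_values.csv -NoTypeInformation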