This post explains:
how to quickly estimate CT growth,
how the cleanup job actually deletes rows,
and the three main tuning knobs to keep CDC stable.
Quick Health Check: Approximate CT Row Counts
To get a fast row-count estimate of each CDC Change Table:
SELECT c.object_id, t.name, p.rows
FROM cdc.change_tables c
JOIN sys.tables t ON c.object_id = t.object_id
JOIN sys.partitions p ON t.object_id = p.object_id
WHERE p.index_id IN (0, 1)
ORDER BY c.object_id;

Example output:

object_id    name                 rows
-----------  -------------------  ---------
82099333     dbo_customer_CT      167702
98099390     dbo_district_CT      270010
114099447    dbo_item_CT          0
370100359    dbo_new_order_CT     134634
386100416    dbo_warehouse_CT     135522
562101043    dbo_order_line_CT    2018297
578101100    dbo_stock_CT         1343322
1973582069   dbo_orders_CT        191732
-----------  -------------------  ---------
Total                             4,261,219

CDC Cleanup Defaults (Retention + Threshold)
CDC cleanup behavior is driven mainly by:
- Retention (how long CT rows are kept)
- Threshold (rows deleted per batch)
Check current settings:
EXEC sys.sp_cdc_help_jobs;

Example output:

job_type  job_name           retention  threshold
--------  -----------------  ---------  ---------
capture   cdc.tpcc_capture   0          0
cleanup   cdc.tpcc_cleanup   4320       5000

Defaults:
- Retention = 4320 minutes (72 hours / 3 days)
- Threshold = 5000 rows per delete batch
What the Cleanup Job Actually Does
An Extended Events trace typically reveals that the cleanup job processes CTs sequentially via a cursor:
DECLARE #hchange_table CURSOR LOCAL FAST_FORWARD FOR
SELECT capture_instance, start_lsn
FROM [cdc].[change_tables]
WHERE (@capture_instance IS NULL)
   OR (capture_instance = @capture_instance);

Because cdc.change_tables is clustered by object_id, cleanup tends to delete from CTs in object_id order. With @p1 = 5000 (the threshold), you'll often see patterns like:

DELETE TOP (@p1) FROM [cdc].[dbo_customer_CT] WHERE __$start_lsn < @p2  -- 35 times
DELETE TOP (@p1) FROM [cdc].[dbo_district_CT] WHERE __$start_lsn < @p2  -- 56 times
...
DELETE TOP (@p1) FROM [cdc].[dbo_stock_CT]    WHERE __$start_lsn < @p2  -- 323 times

Large tables can dominate runtime and prevent cleanup from ever catching up.
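As a rough planning aid, the health-check query above can be extended to estimate how many delete batches cleanup will need per CT at a given threshold (a sketch only; substitute your configured threshold for the 5000 literal, and expect the real batch counts to drift upward as new changes accumulate during the run):

```sql
-- Estimate how many DELETE TOP (@threshold) iterations cleanup
-- needs per change table, assuming the default threshold of 5000.
SELECT t.name,
       p.rows,
       CEILING(p.rows / 5000.0) AS est_delete_batches
FROM cdc.change_tables c
JOIN sys.tables t ON c.object_id = t.object_id
JOIN sys.partitions p ON t.object_id = p.object_id
WHERE p.index_id IN (0, 1)
ORDER BY p.rows DESC;
```

Sorting by row count puts the CTs that dominate cleanup runtime at the top of the result.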
The Three Tuning Knobs
1) Adjust the Threshold
Higher threshold = deletes more per batch (often more efficient), but can increase contention.
Lower threshold = smaller deletes, but more loops and potentially longer runtime.
Example:
EXEC sys.sp_cdc_change_job
    @job_type = N'cleanup',
    @threshold = 2000;

2) Reduce Retention
Reducing retention means less data to keep, making cleanup easier. But consumers must be able to ingest changes within the retention window.
Example (2160 minutes = 36 hours):
EXEC sys.sp_cdc_change_job
    @job_type = N'cleanup',
    @retention = 2160;

3) Run Cleanup More Frequently
First, find the schedule ID:
USE msdb;
GO
SELECT j.name AS job_name, s.schedule_id, s.name AS schedule_name
FROM dbo.sysjobs j
JOIN dbo.sysjobschedules js ON j.job_id = js.job_id
JOIN dbo.sysschedules s ON js.schedule_id = s.schedule_id
WHERE j.name = N'cdc.tpcc_cleanup';

Then update it (e.g., every 15 minutes):
USE msdb;
GO
EXEC dbo.sp_update_schedule
    @schedule_id = 171,           -- from the query above
    @enabled = 1,
    @freq_type = 4,               -- daily
    @freq_interval = 1,
    @freq_subday_type = 4,        -- minutes
    @freq_subday_interval = 15,
    @active_start_time = 000000;  -- midnight

Last Resort: Truncating CT Tables (High Risk)
If cleanup still can’t keep up, truncating the CT tables may be the fastest recovery path, but it irreversibly deletes change history.
Safe sequence:
- Stop the capture job
- Ensure consumers ingest remaining changes
- Truncate CT tables
- Restart capture job
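The sequence above might look like the following sketch. It is illustrative only: dbo_customer_CT stands in for your own capture-instance tables, and step 2 must be verified out of band before any TRUNCATE runs.

```sql
-- 1) Stop the capture job so no new rows land in the CT tables.
EXEC sys.sp_cdc_stop_job @job_type = N'capture';

-- 2) (Out of band) confirm every downstream consumer has ingested
--    all remaining changes before proceeding.

-- 3) Truncate each change table (repeat per capture instance).
TRUNCATE TABLE cdc.dbo_customer_CT;

-- 4) Restart the capture job.
EXEC sys.sp_cdc_start_job @job_type = N'capture';
```

Stopping the capture job first matters: it prevents new change rows from arriving mid-truncate and gives consumers a stable cutoff point to reconcile against.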
Warnings:
- You may create data gaps for downstream consumers.
- Consumers may need a full reload/re-baseline after truncation.
- Do this only with stakeholder approval and a clear recovery plan.