Monday, July 20, 2015

change data capture (CDC) and change tracking (CT)

Track Data Changes (SQL Server) (supported by 2008 and above)

SQL Server 2014 provides two features that track changes to data in a database: change data capture and change tracking. These features enable applications to determine the DML changes (insert, update, and delete operations) that were made to user tables in a database. Change data capture and change tracking can be enabled on the same database;
Using change data capture or change tracking in applications to track changes in a database, instead of developing a custom solution, has the following benefits:

·         There is reduced development time. Because functionality is available in SQL Server 2014
·         Schema changes are not required. You do not have to add columns, add triggers, or create side table in which to track deleted rows or to store change tracking information if columns cannot be added to the user tables
·         There is a built-in cleanup mechanism.
·         Functions are provided to obtain change information.
·         There is low overhead to DML operations. Synchronous change tracking will always have some overhead. However, using change tracking can help minimize the overhead. The overhead will frequently be less than that of using alternative solutions.
·         Change tracking is based on committed transactions. The order of the changes is based on transaction commit time
·         Standard tools are available that you can use to configure and manage. SQL Server 2014 provides standard DDL statements, SQL Server Management Studio, catalog views, and security permissions.

NB : All properties of change data capture and change tracking are same except CDC captures historical data also.

CDC provides historical change information for a user table by capturing both the fact that DML changes were made and the actual data that was changed. Changes are captured by using an asynchronous process that reads the transaction log and has a low impact on the system.

Check if database is already enabled for CDC :
USE master
SELECT [name], database_id, is_cdc_enabled
FROM sys.databases
Enable CDC for database;
USE AdventureWorks
EXEC sys.sp_cdc_enable_db

CT captures the fact that rows in a table were changed, but does not capture the data that was changed. This enables applications to determine the rows that have changed with the latest row data being obtained directly from the user tables. Therefore, change tracking is more limited in the historical questions it can answer compared to change data capture. However, for those applications that do not require the historical information, there is far less storage overhead because of the changed data not being captured. A synchronous tracking mechanism is used to track the changes. This has been designed to have minimal overhead to the DML operations.

Enable Change Tracking at the database level and then at the table level. When enabling Change Tracking at the database level, you can use the CHANGE_RETENTION (default 2 days) clause.





System tables which provide information about Change Tracking. sys.change_tracking_databases shows a row for each database if Change Tracking is enabled for it whereas sys.change_tracking_tables shows a row for each table if it has Change Tracking enabled in the current database only.

SQL Server creates an internal tracking table, if you want to know more about it you can query the sys.internal_tables system table.

SELECT * FROM sys.change_tracking_databases
SELECT * FROM sys.change_tracking_tables
SELECT * FROM sys.internal_tables
WHERE parent_object_id = OBJECT_ID('Employee')

use the CHANGE_TRACKING_CURRENT_VERSION function to get the current version number at the database level, possibly the higher boundary for retained change information. The other function, CHANGE_TRACKING_MIN_VALID_VERSION gives the minimum version after the information for a table change has been retained or lower boundary for a table change information
CHANGETABLE function to retrieve change information after version 0

After performing DML operation on table.


·         SYS_CHANGE_VERSION: It represents the last version number when a particular row was updated.
·         SYS_CHANGE_CREATION_VERSION: It represents the version number when a record was inserted. It will not be over-written as in case of SYS_CHANGE_VERSION.
·         SYS_CHANGE_OPERATION : It represents the DML operations (I=INSERT, U=UPDATE and D=DELETE)
·         SYS_CHANGE_COLUMNS: It represents all the columns impacted since last baseline version. This column will have values only for UPDATE operations and if columns are not impacted during update it will have NULL
·         : It represents the primary key columns of your tracked user table. You can join your tracked user table with this columns to get only changed data from the tracked user table.

You need to have a primary key on a table in order to enable Change Tracking. If you try to drop a primary key constraint after enabling Change Tracking, you will get an error.
  • Primary key update is not treated as update, but rather it will be treated as deletion of old value and insertion of new value in case of Change Tracking.
  • You need to enable Change Tracking on a database level first then you can enable it on a table level.
  • In order to disable Change Tracking on a database level, you need to disable it on all tables, within specified database, if they have Change Tracking enabled.
  • Change Tracking works properly only if a database has a compatibility level 90 or higher.
  • Similar to the Filtered Index, you cannot specify a filter predicate on a table for Change Tracking.
  • Column adds and drops will not be tracked (no DDL), only the updates to the column are tracked.
  • If you perform TRUNCATE on Change Tracking enabled table, no tracking is done for the records deleted, but rather a minimum valid version is updated and your application data will need to be re-initialized.
  • Change Tracking puts extra overhead on DML operations as additional information is also recorded as part of the DML operation. Though it has been designed to have minimal impact compared to other alternatives to track changes.
  • An internal table is created (created in the same file group as the user table) for each Change Tracking enabled table. Also one internal transaction table is created in the database which stores one row for each committed transaction. Hence you will have additional storage requirements for these internal tables.


No comments:

Post a Comment