Auditing and Logging - make use of SSIS in SQL Server
For a mintue I was thinking why I should post in this section, the variability is as per the law applicable to Financial sector to make sure the financial statement that a company reports is legitimate and that there’s no room for someone to manipulate it. As you can imagine, preventing the manipulation of the financial statement has a lot of IT impact. Some interpret the law to mean that you must audit who loaded each row into a database and when it arrived.
So how can SQL Server features can help to accomplish the task of audit of events on a database where it is managed?
To talk in the technical terms, SQL Server Integration Services (SSIS) can help to get the required information during the ETL activities. The actual task for a company where auditing is important you can capture that type of lineage in SSIS easily using the Audit transformation. The Audit transformation adds extra columns to the data flow that contains lineage data about the package, task or system.
To add this lineage to your SSIS package, drag the Audit transform from the toolbox onto your data flow and drop it on the data flow design surface where you wish to begin auditing. How you can plan the transformation to capture is frequently to be used to capture lineage data of records that SSIS had to manipulate or that it couldn’t load. For example, for rows containing errors that couldn’t be corrected automatically, you may wish to add auditing data to the row before inserting it into an error queue as shown below. The Data Flow aspect of a package is a also important having the list of potential values specified below to your package:
|
Value |
Description |
|
Execution Instance ID |
The unique identifier (GUID) that contains the execution instance of the package. |
|
Package ID |
The GUID that contains the unique ID for the package. |
|
Version ID |
The GUID that contains the unique ID for the version of the package. |
|
Package Name |
The name of the package. |
|
Execution Start Time |
The time when the package began to execute. |
|
User Name |
The name of the user that executed the package. |
|
Task Name |
The name of the Data Flow task that contains the Audit transformation. |
|
Task ID |
The GUID for the Data Flow task that contains the Audit transformation. |
|
Machine Name |
The name of the machine where the package was executed. |
What is the configuration you need to set to error out from a transformation or connection?
SSIS has got the ability to configure how to handle errors at a row-level is one of the most significant enhancements in SSIS over DTS. In DTS, your entire package was more likely to fail if a single record contained an error (like a foreign key violation). Additionally, in DTS, you may spend hours debugging the issue and then have to rerun a five hour package. That is the reason logging the activity of a package helps to achieve where exactly it failed and error outputs easily solve this problem by allowing you to specify how you’d like SSIS to handle the error. You can choose to ignore the error, fail on the error or redirect it to a different path in the pipeline. The advantage to redirecting the row is you can attempt to cleanse the row, and, if that doesn’t work, direct it to an error queue to be manually corrected the next morning.
To configure a transform or connection to output records that have failed to a different path, you can click on Configure Error Output in most transforms or click the Error Output page in a Source or Destination. Once you have the dialog box open (shown below), specify how you’d like to handle the error. There are two different classes of errors: truncation errors and transform or connection specific errors. A truncation error occurs when you try to move more data into a column than will fit. A transform or connection specific error will vary based on the transform or connection you’re using. A source and destination might have conversion errors, while the transform might have errors specific to the transform. For example, a Lookup transform will have an error if the record could not be found in the lookup operation.
Taking care of such minor aspects of a package can get you great deliverable of logging for your ETL activities to achieve the mammoth task of audit.