Load Data using Change Data Capture (CDC)...

What is Change Data Capture (CDC)

Change Data Capture is a feature of the Database Engine that is available only in the SQL Server 2008 Enterprise, Developer, and Evaluation editions. CDC captures insert, update, and delete operations applied to SQL Server tables and makes the details of the changes available in an easily consumed, relational format. CDC uses its own change tables to store the tracked data; these tables contain the same columns as the tracked source table, plus some additional columns that store important metadata.

CDC with SQL Server Integration Services (SSIS)

There may be a situation where we need to store the data loaded by an ETL package in another table as well; that is, we maintain two tables that keep the same data. In such a situation we can use CDC: we load data into one table where CDC is enabled, and then execute a stored procedure that captures the changed data through CDC and loads it into the other table.

Enabling CDC

Before enabling CDC for a table, we need to enable CDC for the database. To do that, we can execute the query below:

USE MyDatabase
Exec sys.sp_cdc_enable_db

Then, using the query below, we can verify whether CDC is enabled for MyDatabase:

SELECT is_cdc_enabled FROM sys.databases WHERE name = 'MyDatabase'

If CDC is enabled for the database, the query returns 1; otherwise it returns 0.

After that we have to enable CDC for each table. We can use the query below to enable CDC for Table1:


USE MyDatabase
GO
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'Table1',
@role_name = NULL
GO

When we enable CDC for a table, a change table named cdc.dbo_Table1_CT is created under the System Tables of the database. Two table-valued functions are also created: fn_cdc_get_all_changes_dbo_Table1 and fn_cdc_get_net_changes_dbo_Table1.
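As a quick sanity check after enabling CDC, we can query the generated all-changes function directly. The sketch below assumes the capture instance is named dbo_Table1 (the default for a table enabled as above):

```sql
DECLARE @from_lsn binary(10), @to_lsn binary(10);

-- Full LSN range available for the capture instance
SELECT @from_lsn = sys.fn_cdc_get_min_lsn('dbo_Table1');
SELECT @to_lsn   = sys.fn_cdc_get_max_lsn();

-- __$operation: 1 = delete, 2 = insert, 4 = update
-- (with the 'all' row filter, updates are returned with the new values only)
SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_Table1(@from_lsn, @to_lsn, 'all');
```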


Function to Retrieve Change Data

We can write a SQL function, as given below, to retrieve the change data for a particular period for a particular table.


CREATE FUNCTION cdc.uf_ChangeData_Table1 (
    @start_time datetime
)
RETURNS @Table1 TABLE (
    Column1 smallint
    ,Column2 smallint
    ,Column3 money
    ,CDC_OPERATION varchar(1)
)
AS
BEGIN
    DECLARE @from_lsn binary(10), @to_lsn binary(10)

    -- Map the given start time to the first LSN after it
    SELECT @from_lsn = sys.fn_cdc_increment_lsn(sys.fn_cdc_map_time_to_lsn('largest less than or equal', @start_time))
    SELECT @to_lsn = sys.fn_cdc_get_max_lsn()

    INSERT INTO @Table1
    SELECT
        Column1,
        Column2,
        Column3,
        CASE __$operation
            WHEN 1 THEN 'D'
            WHEN 2 THEN 'I'
            WHEN 4 THEN 'U'
            ELSE NULL
        END AS CDC_OPERATION
    FROM cdc.fn_cdc_get_net_changes_dbo_Table1(@from_lsn, @to_lsn, 'all')

    RETURN
END
GO


This function takes only the start time and returns all the change data after that point.
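The function can then be called like any other table-valued function; the start time below is just an illustrative value:

```sql
SELECT Column1, Column2, Column3, CDC_OPERATION
FROM cdc.uf_ChangeData_Table1('2010-01-01 00:00:00');
```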



Stored Procedure to Execute the Function and Load the Data to the Table

A stored procedure is written to execute the earlier function, get the change data for a particular period, and load that data into the destination table. The stored procedure is given below; it takes the start time and the required table name as input parameters.

CREATE PROCEDURE dbo.LoadChangedData (
    @start_time DATETIME
    ,@table VARCHAR(50)
)
AS
BEGIN
    IF (@table = 'Table1')
    BEGIN
        MERGE INTO dbo.Table2 AS dest
        USING (SELECT Column1,
                      Column2,
                      Column3,
                      CDC_OPERATION
               FROM cdc.uf_ChangeData_Table1(@start_time)) AS src
        ON (src.Column1 = dest.Column1 AND src.Column2 = dest.Column2)
        WHEN MATCHED AND (src.CDC_OPERATION = 'U')
            THEN UPDATE SET dest.Column3 = src.Column3
        WHEN MATCHED AND (src.CDC_OPERATION = 'D')
            THEN DELETE
        WHEN NOT MATCHED AND (src.CDC_OPERATION = 'I' OR src.CDC_OPERATION = 'U')
            THEN INSERT (Column1, Column2, Column3)
                 VALUES (src.Column1, src.Column2, src.Column3);
    END
END


This stored procedure performs the insert, update, or delete operation on the destination table according to the operation type.


Combining With the ETL

We can add an Execute SQL Task to the Control Flow, after the Data Flow Task that contains the OLE DB Destination component, and execute the LoadChangedData stored procedure by providing the two required parameters. The start time should be captured before the values are inserted into the first table, that is, before the Data Flow Task runs.
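The SQLStatement of that Execute SQL Task can be a simple parameterized call, with the ? placeholders mapped to SSIS variables holding the start time and the table name:

```sql
EXEC dbo.LoadChangedData @start_time = ?, @table = ?;
```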

This procedure will load the changed data into the other table.


How to Combine Two Fact Tables in SSAS...

In a SQL Server Analysis Services cube, there may be a situation where we need to combine two related fact tables.
This kind of situation may occur when the same fact has different dimensions. For example, if we consider two dimensions, Branch and Region, in some areas sales may happen only at the Region level, and sometimes they may go down to the Branch level.

In such a scenario, the data warehouse needs two different tables for those two types of sales, such as dbo.FactSalesRegion and dbo.FactSalesBranch.
But for analysis purposes we should have all the sales as one, so we need to create one fact by combining those two fact tables in the data warehouse.

We can do it in the cube by creating a new fact table using a Named Query.
Open the Data Source View by double-clicking it in the Solution Explorer, right-click on the design area, and select New Named Query.
Then provide a name for the combined fact table, such as FactSales, and enter the SQL query to get the union of the dbo.FactSalesRegion and dbo.FactSalesBranch tables.

The query will look like the one below.

SELECT RegionKey, NULL AS BranchKey, Sales
FROM dbo.FactSalesRegion
UNION ALL
SELECT NULL AS RegionKey, BranchKey, Sales
FROM dbo.FactSalesBranch

In the above query, we add the columns that do not exist in each table as NULL and union the tables, since the two tables being unioned must have the same columns. Note that UNION ALL should be used rather than UNION here, because UNION removes duplicate rows and could silently drop valid fact rows.
The query provides a new fact table, FactSales, which holds the total sales.
In some complex situations this approach helps when designing the data warehouse.

There are a few modifications to be done to the cube before loading data; otherwise it gives errors while processing, due to those NULL values.
We need to modify some properties in those dimensions (in this example, the Region and Branch dimensions) and also change some settings in the Dimension Usage.

Double-click one of the above dimensions (say DimRegion) in the Analysis Services project. Then select the key (RegionKey) from Attributes and go to Properties. Expand the KeyColumns property and change the NullProcessing property to UnknownMember. After that you will see the key is underlined in red. To solve that issue, select DimRegion from Attributes and go to Properties; near the bottom of the property list you will find a property called UnknownMember. Change its value to Visible.

Then you should change some settings in the Dimension Usage area of the cube. Double-click the cube in the Solution Explorer and click the Dimension Usage tab. There you can see the facts and the dimensions used by those facts.
Select the RegionKey and click the button on the right side to get the Define Relationship window. Click the Advanced button and you will get the Measure Group Bindings window.
Under Relationship, change the Null Processing of the RegionKey to UnknownMember. This needs to be done for all the facts that use the dimension.

After that you can process the cube and load data.
Hope this will help you as well...

Some KPI Values and Indicators are not shown in a PerformancePoint Scorecard...

While creating some scorecards for a BI application, I faced a strange problem: in the scorecard, values and indicators were not shown for some KPIs, while others were fine.
The approach I used to create the scorecard was to import SQL Server Analysis Services KPIs and then add some KPIs from PerformancePoint. But the values and indicators for the KPIs created in PerformancePoint were not shown, as in the figure below. PPSKPI is the KPI created in PerformancePoint.



I checked the procedure used to create that scorecard and its KPIs and found it was fine. Finally I was able to find the reason for the issue, and it is a very simple one:
the column names of the two KPI types are different.

For the KPIs imported from Analysis Services, the column names are Value, Goal and Status. For the KPIs created in PerformancePoint, the column names are Actual and Target.
When those column names were changed to Value, Goal and Status, the problem was solved...

Setting Up BI Server for Application Developments

Before starting development of a Business Intelligence (BI) application, we need a proper environment. For that we need a server running a Windows Server version. For this post I'm using a server with 64-bit Windows Server 2008 R2 installed.

The server should have:
  • .Net Framework 3.5
  • Internet Information Services (IIS)
  • Microsoft SQL Server 2008
  • Microsoft Office SharePoint Server 2007
  • PerformancePoint Server 2007
First, we should configure the server to install the .NET Framework 3.5 and Internet Information Services (IIS).
To install IIS, go to Server Manager and select Add Roles, as shown in the figure below.



Then you will get the Add Roles Wizard as shown below.



Click Next and you will get the window to select server roles.
Tick the check box for Web Server (IIS) to install IIS.
Then click Install to install the selected roles.

After that, install Microsoft SQL Server 2008 and then SQL Server 2008 Service Pack 1.
Then install Microsoft Office SharePoint Server 2007; select the Basic installation and, after the installation completes, configure it.

Before installing PerformancePoint Server, we need to install AJAX and some patches for SQL Server.
Those patches are listed below (for a 64-bit computer).
  • sqlncli_x64
  • SQLServer2005_ADOMD_x64
  • SQLServer2005_ASOLEDB9_x64
  • SQLServer2005_XMO_x64
  • SQLSERVER2008_ASOLEDB10
You can get those from the Feature Pack for Microsoft SQL Server 2005 - December 2008 release, at the http://www.microsoft.com/downloads/details.aspx?FamilyID=536fd7d5-013f-49bc-9fc7-77dede4bb075&displaylang=en link.

Finally install the PerformancePoint Server 2007.

The server is now ready for development of BI applications...

Saving Changes is not permitted for a SQL Table...

When you try to save a change to a table in a SQL Server database, sometimes you will get an error message saying "Saving changes is not permitted. The changes you have made require the following tables to be dropped and re-created..." as shown in the figure below.

Due to that you will not be able to change database tables.

You can solve that issue with just one click. Go to Tools -> Options in SQL Server Management Studio and you will get a window as shown below. Select Designers, and under that you will find that the option "Prevent saving changes that require table re-creation" is selected, as shown in the figures below.

Just clear the check box and try to save your changes. You should be able to save them unless there are other issues....
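Alternatively, instead of turning this safeguard off, the same kind of change can often be made with a T-SQL ALTER TABLE statement, which does not require the table to be dropped and re-created (the table and column names below are only illustrative):

```sql
-- Change a column's data type without using the table designer
ALTER TABLE dbo.Table1
ALTER COLUMN Column3 DECIMAL(18, 2) NULL;
```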


How to run UPDATE/INSERT/DELETE Statements on Azure SQL Database in Microsoft Fabric Notebook...

You can run UPDATE/INSERT/DELETE Statements on Azure SQL Database in Microsoft Fabric Notebook using Python SQL Driver - pyodbc.  For the Fa...