April 17, 2014

SQL SERVER: SSIS – Merge Join Transformation

Let’s look at how the Merge Join Transformation works in SSIS.

The benefit of Merge Join is that its two inputs can come from any combination of sources (Excel file, XML file, OLEDB table, flat file). The output can be the result of an INNER, LEFT OUTER, or FULL OUTER join on the two datasets.

Merge Join Transformation has two inputs and one output. It does not support an error output.

Use of Merge Join Transformation:

Merge Join is a two-step process. The first step is to sort both input datasets (tables) on the join key, in the same order. The second step is to apply the merge join on the common key, which matches rows from the two sorted inputs.

To understand the Merge Join Transformation better, let’s walk through an example and its configuration parameters in SSIS.

1. Create sample tables:

Now we will create the input tables “Department” and “Employee” in the Test database.

-- Master table: one row per department.
CREATE TABLE Department
(
    Dept_No INT
    ,Dept_Name VARCHAR(50)
    ,Location VARCHAR(50)
    ,CONSTRAINT PK_DEPT PRIMARY KEY (Dept_No)
)

INSERT INTO Department VALUES (10, 'ACCOUNTING', 'Mumbai')
INSERT INTO Department VALUES (20, 'RESEARCH',   'Delhi')
INSERT INTO Department VALUES (30, 'SALES',      'Mexico')
INSERT INTO Department VALUES (40, 'OPERATIONS', 'Sydney')
GO

-- Detail table: each employee references a department through Dept_No.
CREATE TABLE Employee
(
    Emp_No INT NOT NULL
    ,Emp_Name VARCHAR(100)
    ,Designation VARCHAR(50)
    ,Manager INT
    ,JoinDate DATE DEFAULT GETDATE()
    ,Salary INT
    ,Dept_No INT
    ,CONSTRAINT PK_Employee PRIMARY KEY (Emp_No)
    ,CONSTRAINT FK_Dept_No FOREIGN KEY (Dept_No) REFERENCES Department(Dept_No)
)

INSERT INTO Employee
    (Emp_No, Emp_Name, Designation, Manager, Salary, Dept_No)
VALUES
    (101, 'Tejas', 'MANAGER', 104, 4000, 20)
    ,(102, 'Michel', 'ANALYST', 101, 1600, 30)
    ,(103, 'Mark', 'DEVELOPER', 102, 1250, 30)
    ,(104, 'James', 'DIRECTOR', 106, 2975, 10)
    ,(105, 'Raj', 'ANALYST', 7566, 3000, 20)
    ,(106, 'TechnoBrains', 'PRESIDENT', NULL, 5000, 40)
GO

2. Create Data Source Connection:

Drag a “Data Flow Task” from “Control Flow Items” onto the designer surface. Then double-click it and create a new OLEDB connection.

3. Select Input Data Sources:


Add the two data sources you want to merge join and name them “OLE_SRC_Employee” and “OLE_SRC_Department”. Create a new “OLEDB Connection” to map them to the source datasets.


SQL Yoga - Merge Join Transformation #1


4. OLEDB Source Editor:


Now double-click each “OLEDB Source” to open the “OLEDB Source Editor”. There, select the source table and review the column mappings on the “Columns” tab.


5. Data Sorting:


The Merge Join Transformation requires sorted input, so we will add Sort transformations to the flow. If you know the data is already sorted, you can instead set the “IsSorted” property to “True” on the source output in the “Advanced Editor” of the respective OLEDB Source, and set “SortKeyPosition” on the key column of that output. Otherwise, use the Sort Transformation from the “Data Flow” transformations.
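As a rough sketch of the pre-sorted alternative, assuming the Employee and Department tables from step 1 and that each OLEDB Source uses the “SQL command” data access mode, the two source queries could order their rows on the join key as below. The ORDER BY only sorts the data; SSIS still has to be told that the output is sorted through the “IsSorted” property described above.

```sql
-- Source query for OLE_SRC_Employee: rows arrive ordered by the join key.
SELECT Emp_No, Emp_Name, Designation, Manager, JoinDate, Salary, Dept_No
FROM Employee
ORDER BY Dept_No;

-- Source query for OLE_SRC_Department: same key, same (ascending) order.
SELECT Dept_No, Dept_Name, Location
FROM Department
ORDER BY Dept_No;
```

Each query goes into its own source component. Sorting in the database this way avoids the two Sort transformations, which otherwise have to read all rows before passing any on.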

Now add two Sort components. Connect the green arrow (data path) from “Employee” to one Sort transformation and the one from “Department” to the other.

SQL Yoga - Merge Join Transformation #2

6. Sort Transformation Editor Source 1:

To sort the data, double-click the “Sort Transformation” connected to the “Employee” dataset and select the key column(s) to sort on, so that the rows are re-ordered by those keys. If you sort on multiple keys, also set the sort type and the sort order for each key.

SQL Yoga - Merge Join Transformation #3

7. Sort Transformation Editor Source 2:

Now that the “Employee” data is sorted, configure the Sort transformation for Source 2, “Department”, in the same way.

Double-click the “Sort Transformation” connected to the “Department” dataset and set the sort key and the sort order in the “Sort Type” property of the editor. Keep in mind that both sources must be sorted on the join key with the same sort type, i.e. both ascending or both descending.


SQL Yoga - Merge Join Transformation #4


8. Merge Join Task Component:


Now add the Merge Join Transformation so that we can join the two sources. Drag the data path from the Employee sort to the Merge Join. In the “Input Output Selection” popup, select “Sort Output” as the Output and “Merge Join Left Input” as the Input. The Input offers two options:

  1. Merge Join Left Input
  2. Merge Join Right Input

These two options specify whether the input is treated as the left or the right side of the join.


SQL Yoga - Merge Join Transformation #5


Now drag the data path from the other “Sort Transformation” and connect it to the “Merge Join Transformation” as the second input. This time SSIS does not ask for the input type: since you already chose it for the first path, the second path is assigned the remaining input (Left or Right) automatically.


SQL Yoga - Merge Join Transformation #6


9. Merge Join Transformation Editor:


To configure the merge join, double-click the “Merge Join Transformation” to open the editor. Select the Join Type to specify which join operation to perform on the two datasets.

The available join types are:

  1. Inner Join
  2. Left Outer Join
  3. Full Outer Join

Here we select “Inner Join” as the Join Type, because we only want rows that have a match in both datasets. Select “Dept_No” as the join key, since it is the common field on which the two datasets can be merged.
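For comparison, the join configured here corresponds roughly to the following T-SQL, assuming the Employee and Department tables from step 1. This is only a sketch for checking expectations, not what SSIS executes internally. With the sample data every employee has a matching department, so it returns all six employees.

```sql
-- T-SQL counterpart of the Merge Join with Employee as the left input,
-- Department as the right input, and Dept_No as the join key.
-- For the other join types, swap INNER JOIN for LEFT OUTER JOIN or FULL OUTER JOIN.
SELECT  e.Emp_No, e.Emp_Name, e.Designation, e.Manager, e.JoinDate, e.Salary,
        e.Dept_No, d.Dept_Name, d.Location
FROM    Employee AS e
INNER JOIN Department AS d
        ON d.Dept_No = e.Dept_No
ORDER BY e.Dept_No;
```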


SQL Yoga - Merge Join Transformation #7


10. Result table creation:


We need to create a table in the Test database to store the output, using the script below.

CREATE TABLE [Merge_Join_Output]
(
    [Emp_No] INT,
    [Emp_Name] VARCHAR(100),
    [Designation] VARCHAR(50),
    [Manager] INT,
    [JoinDate] DATE,
    [Salary] INT,
    [Dept_No] INT,
    [Dept_Name] VARCHAR(50),
    [Work_Location] VARCHAR(50)
)
GO

11. OLEDB Destination:

Select the “OLEDB Destination Editor” to redirect your output to the “Merge_Join_Output” table as shown. On the “Mappings” tab, map the output columns to the table columns; note that “Location” from Department goes to “Work_Location”.


SQL Yoga - Merge Join Transformation #8


12. Package Execution:


Execute the package and check the results in the “Merge_Join_Output” table.

SQL Yoga - Merge Join Transformation #9


13. Result in database:

After the package executes successfully, we can check the result in the “Merge_Join_Output” table.


Query:

-- OLEDB Table 1
SELECT * FROM Employee

-- OLEDB Table 2
SELECT * FROM Department

-- Output data after Merge Join Operation
SELECT * FROM Merge_Join_Output
GO
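A quick way to check the load, sketched under the assumption that the package mapped “Location” to “Work_Location” and that the three tables are exactly as created above, is to compare the equivalent T-SQL join with the output table. An empty result means every expected row is present in “Merge_Join_Output”.

```sql
-- Rows the inner join should produce...
SELECT  e.Emp_No, e.Emp_Name, e.Designation, e.Manager, e.JoinDate, e.Salary,
        e.Dept_No, d.Dept_Name, d.Location
FROM    Employee AS e
INNER JOIN Department AS d
        ON d.Dept_No = e.Dept_No
EXCEPT
-- ...minus the rows the package actually wrote.
SELECT  Emp_No, Emp_Name, Designation, Manager, JoinDate, Salary,
        Dept_No, Dept_Name, Work_Location
FROM    Merge_Join_Output;
```

EXCEPT only checks one direction. Swapping the two SELECT statements shows rows that are in the output table but not in the expected join, for example after running the package twice with changed source data.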

SQL Result:


SQL Yoga - Merge Join Transformation #10


In this way, Merge Join combines the data of both tables on their common key, so that the related information can be read from a single merged table instead of looking up and linking rows across two different tables.


Reference: Tejas Shah (www.SQLYoga.com)

March 31, 2014

SQL SERVER: SSIS - Look Up Transformation Task

Today, I am going to give a basic example of the Lookup Transformation task in SSIS.

The Lookup transformation performs a lookup by joining data in the input columns with columns in a reference dataset. It can be used to fetch additional information from the reference dataset based on matching criteria. The reference dataset can be an OLEDB table, an Excel file, a cache file, or the result of a SQL query.

Use of Lookup Transformation:

In my source system (table), I have all the products with their details. Some of these products belong to a country that does not exist in my reference (master) table, and I have been assigned the job of rectifying them. I need to design an ETL process that gives me those records whenever we import products into our target database (table). So here, I am going to use the “Lookup no match output” to capture those records, with the following steps.

Let’s take an example to easily understand how to use Lookup Transformation in SSIS.

1. Create Source Connection:

Drag a “Data Flow Task” from “Control Flow Items” onto the designer surface. Then double-click it and create a new OLEDB connection.

2. Create sample tables

Now we will create tables named ‘LKP_Countries_Source’ and ‘LKP_Countries’ in the Test database using the script below.

-- Source dataset: the rows that will flow through the Lookup.
CREATE TABLE [LKP_Countries_Source]
(
    [CountryCode] [int] NULL,
    [CountryName] [varchar](100) NULL
)
GO

INSERT INTO [LKP_Countries_Source]
(
    [CountryCode]
    ,[CountryName]
)
VALUES
    (91, N'India')
    ,(92, N'Pakistan')
    ,(93, N'Afghanistan')
    ,(94, N'Sri Lanka')
    ,(95, N'Myanmar')
    ,(960, N'Maldives')
    ,(961, N'Lebanon')
    ,(962, N'Jordan')
    ,(963, N'Syrian Arab Republic')
    ,(964, N'Iraq')
    ,(965, N'Kuwait')
    ,(966, N'Saudi Arabia')
    ,(967, N'Yemen')
    ,(968, N'Oman')
    ,(971, N'United Arab Emirates')
    ,(972, N'Israel')
    ,(1, N'USA')
    ,(65, N'Singapore')
GO

-- Reference (master) dataset: the rows the Lookup matches against.
CREATE TABLE [LKP_Countries]
(
    [Country] [varchar](100) NULL
    ,[Code] [int] NULL
)
GO

INSERT [LKP_Countries]
(
    [Country]
    ,[Code]
)
VALUES
    (N'INDIA', 91)
    ,(N'SINGAPORE', 65)
    ,(N'USA', 1)
    ,(N'PAKISTAN', 92)
GO

3. Create Lookup connection:

Now select the proper OLEDB connection on the “Connection Manager” tab, and the source table for the lookup task.

SQLYoga - Lookup Transformation Task #2

4. Columns Selection from Source Table:

Select the columns to use as output columns.
 
SQLYoga - Lookup Transformation Task #3

5. Lookup Transformation Editor:

Here, I have added a “Lookup” transformation to the designer and clicked Edit to configure it.

SQLYoga - Lookup Transformation Task #4

6. Handle No Match output:

Now we need to configure the various sections of the “Lookup Transformation Editor”.

In the General section, select “Redirect rows to no match output” to handle the rows that the lookup does not match.

SQLYoga - Lookup Transformation Task #5

Set the Cache mode to “Full Cache”. In this mode the whole reference dataset is loaded into memory before the data flow starts, which speeds up lookups over large inputs as long as the reference data fits in memory.

Keep the connection type as “OLEDB Connection Manager”, as we are using an OLEDB source. If you use a cache file as the reference data source, select “Cache Connection Manager” instead.

The last option offers several ways to handle rows with no match:

  • Ignore failure: the row continues to the match output with NULL in the lookup columns.
  • Redirect rows to error output: moves the unmatched rows to the red error output so they can be handled separately.
  • Fail component: raises an error and stops further processing.
  • Redirect rows to no match output: sends unmatched rows to a separate no match output, so they can be handled differently from the matching rows.

7. Set Connection Manager for Lookup table:

In the Connection section, select the connection and the reference table. The source dataset is compared against this table to find matching rows.

SQLYoga - Lookup Transformation Task #6

8. Column mapping for Lookup table:

In the Columns section, map the available input columns to the available lookup columns. This creates the join between the two datasets. Because we are using “Full Cache” mode, the Advanced section is disabled; in “Error Output”, keep the fields as they are.

SQLYoga - Lookup Transformation Task #7
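In T-SQL terms, the two outputs of this Lookup behave roughly like the queries below. This is a sketch that assumes the input column “CountryCode” is mapped to the lookup column “Code” and that “Code” is the lookup column added to the output (the actual mapping is only visible in the screenshot), with the two sample tables as created in step 2.

```sql
-- Rows that would go to the Lookup Match Output.
SELECT  s.CountryCode, s.CountryName, l.Code AS Country_Calling_Code
FROM    LKP_Countries_Source AS s
INNER JOIN LKP_Countries AS l
        ON l.Code = s.CountryCode;

-- Rows that would go to the Lookup No Match Output.
SELECT  s.CountryCode, s.CountryName
FROM    LKP_Countries_Source AS s
WHERE   NOT EXISTS (SELECT 1
                    FROM   LKP_Countries AS l
                    WHERE  l.Code = s.CountryCode);
```

With the sample data, the first query returns the 4 countries present in LKP_Countries (codes 91, 92, 1, and 65) and the second returns the remaining 14 source rows.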

9. Input Output Selection Setup:

Drag a new “OLEDB Destination” onto the designer surface. Drag the green arrow from the “Lookup” transformation to the “OLEDB Destination”, select “Lookup Match Output” as the Output, and click OK.

SQLYoga - Lookup Transformation Task #8

10. Create Output table for Match and Not Matched Data:

Now we need to create output tables to store the matched and the unmatched results.

CREATE TABLE [LKP_Output_Match]
(
    [CountryCode] INT,
    [CountryName] VARCHAR(100),
    [Country_Calling_Code] INT
)
GO

CREATE TABLE [LKP_Output_NO_Match]
(
    [CountryCode] INT NULL,
    [CountryName] VARCHAR(100) NULL
)
GO

11. Data Mapping for Result table:

Now I need to provide the mapping for the output table that stores the matched data.

SQLYoga - Lookup Transformation Task #9

12. Complete Data Flow for Lookup Transformation:

To handle the unmatched data, connect the “Lookup No Match Output” to a second OLEDB Destination that writes to the “LKP_Output_NO_Match” table. It will store the unmatched rows.

SQLYoga - Lookup Transformation Task #10

13. Package Execution and Result:

Now let’s execute the package and check the output in the tables we created: the “LKP_Output_Match” table for matched data and the “LKP_Output_NO_Match” table for unmatched data.


SELECT * FROM LKP_Output_Match

SELECT * FROM LKP_Output_NO_Match
GO

Result:

SQLYoga - Lookup Transformation Task #11

The Lookup Transformation can be used in various ways depending on the requirements. This post is only a basic introduction to it.

Reference: Tejas Shah (www.SQLYoga.com)

March 28, 2014

T-SQL to find out Space by File Group

T-SQL script to find SQL SERVER filegroup space details:

-- Sizes in sysfiles are stored in 8 KB pages, so dividing by 128 gives MB.
SELECT
    fg.groupname AS [File Group],
    f.Name,
    CONVERT(DECIMAL(15,2), ROUND(f.Size/128.000, 2))
        AS [Currently Allocated Space (MB)],
    CONVERT(DECIMAL(15,2), ROUND(FILEPROPERTY(f.Name, 'SpaceUsed')/128.000, 2))
        AS [Space Used (MB)],
    CONVERT(DECIMAL(15,2), ROUND((f.Size - FILEPROPERTY(f.Name, 'SpaceUsed'))/128.000, 2))
        AS [Available Space (MB)]
FROM dbo.sysfiles f (NOLOCK)
JOIN sysfilegroups fg (NOLOCK) ON f.groupid = fg.groupid
ORDER BY fg.groupname
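
The sysfiles and sysfilegroups views used above are kept in SQL Server for backward compatibility. A sketch of the same report on the newer catalog views, assuming SQL Server 2005 or later and that it is run in the database being inspected, could look like this:

```sql
SELECT
    fg.name AS [File Group],
    f.name  AS [Logical File Name],
    -- size and SpaceUsed are counted in 8 KB pages; 128 pages = 1 MB
    CONVERT(DECIMAL(15,2), f.size / 128.0)
        AS [Currently Allocated Space (MB)],
    CONVERT(DECIMAL(15,2), FILEPROPERTY(f.name, 'SpaceUsed') / 128.0)
        AS [Space Used (MB)],
    CONVERT(DECIMAL(15,2), (f.size - FILEPROPERTY(f.name, 'SpaceUsed')) / 128.0)
        AS [Available Space (MB)]
FROM sys.database_files AS f
INNER JOIN sys.filegroups AS fg
    ON fg.data_space_id = f.data_space_id  -- log files have data_space_id = 0 and drop out
ORDER BY fg.name, f.name;
```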

September 4, 2013

SQL SERVER: T-SQL to read all attributes of the XML

We have reviewed how to read XML with SQL Server in this post, and how to read XML with a namespace in this post. Today I would like to share how to read all attributes of an XML document and return the result in table format.

Recently, my team was assigned a requirement to import an XML file. Reading the XML file itself can be done by following this post, but here we also need to read each attribute name and attribute value, so that the information can be stored in a normalized table by comparing the attribute name. Let me explain with an example.

XML:
<SQLYoga 
       name="Tejas Shah" 
       companyname="SQLYoga" 
       expertise="SSIS, SSRS, T-SQL">
</SQLYoga>

Expected Result:

SQLYoga Read All Attributes of the XML with T-SQL #1


Query to achieve the same:

DECLARE @XML AS XML
SELECT @XML ='<SQLYoga 
    name="Tejas Shah" 
    companyname="SQLYoga" 
    expertise="SSIS, SSRS, T-SQL"></SQLYoga>'
SELECT
    CAST(x.v.query('local-name(.)') AS VARCHAR(100)) AS AttributeName,
    x.v.value('.','VARCHAR(100)') AS AttributeValue
FROM @XML.nodes('//@*') x(v)
ORDER BY AttributeName

Here, we have used “//@*”, which selects the attributes of every element in the document (in this sample, only the ROOT tag has attributes), and “local-name”, which gives us the attribute name. That’s it.
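
If the XML has nested elements and only the root tag’s attributes are wanted, the path can be narrowed. The sketch below uses a hypothetical nested sample (the Post child element is not part of the original XML) and reads the attribute name through value() instead of query():

```sql
DECLARE @XML AS XML
SELECT @XML = '<SQLYoga name="Tejas Shah" companyname="SQLYoga">
    <Post id="1" title="Read attributes" />
</SQLYoga>'

-- /*/@* : attributes of the root element only (companyname, name).
SELECT  x.v.value('local-name(.)', 'VARCHAR(100)') AS AttributeName,
        x.v.value('.', 'VARCHAR(100)')             AS AttributeValue
FROM    @XML.nodes('/*/@*') x(v)
ORDER BY AttributeName

-- //@* : attributes of every element, with the name of the owning element.
SELECT  x.v.value('local-name(..)', 'VARCHAR(100)') AS ElementName,
        x.v.value('local-name(.)', 'VARCHAR(100)')  AS AttributeName,
        x.v.value('.', 'VARCHAR(100)')              AS AttributeValue
FROM    @XML.nodes('//@*') x(v)
ORDER BY ElementName, AttributeName
```

The second query returns four rows here, two for SQLYoga and two for Post. The element name column is useful when the attributes have to be stored in a normalized table.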

Reference: Tejas Shah (www.SQLYoga.com)

July 8, 2013

SQL Yoga: Parse XML with namespace with SQL SERVER

Recently, we were working with XML that has a namespace, and we needed to read it to fetch information from it.

Here is the sample XML which we are trying to parse:

DECLARE @x XML
SELECT @x = '
    <X
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">
        <info>Sample XML TEST</info>       
    </X>           
  '

If we need to fetch the “info” element with T-SQL, we can write the query as follows:

;WITH XMLNAMESPACES(DEFAULT 'http://schemas.microsoft.com/search/local/ws/rest/v1')
    SELECT  x.v.value('info[1]','VARCHAR(50)')
    FROM    @x.nodes('/X') x (v)

Here, we have to use “WITH XMLNAMESPACES” to specify the namespace of the XML that we are reading; otherwise the query won’t return any rows.
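
Two other ways to deal with the namespace, shown as a sketch against a trimmed copy of the same sample XML: bind the namespace to a prefix (the prefix name “ns” is arbitrary) or, when the namespace itself does not matter, match any namespace with the “*:” wildcard.

```sql
DECLARE @x XML
SELECT @x = '
    <X xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">
        <info>Sample XML TEST</info>
    </X>'

-- Option 1: bind the namespace to a prefix and qualify each step with it.
;WITH XMLNAMESPACES('http://schemas.microsoft.com/search/local/ws/rest/v1' AS ns)
    SELECT  x.v.value('(ns:info)[1]','VARCHAR(50)') AS Info
    FROM    @x.nodes('/ns:X') x (v)

-- Option 2: ignore the namespace with the *: wildcard.
SELECT  @x.value('(/*:X/*:info)[1]','VARCHAR(50)') AS Info
```

Both queries return “Sample XML TEST”. The wildcard form is shorter, but it also matches elements with the same local name in any other namespace.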

SQLYoga Parse XML


May 6, 2013

SQL Yoga – Read XML with T-SQL statement in SQL SERVER

In my previous post, “XML Result sets with SQL Server”, we reviewed how to generate result sets in XML from SQL Server. Then I got a comment from the team asking for a post on reading XML in SQL Server as well.

Reading XML in SQL Server is also simple. Let’s read the XML which was created with XML PATH in the previous post.

Read XML Elements with T-SQL:

DECLARE @SQLYoga TABLE(
    ID INT IDENTITY,
    Data VARCHAR(50),
    CreatedDate DATETIME DEFAULT(GETDATE()),
    CreatedBy INT
    )
INSERT INTO @SQLYoga(Data)
SELECT 'SQLYoga'
UNION ALL
SELECT 'Tejas Shah'
UNION ALL
SELECT 'Generate XML'

DECLARE @xml XML
SELECT @xml = (
        SELECT    *
        FROM    @SQLYoga
        FOR XML PATH('Record'), ROOT('Records')
    )
SELECT @xml

SQLYoga Resultset of XML PATH

Now, please find the query to read the XML generated above:

SELECT
        x.v.value('ID[1]', 'INT') AS ID,
        x.v.value('Data[1]', 'VARCHAR(50)') AS Data,
        x.v.value('CreatedDate[1]', 'DATETIME') AS CreatedDate
FROM    @xml.nodes('/Records/Record') x(v)

This query generates the output as follows:

SQLYoga Read XML with T-SQL

That’s it. It is quite simple, and it lets you get rid of complex code in the application. Let me know your comments or any issues you face while working on this.

Reference: Tejas Shah (www.SQLYoga.com)

May 3, 2013

SQLYoga – XML Result sets with SQL Server

Recently, I found people writing quite complex code to produce XML in an application. They returned the result set as a table to the application and then converted that data table to XML in ASP.NET code. When I reviewed the process, I asked why they didn’t use the XML features of SQL Server. They were surprised by how simple it is to use, and it made their life easier. Here are sample queries that generate XML result sets in a SQL Server database.

Two common ways to get XML from a query are:
1. FOR XML RAW
2. FOR XML PATH

Please find the following sample queries, where both ways are used:

Let’s create sample data

DECLARE @SQLYoga TABLE(
    ID INT IDENTITY,
    Data VARCHAR(50),
    CreatedDate DATETIME DEFAULT(GETDATE()),
    CreatedBy INT
    )
INSERT INTO @SQLYoga(Data)
SELECT 'SQLYoga'
UNION ALL
SELECT 'Tejas Shah'
UNION ALL
SELECT 'Generate XML'

Generate XML with XML RAW

SELECT *
FROM @SQLYoga
FOR XML RAW, ROOT('Records')

SQLYoga Resultset of XML RAW

XML RAW returns each record as a row element, with all columns as attributes of that element.

Generate XML with XML PATH

SELECT    *
FROM    @SQLYoga
FOR XML PATH('Record'), ROOT('Records')

SQLYoga Resultset of XML PATH

XML PATH returns each record as a separate element, with the columns as child elements within the record’s element.
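
With FOR XML PATH, the column aliases decide the shape of each record. As a sketch on the same kind of sample data (the aliases “@ID” and “Audit/CreatedDate” are only illustrations), an alias starting with “@” becomes an attribute and a “/” in an alias creates a nested element:

```sql
DECLARE @SQLYoga TABLE(
    ID INT IDENTITY,
    Data VARCHAR(50),
    CreatedDate DATETIME DEFAULT(GETDATE()),
    CreatedBy INT
    )
INSERT INTO @SQLYoga(Data)
SELECT 'SQLYoga'
UNION ALL
SELECT 'Tejas Shah'

SELECT  ID          AS '@ID',               -- attribute on <Record>
        Data        AS 'Data',              -- child element <Data>
        CreatedDate AS 'Audit/CreatedDate'  -- nested <Audit><CreatedDate>
FROM    @SQLYoga
FOR XML PATH('Record'), ROOT('Records')
```

Attribute columns have to be listed before element columns, which is why “@ID” comes first.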

That’s it. It is quite simple, and it lets you get rid of complex code in the application. Let me know your comments or any issues you face while working on this.

Reference: Tejas Shah (www.SQLYoga.com)

December 24, 2012

SQL SERVER: Cross tab Query with Pivot (Dynamic Pivot)

In my earlier post on PIVOT, we reviewed PIVOT with static columns. In practice, we may need dynamic columns, displayed as below:

SQLYoga Dynamic PIVOT with SQL SERVER

As shown in the image, the dates are dynamic, based on the order date: 10/25, 10/26, 10/27, 10/28, etc. To achieve this with dynamic columns, please find the query below:

CREATE TABLE #TestTable(ctrl_no INT, id int, Amount NUMERIC(18,2), OrderDate DATE)

INSERT INTO #TestTable(ctrl_no , id , Amount , OrderDate )
SELECT 1000029,100016,990.0000,'10/25/2012'
UNION ALL
SELECT 1000029,100018,990.0000,'10/25/2012'
UNION ALL
SELECT 1000029,100016,660.0000,'10/26/2012'
UNION ALL
SELECT 1000029,100018,660.0000,'10/26/2012'
UNION ALL
SELECT 1000029,206007,660.0000,'10/26/2012'
UNION ALL
SELECT 1000029,206007,660.0000,'10/27/2012'

DECLARE @MinDate DATE, @MaxDate DATE
DECLARE @Dates VARCHAR(MAX)
SELECT @Dates= ''
SELECT @MinDate = MIN(OrderDate),
        @MaxDate = MAX(OrderDate)
FROM #TestTable

WHILE @MinDate <= @MaxDate
BEGIN
    SELECT @Dates = @Dates + '[' + CONVERT(VARCHAR(10),@MinDate,101) + '],'
    SELECT @MinDate = DATEADD(d,1,@MinDate)
END

SELECT @Dates = LEFT(@Dates,LEN(@Dates)-1)

DECLARE @qry NVARCHAR(MAX)
SELECT @qry = N'

SELECT    id,
        ' + @Dates     + '
FROM (
    SELECT    id,
            ISNULL(Amount,0) AS Amount,
            OrderDate
    FROM #TestTable t
) x
PIVOT(
    SUM(Amount) FOR OrderDate IN('+ @Dates     +')
) p '

EXEC (@qry)

DROP TABLE #TestTable
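
The loop above creates a column for every calendar day between the first and the last order date. If only dates that actually have orders should become columns, the list can be built from the data itself. A sketch, using a hypothetical table variable @Orders in place of #TestTable and QUOTENAME to bracket the column names:

```sql
DECLARE @Orders TABLE (OrderDate DATE)
INSERT INTO @Orders (OrderDate)
SELECT '20121025'
UNION ALL
SELECT '20121025'
UNION ALL
SELECT '20121026'
UNION ALL
SELECT '20121028'

DECLARE @Dates NVARCHAR(MAX)

-- Concatenate one bracketed column name per distinct date, in date order.
-- FOR XML PATH('') glues the rows together; STUFF removes the leading comma.
SELECT @Dates = STUFF((
        SELECT  ',' + QUOTENAME(CONVERT(VARCHAR(10), d.OrderDate, 101))
        FROM    (SELECT DISTINCT OrderDate FROM @Orders) d
        ORDER BY d.OrderDate
        FOR XML PATH('')
    ), 1, 1, '')

SELECT @Dates AS PivotColumns
-- [10/25/2012],[10/26/2012],[10/28/2012]
```

The resulting @Dates string can be dropped into the dynamic PIVOT query in the same two places as before. Note that 10/27 is skipped here because no order exists for it, which is the difference from the WHILE loop.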

Let me know if you have any questions or comments.

Reference: Tejas Shah (www.SQLYoga.com)