Problem Statement:- I have two tables “Employee” and “ConveyanceDetail”, in first table we have employee id(Eid) and Name of the employee and second table contains employee id(Eid) and the “Cabtype”. If an employee is using the company’s cab there will be an entry of his/her employee id in “ConveyanceDetail” table with the vehicle type. Now I want to perform a lookup operation which will be performed on “Employee” table and reference table will be “ConveyanceDetail”.Eeach employee id of reference table will be matched with employee id of “Employee” table and if there will be a match, it should be written in a text file and non matching rows will be written in a different text file. Below is the SQL script for creating above mentioned tables and inserting few rows into those tables.
Create table Employee(Eid int,Name Varchar(50),Dept varchar(50))
Insert into Employee values(101,’Neeraj Yadav’,’IT’)
Insert into Employee values(102,’Vikas Sharma’,’Finance’)
Insert into Employee values(103,’Amit Saxena’,’HR’)
Insert into Employee values(104,’Rohit Verma’,’Marketing’)
Insert into Employee values(105,’Harsh Pandey’,’Admin’)
Insert into Employee values(106,’Mohit Suri’,’Purchase’)
Create table ConveyanceDetail(Eid int,CabType Varchar(10))
insert into ConveyanceDetail values(102,’Car’)
insert into ConveyanceDetail values(104,’Tempo’)
insert into ConveyanceDetail values(105,’Jeep’)
insert into ConveyanceDetail values(106,’Bus’)
Solution: Lookup is quite similar to joins in T-SQL, lookup perform an equi –join between input table and reference table. To implement a lookup task, we will take one data flow task in control flow region as shown below.
Now we will double click on “data flow” task that will open the “data flow” region and add one “OLEDB Source” in it as shown below.
We will configure the connection manager for “OLEDB Source” and select “ConveyanceDetail” table as shown below.
We will click on “OK” button to save the settings and add one more “data flow” task in Control flow region as shown below.
We will double click on the second “data flow” task that will open the data flow region, there we will add one “OLEDB Source” as shown below.
To configure the “OLEDB connection manager” we will right click on it and specify the server details and select “Employee” table as shown below.
we will click on “OK” button to save the setting that will close the above screen. Now we will add lookup task and link it with the “OLEDB Source” as shown below.
Now we will configure the lookup task by right click on it and clicking on “Edit” menu that will bring a new screen(lookup transformation editor) as shown below.
In the above screen, I have encircled three sections first section is “Cache mode” which is explained below.
Full Cache: if we choose “Full cache” , it means all the rows from the source will be cached before performing the lookup transformation.
Partial Cache: If we choose “Partial cache”, it means data set will be cached while performing the lookup transformation.
No Cache: It means there will be no caching mechanism.
In second section, we define the connection type whether we are using “OLEDB connection manager” or “Cache connection manager”. In our case we are using “OLEDB connection manager”.
In third section, we will select option “redirect rows to no match output” as we want to move non-matching rows to an output file.
Now we will select the “Connection” tab (which looks like as shown below) and there we will specify the lookup table(ConveyanceDetail).
Then we will select the “Columns” tab and map the Eid column of input dataset(populated from Employee table) with the Eid column of lookup dataset(populated from ConveyanceDetail table) by dragging and dropping the eid column of input dataset to lookup eid column(Shown below).
We will click on “OK” button to save the settings that will close the above screen. Now we will create to text files with name “MatchingRows.txt” and “NonMatchingRows.txt” at location “C:\data”.
Lets move further and add one “Flat File Destination” task and link it with the lookup task, as soon as we link the lookup task with “Flat File Destination” task a pop up window will appear as shown below.
In above window, we will select “Lookup Match Output” option and click on “OK” button to save the settings. Now we will double click on the “Flat File Destination” task that will open a new screen (Flat file Destination editor) where we will define the connection manager by clicking on the “New” button as shown below.
As soon as we click on the “New” button a new window will appear as shown below.
We will select the option “Delimited” in the above window and click on “OK” button that will disappear the above window and open a new window as shown below.
In the above window, we will specify the flat file location where data will be copied , here I have specified the “MatchingRows.txt” file. Now we will select the “Columns” tab (upper left side in the above window) that will transform the above screen in a new look as shown below.
As you can see in the above screen Eid column is appearing two times , we have to delete one column otherwise we will get error message for duplicate column name. Reason for duplicate column is that while configuring the lookup task we chose the option “add as new column”(as shown below).
If we had selected “replace Eid” option , we would have not get the duplicate column. Now we will add one more “Flat File destination” task and connect the lookup task with it. This time we will specify the second file name “NonMatchingRows.txt” in the file connection manager editor. Now we are ready with our package.
We will run the package by pressing “F5” key , lets see if it runs successfully.
Green color of package is indicating that our package ran successfully. I hope I was been able to explain lookup transformation in detail. Next time, I will be back with one more SSIS topic till then take care.
704 total views, 2 views today