SSIS #105 – Loop through rows in a SQL table (with multiple columns)
I am a SQL purist, meaning that I have vowed never to use cursors in T-SQL to loop through rows record by record.
In SSIS package design, however, I have used the Foreach Loop Container very often. I would not say that I have broken my vows though for the following reasons:
1. The record set I loop through is usually relatively small.
2. Foreach Loop Container in Integration Services takes a data type of System.Object (not the physical SQL table), which is a in-memory object that is native to the Integration Services. So I trust that Microsoft engineers have done a wonderful job in terms of optimal performance.
In Integration Services, I have used two ways to generate this System.Object for the Foreach Loop Container to consume.
One way is to use an Execute SQL Task. I have blogged about this way in the past.
- SSIS – “Full result set” to a set variable of Object.
- SSIS – Use “Foreach Loop Container” and pass a variable to a OLE DB Source in a Data Flow
- SSIS #99 – Pass a date time parameter to the Execute SQL Task
Here are two screen shots that shows how to configure the Execute SQL Task to receive the query result into a variable with data type of System.Object.
Although you can map the query result to more than one System.Object variables, this would not work for the problem I am facing now.
1. Here is a pseudo query that I need to retrieve a collection of new QueueName and their parent IDs.
2. I want to use both the QueueName and the QueueParentId, and pass them through a Foreach Loop Container to another task and use them as query parameters.
For the 2nd challenge, where more than one columns must be split up from one single System.Object, I cannot use Execute SQL Task with a Full result set configuration.
Recordset Destination comes to the rescue
The ADO recordset fits perfectly for my problem.
A Design Pattern (Loop through rows with multiple columns) with Recordset Destination and Foreach Loop Container
1. First, I need three package level variables. vLoopSet is the System.Object that will store the ADO dataset. The other two non-System.Object variables are for each column in the ADO recordset.
2. Here is the Control Flow I will need. The first Data Flow will run some query against my data source and send the results to a Recordset Destination.
3. Here is the Data Flow.
Configuration the Recordset Destination is simple enough. All the interface on top is useless. The only thing you need is to set the VariableName.
And also the Output columns.
4. For the Foreach Loop Container, select Foreach ADO Enumerator, User::vLoopSet, and Rows in the first table.
For the variable mapping, use the zero-based index to map the columns to the variable. Index zero will be the first column in the query, and 1 is the second column in the query.
5. Finally, we are ready to use the User::vQueryName and User::vQueueParentID as dynamic parameters for our task inside the Foreach Loop Container.
Here we are using the famous question mark again to represent our dynamic parameters. Note: the ? is position sensitive, meaning that the first ? will represent the first variable from the above mapping, and the second ? represents the 2nd variable in the above mapping in the Foreach Loop Container configuration.
I hope after reading this post, you will rush off to to open your own SSIS packages and find opportunities to use this Loop Through Rows with Multiple Columns pattern.