Home > SSIS Best Practices > SSIS #108 – The Makeover of the Lookup Data Flow Component in SSIS 2008

SSIS #108 – The Makeover of the Lookup Data Flow Component in SSIS 2008

Lookup data flow component is a very popular design pattern in SSIS. The idea  is to determine whether a record in the pipeline already exists in the intended destination table or not, and then decide the next action.

image

SSIS has many out-of-box components that are in ready-to-use mode. They can be used in many different scenarios with default configurations and achieve good performance with little tuning on our side. The Lookup data flow component is one of them, until you run into a surprising situation where the data flow takes much longer than you have anticipated. Configuration of the Lookup component is simple enough. After you have passed the first step of configuring the component, the main focus is on how to achieve the optimal performance.

The makeover of the Lookup Component in SSIS 2008

As popular as it is, I have not used it a lot in SSIS 2005. In SSIS 2008, however, I was pleasantly surprised by a few improvements.

The new UI make it easier for configuration. The Cache mode now has three different choices. The Connection type now includes a Cache connection manager, and the output now includes a “Redirect rows to no match output”.

image

Cache mode

The concept of caching is not difficult to understand. When using SSIS data flow components, Integration Services load the necessary data into the memory. When not all the data can be loaded into the memory at once, Integration Services load and process data one buffer at a time. See Understanding SSIS Data Flow Buffers on MSDN. The Cache mode tells Integration Services your choice of whether to load and process the lookup data all at once into memory, not at all, or somewhere between. Full cache, in most cases, will give you the optimal performance since all the lookup data will be loaded into the memory.   

Cache connection manager

Cache mode is not really a new concept in SSIS 2008. Cache connection, however, is a major improvement in SSIS 2008. Now, we are able to use a cache for our lookup data prior to the the execution of the dataflow. Cache Connection  is a new connection manager type in SSIS 2008.

With this new type of connection type in SSIS 2008, now we are able to populate our cache many different types of sources, such as text files, raw files, XML files, Excel files, a recordset, or anything that can be accessed using an ADO.Net provider. This is compared to SSIS 2005 where we were limited to use an OLE DB source. This means that the cache can be used in multiple lookups. Imagine when using the Foreach iteration in a loop in SSIS 2005, we would repopulate the cache for every iteration.In SSIS 2008, the cache needs only to be populated once. If you had performance issue with your Lookup components in SSIS 2005, now it’s time to re-examine it. The new cache connection manager might be a life saver.

Creating a new Cache connection is simple. Select New Connection, and then CACHE.

image

image

Redirect rows to no match output

Another improvement is the addition of Redirect rows to no match output. In SSIS 2005, unmatched rows must be redirected as error output. This is a cosmetic improvement, but it does make me smile when I do not need to use the red error output.

In one of my usages of the Lookup component, I would only take the unmatched records,and insert them into the destination table.

image

In the Connection tab, I’d use a query (instead of data from the entire table) to get only the lookup column.

image

The DayID is also the only column I need for output.

image

The new improvements in the Lookup components in SSIS 2008, especially the Cache connection type, makes it using the Lookup component much more fun.

Categories: SSIS Best Practices
  1. February 22, 2012 at 11:31 am

    Nice article, very well explained. In fact, when I moved my clients SSIS packages from SSIS 2005 to 2008 version, lookup cache option helped to improve the performance significantly.

    Like

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: