I have a blog, SSIS #108 – The Makeover of the Lookup Data Flow Component in SSIS 2008, a few months ago. Because of the two improvements of the lookup data flow component in SSIS 2008, the cache mode and the redirecting rows to no match output, I find myself giving more and more special favor to the lookup component in loading lookup tables, mapping tables, and dimension tables.
This comparison part in the lookup data flow component is case sensitive
In SSIS, things are generally case-sensitive. Variable name is one. If you have a variable named varRecipients, but you used varRecipientS in an expression, you will get a clear message that varRecipientS cannot be found.
What does the case-sensitivity have to do with the lookup data flow component? Quite a bit. The main part of the lookup component is to compare certain data elements between the source and the destination. This comparison part is case sensitive (There is an exception to this statement. See the last part of the blog.).
Take care to ensure that the comparison in the lookup data flow component is also case insensitive
SQL servers I’ve worked with are all configured to use case-insensitive collation. In most data warehouse design, case shouldn’t matter. An existing record of Courtesy Waiver in a dimension table should prevent “Courtesy waiver” being added to the same dimension table.
If your SQL server is using case-insensitive collation and your data warehouse is also case insensitive, then you should take care to ensure that the comparison in the lookup data flow component is also case insensitive. Otherwise, you will end up with either
- a duplicate record of “Courtesy Waiver “ and “Courtesy waiver “ being inserted into your dimension table, or
- your lookup data flow component will fail during runtime if you have taken care to create an alternate key on your dimension table.
Since I always have alternate (unique) key defined on my dimension tables, the second scenarios will happen if I hadn’t done the following steps.
Use Upper (or Lower) Function for comparison, but preserve the case for data to be inserted
In the OLE DB Source, I will need two columns, one for comparison, and one for inserting into my lookup table (if it has not existed yet). The one for comparison Reason_Compare will need the Upper (or Lower function will do too). The one that will be potentially inserted into my destination table needs to preserve the case, so no Upper function is used.
In the Lookup Transformation Editor, also use the Upper function to query from the destination lookup table.
And also make sure to use the Reason_Compare column for comparison.
Finally in the In the OLE DB Destination Editor, the mappings are done without the Reason_Compare column.
Partial cache or No cache result in the comparison being done directly on the SQL Server which would result in case-insensitive comparison
This blog will not be complete if I don’t mention the Cache mode. There are several blogs mentioning that if you change the cache mode to Partial cache or No cache, the comparison will be done directly on the SQL Server which would result in case-insensitive comparison (provided the SQL Server uses case-insensitive collation, which is the default setting of SQL Server).
Here are the links to a couple of them:
- Jorg Klein : SSIS – Lookup is case sensitive
- Case insensitive comparison in SSIS 2008 LookUp transformation
I’ve always chosen to use the Upper function for comparison and preserve the case for data for two good obvious reasons.
- I like Full cache and
- As a developer, how the servers are configured is totally out of my control
- Enjoy using the lookup data flow transformation!
Expand or collapse the inner most row group is a common practice
If you have ever created Row Groups in reporting Services, you must also have tried to give users a chance to expand or collapse the inner most row group, such as the Details group.
I believe this is the correct way to use the toggle images
When the toggle item (the Textbox for Interval in the example) has the (-) toggle image, the Details row group is showing all the rows.
When the toggle item (the Textbox for Interval in the example) has the (+) toggle image, the Details row group is now hidden.
I believe the above two screen shots are using the two toggle images correctly.
But I have seen too many reports that use the two toggle images differently.
Here is an example.
Using the (+) image, but the details are showing.
Using the (-) image, but there are no detail rows.
Use two properties: InitialToggleState and Hidden
If you agree with me that the first two screen shots are using the toggle images correctly, then here are two properties you can use to set it up in Reporting Services 2008.
To show the inner row group initially
- Toggle item property: InitialToggleState = True
- Inner row group property: Hidden = False
To hide the inner row group initially
- Toggle item property: InitialToggleState = False
- Inner row group property: Hidden = True
Here are the steps to show the inner row group initially.
1. Click on the toggle item (i.e. a textbox in the outer row group) so the textbox is in focus.
Press function key F4 to get to the Properties. Then set the property InitialToggleState to True.
(note sure if the message here from Microsoft engineers is correct)
2. Click on the Details row group(i.e. the inner row group you want to hide or display by users) so the row group is in focus.
Press function key F4 to get to the Properties. Then set the property Hidden = False.
I am no expert on the subject of Visual Business Intelligence. As a matter of fact, I often struggle with the choices of presentation styles, colors, and other visual effects in my Reporting Services report designs.
While attending an online seminar from Pragmatic Works, the presenter recommended a classic book in visual BI, Show Me the Numbers (http://www.amazon.com/Show-Me-Numbers-Designing-Enlighten/dp/0970601999) by Stephen Few. There are many visualization tips in the book.
I do not own the book yet. But I guess I should rush to Amazon to get the book soon or later.
I do not want to offer any visualization tips since I am not qualified at all. Here are just a few resources I used specifically for choices of colors.
Value field is needed in parameter pick list or drill-through hyperlinks
When working with Reporting Services, we often provide users with parameter pick list such as this one:
We also provide users with hyperlinks so users can drill through to see more details.
In Reporting Services, your pick list can have a Value field, and a Label field.
The label field is what is shown in the lick list, while the value field is what will be passed to your parameterized report.
In the case of hyperlinks, you will need to pick one value field from the dataset to pass to the detail sub-report that will takes the value as a report parameter.
So what are the available choices for this Value field?
If you use the default parameter pick list from the MDX Query Designer without modification, you will most likely end up with:
- Member property MEMBER_CAPTION will end up as the Label field, and
- .UniqueName function will end up as the Value field
Member Property: Unique_Name and MEMBER_KEY
The member caption is a good choice to show in the pick list, but the fully qualified Unique Name which includes the unique key value should be the choice to pass as the parameter.
However, there are cases where I only need the key value without the Dimension.Hierarchy qualifier. This is
- either because I have created my sub-report to take only the key value, or
- I need to qualify the value with different Dimension.Hierarchy qualifier.
- To fully understand the different member properties and the related member functions, I created this MDX query to show different member properties:
A few notes on this query:
- Most member properties have equivalent member functions, such as MEMBER_CAPTION and MEMBER_KEY and UNQIUE_NAME.
- DIMENSION PROERTIES clause will not cause the properties to display in the pivot result in SQL Management Studio, and it will not cause them to become separate columns either in the Reporting Services.
- In SQL Management Studio, you can double-click any row cell to see the member properties.
The result of the above query is:
Now that MEMBER_KEY is what I am looking for for some of my sub-reports.
If cube processing allows errors and missing keys, automatic notification process should be in place to check the data counts in the cube against the data counts in the fact table.
Although I’ve known that we can create MSOLAP linked server, and send pass-through MDX queries to cubes using OpenQuery through the MSOLAP linked server, I’ve never found a compelling business case where I absolutely must use it. That is until recently when I started to think about comparing data counts in the cube to the data counts in the fact table in automatic mode.If the cube processing allows errors and missing keys, the comparison becomes very crucial.See my previous blog SSAS 26# – “Ignore Error” when processing cube. An periodic automatic notification process should be in place to check the data counts in the cube against the data counts in the fact table.
The MSOLAP linked server allows pass-through MDX queries sent to cube.
I had an opportunity of a MDX training recently. Our trainer Mr. Chester (http://www.virsolutions.com/) showed some examples of the pass-through MDX queries.
This query is simple enough, but it opens a door to an automatic process that I’ve been looking for.
Enable the "Allow inprocess" flag in MSOLAP Provider Options
Creating a linked server is pretty simple. Here is the sp_addlinkedserver procedure I used:
However, the first try of the OpenQuery failed with a message like this:
Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "MSOLAP" for linked server "Cube_Link" reported an error. Access denied.
Msg 7350, Level 16, State 2, Line 1
Cannot get the column information from OLE DB provider "MSOLAP" for linked server "Cube_Link".
Again Mr. Chester came to the rescue and sent me this link:
In the above blog, the author pointed out that the “Allow inprocess” must be enabled.
To enable the “Allow inprocess” flag, follow this path in SSMS:
SQL Server Management Studio/Server Objects / Linked Servers / Providers / MSOLAP, right-click and edit Properties:
Pass-through MDX queries return column names as [Dimension].[hierarchy].[level].[property]
The column names returned back from the pass-through MED queries are long and fully qualified. There are many ways to change them into cleaner name. To change them into shorter names that are the same as the column names on the fact table in a fast way, I used WITH statement to define column names.
Now data validation between source, staging, fact and cube can be automated in TSQL
Now we are back to the TSQL land, and anything is possible.