Archive

Archive for April, 2011

SSRS #46 – A case against using character field in multi-value parameters

April 29, 2011 1 comment

In my previous post, Using Multi-value Parameters in SSRS Report, I blogged about how to write a table-valued function in SQL to take a multi-value report parameter. The function ListToTable can take a list such as this:

123,904,949,837,495,399

and convert them into rows in a SQL table, and then your stored procedure can use them in a where clause to limit the results to only what the user wanted.

  1. 123
  2. 904
  3. 949
  4. 837
  5. 495
  6. 399
    In the above example, I used IDs, instead of their text descriptions, with a purpose.
    In some of my SSRS reports, I used text descriptions. It turned out that there are two serious issues with using character field in multi-value parameters.

Issue #1

If the selection list is long, and your users tend to select ALL all the time, you might run into an issue where the list exceeds the maximum length you defined in your stored procedure.

image

Fortunately, you have the SQL keyword MAX to help you to keep your parameters running out of length. So this issue #1 doesn’t seem to be a show stopper.

Issue #2

Issue #2 has to do with the unpredictable nature of our character data. Have you ever noticed characters in your data that are sometimes “unusual”? Of cause you have. But I am talking about some very “usual” characters, such as quotes, single quotes or double quotes. If you are in IT for awhile, you already know that quotes are not just “usual” characters, they are special characters in programming world. They are used to indicate that the data type for your data is character, versus integer, for example. If you need to have quotes in your data, you will need to use “escape” characters to indicate that you want to keep your quotes. Here is where the ListToTable function failed. I did not take the quotes into consideration.

I found out this issue during my testing, noticing that I was missing some data in the report. Later I confirmed that those missing records are the ones that have single quotes in the character field. They failed to return to the SSRS report by the stored procedure.

For the above two reasons, I would suggest that you always use ID fields as your report parameters, where data is much more predictable.

A Side Note

Before I close this post, I would like put another side note here.

If you have some experience with creating reports in Microsoft Reporting Services, you know that you don’t need to use stored procedures as your dataset. You don’t need to use stored procedure to take multi-value report parameters either. So the question is if you write your SQL query in your RDL file to take multi-value parameter, would you run into the issue #2 when the parameter has single quotes in it? That will be a question for you.   

SSRS #45 – There is an error in XML document

April 29, 2011 Leave a comment

I’ve been working lots of hours lately because our project is just about to go into production. One of my ETL processes has gone through many changes recently because business users are now very actively sending us their late minutes requirements.

After making some changes to my ETL process, and re-loading all the historical data, I discovered that one of my SSRS reports was no longer working.

The Issue

Here is the error message from the reporting server (the report has no problem running in BIDS):

There is an error in XML document (1, 6407).
”, hexadecimal value 0x16, is an invalid character. Line 1, position 6407.

clip_image002

 

This is my first time encountering this kind of error. Not knowing the root cause, I spent the next hour or so trying to “fix” it. I tried all the possible ways I could think of, such as trying to find that unprintable character in the RDL/XML file, re-loading the RDL, renaming the RDL file, etc. No luck.

The Root Cause

Next morning, I started searching on Internet, and found many info on this error. The conclusion is that this is caused by a bug on the Reporting Services 2005. A service pack needs to be installed.

But it’s this article from Microsoft that revealed the root cause of the error.   

Error message when you try to browse a report in SQL Server 2005 Reporting Services: "The input XML is not well-formed"

image

This problem occurs because a nonprintable character is populated into a report parameter that you define in the report.

Knowing my data and report very well, it didn’t take me long to find the report parameter that has a nonprintable character in it.

image

The Solution

I would not wait for the architecture team to apply the service pack. The Microsoft support article also gave you workaround in SSRS. I don’t want to do that either. I am a SQL developer so it’s natural for me to resolve it in SQL.

I decided to replace the non printable character by an empty space. Something like this:

REPLACE(myCOLUMNS, CHAR(9), ”)

If you are in the SQL world for awhile, you know that the CHAR function with the right number (which is the ASCII value for the character) will give you special non printable characters.

Some common special non printable characters:

Tab: char(9)
Line feed: char(10)
Carriage return: char(13)

What about the one in my data? What kind of non printable character it is? I tried the above three common ones, no luck.

I need to use another SQL function ASCII to tell me that. It turned that the ASCII value for it is 22.

image

Here comes to my final solution:

REPLACE(myCOLUMNS, CHAR(22), ”)

or I could just simply use:

REPLACE(myCOLUMNS, ”, ”)

Final notes before I close this post:

1. The SQL functions CHAR and ASCII are similar to the Chr and Asc functions within VBScript.

2. If you want to replace ANY non-printable characters in your data, leaving only numbers and printable letters, you would need to do some programming.

SSRS #44 – More on Using Matrix to Grow Data Columns Vertically

April 28, 2011 Leave a comment

In my previous post, SSRS #35 – Use Matrix to Grow Data Columns Vertically and Data Rows Horizontally, I mentioned that you do not need to create any row grouping in order to grow your data columns vertically. What you need to do is to simply add rows (each row represents one column from your SQL table) to the matrix.

One reader asked me how to just add rows. I am trying to make it more clearly in this post.

Here is what the default matrix looks like when you drop it in your data region.

image

There are three noticeable sensitive cells, i.e. Columns, Rows, and Data. If you right click on either Columns or Rows, you will get options that will allow you to manipulate groupings, such as Insert Group, Edit Group etc.

image

If you right click on the Data cell, you will get different options, which allow you to simply add columns or rows (no grouping is required).

image

Bu choosing Add row option, you will be able to add rows to the matrix, with each row represent one column from your SQL table.

image

In this example, I had two rows. I dropped my columns to the Data cell. The above shows the default behavior.

Since I do not need to aggregate my data, I will remove the aggregation function First().

image

Before I close this post, I want to add a side note about the context of this post.

The Context for this post

  • You SQL table holds the “raw” data, meaning that the data is not summarized/aggregated.
  • You don’t want to summarize/aggregate your data for the report.
  • The pair of each matrix row and SQL column is hard-coded, meaning that the number of rows in the matrix will not grow automatically (when your SQL table has new columns).

I recently had a conversation with a co-worker. She needed to modify a report that has about dozen matrixes, and each one shows summarized data, but with the pair of each matrix row and SQL column being hard-coded. Now she needs to add one more measure to each matrix. She is very experienced, and immediately understood that she has two choices here. One is to continue to hard-code the pair of each matrix row and SQL column. Another one is to change both the matrix design and the base SQL table so that the matrix rows will grow dynamically in the future. 

Perhaps I will blog about the choice # 2 in the future.

ETL #70 – The worst possible way to truncate your target table

April 22, 2011 2 comments

In my previous post, Why truncating a target table is a bad practice in ETL?, I strongly suggest that we do not truncate our target tables.

In this blog, I’ll tell you the worst possible way I’ve seen to truncate a target table.

Before I get into that, I’ll share with you the experience a business user has shared with me:

….Sherry, can you check the report again? It has failed with the new data load today again.

….What caused it to do that? Has been happening sporadically lately where the whole report just goes down.

….I thought the issue has been resolved since last change.

Before the “last change”, the report has no staging data. It’s pulling large amount of data directly from our data source via OPENQUERY (you can read my other posts to know my opinions about  OPENQUERY). If you are lucky, the report will send back data within 5 minutes.

So the last change was made is to pull data every morning and store it in a table.

Here is how this table (in my opinion, this is your target table, not a staging table) is populated every morning:

image

It is truncated first, then with an attempt to populate it. Unfortunately, almost 5 out of 10 days, the OPENQUERY failed, leaving the target table empty 5 days out of 10. In those 5 lucky days, the target table will remain empty for about 10 minutes, which is how long it takes for the table to be fully populated.

That explains the business users’ experience with the report.

Do you still need more convincing not to truncate your target tables?

Fun as a Contractor #51 – Are you taking any risk at work?

April 22, 2011 Leave a comment

I recently said yes to a couple of trouble tickets that I could easily say no since I am working on a high priority project. Not only I said yes, I also did something that most developers (contractor or not) would not do, that is, re-designing the entire ETL process and the reporting. I did inform the business users that in order to fix the tiny issue they have reported, I was going to do some fundamental changes (without changing the business logics).

I didn’t realize how much risk I was taking until I got some emails from the business users:

…before the update we could drill down to the account number by clicking on the specific category, where now we have to click on the + sign on each agents name and it puts all categories together; which is much more time consuming…

…the report appears to be missing some March data after you made the changes on Tuesday. I had pulled it earlier in the day on Tuesday and my team had a total of 73 accounts…

…can we get this fixed asap as we are finalizing numbers from March right now. Also you said you could put the report layout back to what it was prior to Tuesday?

I had to deal with over 20 emails to finally make everybody happy again.

I am pretty sure that the new architectural design has improved the maintainability, and performance of the ETL process. There will be less trouble ticket from now on.

But I’ve learned a lesson from this experience. The risk is worth taking, but I should have spent more time with business users to minimize the “discomfort” they would certainly feel after the change.

Should I take risks again?

Categories: Uncategorized Tags: , , ,

T-SQL #47 – Build sequential ranges of dates with propagation to missing values using recursive CTE

April 20, 2011 Leave a comment

Continuing from T-SQL #46.

Here is the second example from beyondrelational.com, their TSQL Challenge #12. I’ve modified their example with the hope that I will be able to come to a solution for the T-SQL challenge I am facing in real life.

The Context

I am working on an employee incentive project. Business wants to encourage the employees’ WIP (Work In Progress)  turns over quickly.  Inventory will be taken each business days to calculate the “age” of each account an employee has been working. Accounts will go through several stages in the WIP, and the account “aging” clock needs to stop at stage 3 since stage 3 is beyond the employee’s control.

The Challenge

Stopping the “aging” clock at stage 3 is actually pretty straightforward as long as we can capture when each account enters stage 3 and exits stage 3. The real challenge arises when I realized that an account can come in and out stage 3 multiple times in its lifetime in the WIP. So to account for this kind of “churning”, for each inventory date, I first set a flag “stage3”, to 1 if an account is in stage 3, or 0 when an account is not in stage 3. The purpose for the flag is so that I can later calculate the gap between stage 3. When I stop the aging clock at stage 3, I still need to add the gap between stage 3 to the age.

You would think that this stage 3 flag will help me to get the gap between stage 3. Yes, it does. But we having missing dates in the inventory dates, since we do not take inventory for holidays and weekends.

So what I need are:

1) Create a new inventory record for each missing inventory date between two valid inventory dates of the table @stage3.
2) For each new inventory date created, I need to recopy the stage3 indicator (1 or 0) from the previous inventory date.
3) Continue the list until the last inventory date on table @stage3 reached for each acct_id.

To make this example simple to understand, I’ll ignore the account number. Again, I put my comments here.

image

Here is the result.

image

I am still not there yet for my real problem, because each account has a different list of stage 3 indicators. I’ll come to that in my next blog.

T-SQL #46 – Generate a date series using a recursive CTE

April 20, 2011 Leave a comment

I recently had a need to use recursive Common Table Expression (CTE)  query to meet a real business requirement.

Before I get into my real business situation, I want to blog about a couple of simple examples of using recursive CTE I found on beyondrelational.com (Google tsql challenges). I don’t who those people are, and how they collaborate on their work of providing those T-SQL challenges and picking winners for best solutions, but they are doing a fantastic job of promoting SET based query writing skills. According to this article T-SQL Challenge #1 on SQLServerCentral.com, TSQL Challenge Committee is a group of SQL Server experts from around the world. Many of them are SQL Server MVPs, some are SQL Server consultants and others are developers working with SQL Server for a number of years.

Anyway, here is the one simple example from beyondrelational.com of generating a date series using a recursive CTE. I put my comments here to make a couple of points.

image   

1) Using UNION ALL and refer to the CTE makes the query recursive.

2) It’s important to know when to terminate the recursive query. The WHERE clause here serves the purpose of terminating the recursive part by setting a upper limit for the date.

If you are new to writing SQL queries, this might not make much sense to you. You might wonder why cannot you use a cursor or a WHILE loop. Well, the purpose here is to write SET based query rather than record-by-record sequential “program”.

Even a simple example like this has real life usage. You can use this date series to do a Cross Joint (Cartesian Product) with a dimension table, such as Product, to create a product catalog.

%d bloggers like this: