In my previous post, Why truncating a target table is a bad practice in ETL?, I strongly suggest that we do not truncate our target tables.
In this blog, I’ll tell you the worst possible way I’ve seen to truncate a target table.
Before I get into that, I’ll share with you the experience a business user has shared with me:
….Sherry, can you check the report again? It has failed with the new data load today again.
….What caused it to do that? Has been happening sporadically lately where the whole report just goes down.
….I thought the issue has been resolved since last change.
Before the “last change”, the report has no staging data. It’s pulling large amount of data directly from our data source via OPENQUERY (you can read my other posts to know my opinions about OPENQUERY). If you are lucky, the report will send back data within 5 minutes.
So the last change was made is to pull data every morning and store it in a table.
Here is how this table (in my opinion, this is your target table, not a staging table) is populated every morning:
It is truncated first, then with an attempt to populate it. Unfortunately, almost 5 out of 10 days, the OPENQUERY failed, leaving the target table empty 5 days out of 10. In those 5 lucky days, the target table will remain empty for about 10 minutes, which is how long it takes for the table to be fully populated.
That explains the business users’ experience with the report.
Do you still need more convincing not to truncate your target tables?
I recently said yes to a couple of trouble tickets that I could easily say no since I am working on a high priority project. Not only I said yes, I also did something that most developers (contractor or not) would not do, that is, re-designing the entire ETL process and the reporting. I did inform the business users that in order to fix the tiny issue they have reported, I was going to do some fundamental changes (without changing the business logics).
I didn’t realize how much risk I was taking until I got some emails from the business users:
…before the update we could drill down to the account number by clicking on the specific category, where now we have to click on the + sign on each agents name and it puts all categories together; which is much more time consuming…
…the report appears to be missing some March data after you made the changes on Tuesday. I had pulled it earlier in the day on Tuesday and my team had a total of 73 accounts…
…can we get this fixed asap as we are finalizing numbers from March right now. Also you said you could put the report layout back to what it was prior to Tuesday?
I had to deal with over 20 emails to finally make everybody happy again.
I am pretty sure that the new architectural design has improved the maintainability, and performance of the ETL process. There will be less trouble ticket from now on.
But I’ve learned a lesson from this experience. The risk is worth taking, but I should have spent more time with business users to minimize the “discomfort” they would certainly feel after the change.
Should I take risks again?
I recently read an article on SQLCentralServer.com, The Job Posting – Do I really have to be the SQL God? The author was passing on some advices on how to interpret various SQL job postings.
Do you need to be the SQL God/Goddess? You absolutely do, as a SQL consultant/contractor, in order to be a respectable SQL profession and demand a higher rate. But the question is what kind of SQL god/goddess.
The author reviewed a concept that I thought it’s worth mentioning.
For SQL Server, there’s 5 general areas of work: Administration, T-SQL Development, ETL, Reporting, and Architecture. In most of these cases, you end up blending a few of these areas together, but a solid, 40 hr/week position is going to concentrate on one of these and incorporate a little bit of some of the others. For example, reporting will need some T-SQL and some ETL. Architecture requires a basic understanding of the other four. ETL will probably require a little architecture to build staging structures and some T-SQL to optimize some tasks.
Five general areas of SQL server work, blending a few of these areas together. I will say that I agree with this concept. the longer I am working on the SQL server platform, the more blended it has became. It’s a good thing for me, as a contractor, because it means that I’ve been down many different paths and gained experiences in wide range of areas. It’s also a good thing when it comes to job interview.
However, I always struggle with the question of “what title do I put on my resume”. When I was a full-time employee, I accepted the job title the company gave to me. Although essentially I’ve always been in SQL development, the job titles I received varied so much that people would think they are completely different jobs. Here are some of the job titles I had over the years: Data Analyst, Database Architect, Technical Advisor, Database Analyst, System Developer.
Now I am on my own, I no longer have a job title (It’s not entirely true. “Contingent worker” is usually attached to my email in the Global Address List.), until it comes to the resume time. What should I put in that blank space underneath my name? It’s been 6 years, and the situation is getting worse. It is panicky time even when I started to think about that blank space.
Fortunately the companies that hired me as a “Contingent worker” really don’t care about that blank space on my resume.
I’ll take that back. I never left that space blank. Here are some of the words I put in that space:
SQL Developer, Data Analyst, Database Analyst, ETL Developer
Maybe next time, I’ll consider leaving that space blank.