I came across Vincent Rainardi’s blog, and thought I’d share it with the readers.
I’d also like to thank Vincent for writing up the review so quickly.
I browsed through Vincent’s blog, and found that he authored this book:
Many reviews praised that it is an excellent book on data warehousing that combines simple and practical explanations and "how to" for all the key aspects of building a data warehouse on SQL Server.
When tuning SQL queries, I find that I use the following three SET options quite often.
set showplan_xml on
When showplan_xml option is ON for a session, the execution of SQL queries in the same session will only show the estimated execution plan. No query results will be generated.
Here is an example. The query is executed in AdventureWorksDW2012.
What you will get is the XML presentation of the estimated execution plan.
Most often, I’d just click on the XML link, and look at the graphical version of the execution plan.
Once in the graphical editor of the XML execution plan, I can right-click on the surface, and have options to either save the execution plan as a *.sqlplan file, or to show the actual XML code in the XML editor. The .sqlplan file can be attached to emails and sent to anybody who can help with the tuning. SQL Server Management Studio can read .sqlplan file in both graphical mode and XML mode.
This is part of the XML code. Although most people prefer to read execution plans in he graphical mode, the XML version can be handy if you want to do some text searching.
set statistics xml on
When statistics xml option is ON for a session, the execution of SQL queries in the same session will show query results and the actual execution plan.
Now try to set the statistics xml on and execution the query shown in the following screenshot.
Here are the query result and also the actual execution plan you will get.
Since I prefer coding everything in T-SQL, using the SET commands is my natural choice. If you prefer using the graphical tools, you can access both options in SQL Server Management Studio. From the menu Query, you can toggle the options on/off.
set statistics io on
When statistics io option is ON for a session, the execution of SQL queries in the same session will show query results and in the Message tab, it will show the IO statistics.
Using the same query, here is the IO stats you will see.
SQL query tuning is more than just knowing how to get the execution plans, but it’s a start. Enjoy reading the execution plan.
A co-worker recently asked me why she would get timeout when updating a column in a big table. This column will be updated to 0 (zero) because a default value of 0 (zero) constrain is now added to the table.
The table has over 40 million records. Although I didn’t ask the exact size in MB, I am pretty sure that a table with over 40 million records is not a small table anymore.
I had a blog a few weeks ago that talked about 4 different scenarios when deleting or updating data on a large table.
I didn’t refer my co-worker to the above blog because the blog is really too long to read in one minute and many scenarios are not applicable to the issue. Scenario # 3 in the above blog “Updating a Nonpartitioned Table – Switch with a Temp Table” is the closest scenario to the issue my co-worker is trying to resolve.
So I summarized a few steps for my co-workers. When we try to “update” a big table, we would actually avoid using the UPDATE command directly on the main table. We want to take advantage of the bulk load optimizations in T-SQL with the INSERT … SELECT method or the SELECT … INTO method.
Update a large table
“Updating”a large table may be time-consuming.
Here is a summary of steps that use the INSERT … SELECT method:
1. Create a temporary table with the same table schema.
2. Copy all the data from the actual table to the temporary table using INSERT … SELECT method. Make sure that the column that needs to be updated to 0 (zero) has value 0 (zero) in the SELECT clause.
3. Drop the actual table. If the large table has constraints and indexes, make sure that you have the scripts to re-create them.
4. Rename the temporary table to the name of the actual table. Recreate any constraints and indexes.
I like the SELECT … INTO method. So here are the steps:
1. Create a temporary table with the same table schema using the SELECT … INTO method. The data is also copied to the temporary table at the same table the temporary table is created. Make sure that the column that needs to be updated to 0 (zero) has value 0 (zero) in the SELECT clause.
2. Truncate the actual table.
3. Use INSERT … SELECT method to insert the data from the temporary table to the actual table.
4. Drop the temporary table
My-coworker took my suggestion and eventually “updated” the column with all 0 (zero) value.
But it puzzled me that how she got the timeout error. “Updating” a big table might take a long time, but where is the timeout from?
Then it dawned on me that she is not “updating” the table using T-SQL, rather modifying the big table by using the table designer in SQL Server Management Studio.
Then I found this Microsoft article.
Although I prefer coding everything in T-SQL, there are programmers who use SQL graphical tools instead. Information from this article can be helpful.
Here is a summary from the article.
When you try to modify a large table by using the table designer in Microsoft SQL Server Management Studio, you may receive an error message that is similar to the following:
Unable to modify table.
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
This behavior occurs because of the transaction time-out setting for the table designer and for the database designer in SQL Server Management Studio. By default, this setting is 30 seconds.
From the Tools à Options à Designers à Table and Database Designers, you can change this setting ‘Transaction time-out after’ to a higher number.
Notice that this setting differs from the setting in the Execution time-out box in SQL Server Management Studio. By default, the setting in the Execution time-out box for Query Editor in SQL Server Management Studio is zero. Therefore, Query Editor waits infinitely for the query to finish and never times out.
To resolve this by-design behavior, use one of the following methods:
- Click to clear the Override connection string time-out value for table designer updates check box for the table designer and for the database designer in SQL Server Management Studio. See the option in the first screenshot.
- Specify a high setting in the Transaction time-out after box for the table designer and for the database designer in SQL Server Management Studio. Also see the option in the first screenshot.
- Modify the large table by using Transact-SQL statements in Query Editor in SQL Server Management Studio. This is my recommendation to my co-worker. Make sure you don’t use the UPDATE statement directly on the large table, use a temporary table and take advantage of the bulk load optimizations in T-SQL with the INSERT … SELECT method or the SELECT … INTO method.
Now I wonder how my co-worker reacted when I sent her the T-SQL script to “update” the table to resolve the time-out issue.