
Posts Tagged ‘fact table’

SSAS #20 – When Analysis Services Meets Integration Services

I recently designed a cube in Analysis Services from scratch. After some brief research into the different ways to process Analysis Services objects, some of the questions I had finally have clear answers.

If you have been working on Microsoft’s BI platform, you know that SSAS and SSIS actually meet quite often. Did you design SSIS packages for fact table ETL and dimension ETL? Yes, you did.

The real integration points between SSIS and SSAS, though, involve SSAS cubes, where we need SSIS to process data from the relational database sources into the OLAP cube structure.

A big portion of SSAS integration with SSIS involves processing data and managing measure group partitions. In SSIS, Microsoft has provided us with some basic out-of-box cube processing capabilities, as well as tools for handling more complicated situations. I’ll show you how I used the basic out-of-box tasks in SSIS to process my cube.

Before I jump into the details, I want to ask you two questions. One, what are some of the basic objects in SSAS? Two, what are the main types of SSAS objects that must be processed? I’ll blog about these two questions later.

3 Out-of-Box Methods to Process SSAS Objects in SSIS

1. Using the control flow object Analysis Services Processing Task

To me, this is the most straightforward approach, and it is the method I used. The example I’ll show you will also make the drawbacks of this approach clear.


2. Using the control flow object Analysis Services Execute DDL Task     

This object’s functionality goes beyond just Data Definition Language (DDL). It can run an SSAS XMLA script, which can not only execute DDL but also query and process objects. Because the XMLA script can be modified in the SSIS package before it is executed, this object is actually very powerful for processing SSAS objects dynamically, without the drawbacks of hard-coding the SSAS object names (a sketch of such a script follows this list).

3. Two data flow destinations: the Dimension Processing destination and the Partition Processing destination

These two objects allow data from the pipeline to be pushed directly into a dimension or a partition. This is the only method where data is pushed directly to SSAS; the two methods above essentially tell SSAS to start pulling data from a data source.
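To make method 2 concrete, here is a minimal sketch of the kind of XMLA script the Execute DDL Task can execute. The database and cube IDs are hypothetical placeholders; the script assumes we want a full process of a single cube:

```xml
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process>
      <Object>
        <!-- Hypothetical IDs; replace with your own database and cube -->
        <DatabaseID>MyOlapDatabase</DatabaseID>
        <CubeID>MyCube</CubeID>
      </Object>
      <Type>ProcessFull</Type>
    </Process>
  </Parallel>
</Batch>
```

Because the script is just a string property on the task, an SSIS expression or a Script Task can rewrite the DatabaseID, CubeID, or process type at run time, which is where the dynamic processing comes from.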

Using the control flow object Analysis Services Processing Task to process SSAS objects

This approach is pretty straightforward. I’ll walk through the details in my next post.

Introduction to Employee Compensation and Productivity Reporting

March 16, 2011

I came back to my old job, but I am not working on exactly the same projects. I am very happy to have new puzzles to solve now.

Obviously, employee compensation and productivity reporting has become a hot topic in the business recently.

I am going to blog on this subject as I come to understand more about the business needs.

After completing several ETL processes and reports, I am gradually starting to put the puzzle together.

In its simplest form, I can envision a subset of the data warehouse bus matrix. This matrix will evolve as I understand more.

[Figure: a first draft of the data warehouse bus matrix subset]

The dimensions will involve hierarchies. All the facts basically come from one transactional table, but as we put together the fact tables for reporting, we will end up with all three types of fact tables: transactional fact tables, periodic snapshot fact tables, and accumulating snapshot fact tables. The third type will be the most common for us.

The natural granularity for our business transactional fact table will be one row for each action a team member takes on each account. For the purpose of employee compensation and productivity reporting, the facts associated with the business process typically just include the counts of the actions, such as the count of all “declines” or the count of all “approvals”.
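As a minimal sketch of what reporting against this grain looks like (the table and column names here are hypothetical), a productivity measure is essentially a count over the action rows:

```sql
-- Hypothetical transactional fact table: one row per action,
-- per team member, per account.
SELECT
    EmployeeLoginCode,
    ActionType,                     -- e.g. 'Decline', 'Approval'
    COUNT(*) AS ActionCount
FROM dbo.FactAccountAction
WHERE TransactionDate >= '20110301'
  AND TransactionDate <  '20110401'
GROUP BY EmployeeLoginCode, ActionType;
```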

In the next blog post, I’ll talk about whether I need to normalize my fact tables by fact type.

Define the grain of fact tables with unique key constraints

March 13, 2011

I have been working with SQL Server, Integration Services, and Reporting Services for years. Data from our data warehouse in DB2 is ETL’d to SQL Server, ready to be consumed by SSRS reports. Until recently, I had only been using unique key constraints on the fact tables sparsely. Now I’ve started to apply them to all the fact tables and dimension tables.

Of the three basic fact table types (transactional, periodic snapshot, and accumulating snapshot), we most often deal with the first, the transactional fact table, which holds data at the most detailed level. Generally speaking, this type of fact table has the largest number of dimensions associated with it. Fortunately, it’s not that many in the fact tables I deal with. The most common grain involves account number, transaction date, employee login code, type of product, and type of action (this last one can sometimes grow to a few). If it’s an accumulating snapshot fact table, several milestone dates can also be part of the grain.

OK, back to creating the unique key constraints. Most of them will simply be a composite key composed of all the dimensions in the grain. The grain of a fact table represents the most atomic level at which the facts may be uniquely defined. There may also be some dimensions (such as the employee’s location) that do not contribute to the uniqueness of the fact records; do not include those dimensions in the unique composite key.
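As a minimal sketch of such a constraint under the grain described above (all table and column names are hypothetical), the composite unique key covers only the grain dimensions:

```sql
-- Hypothetical transactional fact table; the unique constraint
-- enforces the declared grain.
CREATE TABLE dbo.FactAccountAction (
    AccountNumber       VARCHAR(20) NOT NULL,
    TransactionDate     DATE        NOT NULL,
    EmployeeLoginCode   VARCHAR(20) NOT NULL,
    ProductTypeKey      INT         NOT NULL,
    ActionTypeKey       INT         NOT NULL,
    EmployeeLocationKey INT         NOT NULL,  -- descriptive; not part of the grain
    CONSTRAINT UQ_FactAccountAction_Grain
        UNIQUE (AccountNumber, TransactionDate, EmployeeLoginCode,
                ProductTypeKey, ActionTypeKey)  -- location key deliberately excluded
);
```

If the ETL ever loads two rows with the same combination of grain values, the insert fails immediately, which is exactly the early warning you want when the declared grain turns out to be wrong.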
