SSAS has two major functions:
1. a high-performance cubing technology, and
2. a data-mining technology
Here are the steps you can take to create an SSAS database using BIDS (Business Intelligence Development Studio).
1. Create an SSAS project in BIDS.
2. Create a data source (DS) that connects to your star-schema data warehouse.
3. Create a data source view (DSV) to act as a layer between your project and the underlying data source, insulating you from schema changes.
4. Run the Cube Wizard to create the first cube and dimensions.
5. Finally, modify the cube and dimensions so they scale better and are properly formatted.
A data warehouse is made of fact and dimension tables.
- Fact tables: hold measures, the numeric columns that you want to put on a report.
- Dimension tables: hold the columns by which you want to filter the data.
SSAS adopts many of the same items but has changed their names.
- Measure: an individual numeric column in a fact table.
- Measure Group: in SSAS, a measure group is the collection of measures from a single fact table.
- Attributes: in SSAS, attributes are the columns of a dimension.
- Members: the unique values within an attribute. When SSAS processes a dimension, it queries the underlying dimension table in the data warehouse to retrieve all the distinct values on an attribute-by-attribute basis; these distinct values are called members.
- User Hierarchy: If users consistently view the data in a given way in a dimension, we can create a user hierarchy. These hierarchies give the user the means to find the data quickly. Most date dimensions have a hierarchy for Year –> Quarter –> Month –> Date. Creating the hierarchy allows the user to drag over a single element that contains all those attributes.
- Attribute Relationships: Attribute relationships tie multiple attributes in a single SSAS dimension together, much as foreign keys tie various tables together in SQL Server. This helps with the cube’s performance in query time and processing time. It also helps with security and user experience.
- Cube: the intersections of dimensions; aggregations are pre-computed and stored in the cube.
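To make the terminology concrete, here is a minimal T-SQL sketch of a star schema that SSAS could build a cube from. All table and column names are hypothetical, not from the workshop:

```sql
-- Hypothetical date dimension. Its Year/Quarter/Month columns are the
-- attributes behind a Year -> Quarter -> Month -> Date user hierarchy,
-- and the distinct values in each column are that attribute's members.
CREATE TABLE DimDate (
    DateKey         int         NOT NULL PRIMARY KEY,  -- e.g. 20240115
    FullDate        date        NOT NULL,
    CalendarYear    smallint    NOT NULL,
    CalendarQuarter tinyint     NOT NULL,
    MonthName       varchar(10) NOT NULL
);

CREATE TABLE DimProduct (
    ProductKey  int         NOT NULL PRIMARY KEY,
    ProductName varchar(50) NOT NULL,   -- an attribute
    Category    varchar(30) NOT NULL    -- another attribute
);

-- Fact table: foreign keys to the dimensions plus numeric measures.
-- In SSAS, SalesAmount and OrderQuantity become measures, and together
-- they form one measure group (one per fact table).
CREATE TABLE FactSales (
    DateKey       int   NOT NULL REFERENCES DimDate (DateKey),
    ProductKey    int   NOT NULL REFERENCES DimProduct (ProductKey),
    SalesAmount   money NOT NULL,
    OrderQuantity int   NOT NULL
);
```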
Assuming we've now loaded our data warehouse, the rest of my notes are about SSAS.
- SSAS is a service that pre-aggregates data, making it faster to query. According to the workshop presenter, it took only 2 seconds to query 24 terabytes of data (in the grocery industry).
- Reduce report backlogs because SSAS allows your users to create complex ad hoc reports without having to involve IT.
- With SSAS, users can create reports via familiar interfaces such as Excel or Reporting Services, with drag and drop.
- With SSAS, all the reporting logic is centralized.
- From a developer's perspective, SSAS means no more bad SQL queries and no more waiting two minutes for a report.
These are my second set of notes on fact table modeling: tips to improve performance and scalability.
Fact tables are often designed to be index-light, meaning that indexes should be created only to support reporting and cube processing.
Fact tables are often much larger in row count and data volume than your dimensions.
- Implementing table partitioning: table partitioning is very effective for large fact tables like these. A sliding-window partition scheme, where you periodically roll off old partitions and roll on new ones, can drive IO and query times down. Data loading and cube processing will also be faster, since SQL Server 2008 and 2008 R2 allow for increased parallelization of queries across partitions.
- Configuring and implementing data compression in SQL Server 2008 and 2008 R2.
- Using physical FK constraints to enforce relationships. They add a minimal amount of overhead to data loading and heavy-duty reporting, but in exchange Analysis Services can optimize its processing: because it can assume these keys are keeping the data valid, it disables some internal checks built into its processing algorithm. This allows Analysis Services to consume the data from the fact and dimension tables much faster than if it didn't have those keys.
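The three tips above can be sketched in T-SQL. This is a rough illustration, assuming a hypothetical FactSales fact table keyed by an integer DateKey; the object names and boundary values are made up:

```sql
-- 1. Sliding-window partitioning: one partition per month on DateKey.
CREATE PARTITION FUNCTION pfMonthly (int)
    AS RANGE RIGHT FOR VALUES (20240101, 20240201, 20240301);

CREATE PARTITION SCHEME psMonthly
    AS PARTITION pfMonthly ALL TO ([PRIMARY]);
-- The fact table would be created ON psMonthly (DateKey); to slide the
-- window, SPLIT a new boundary on and SWITCH/MERGE the oldest one off.

-- 2. Data compression (SQL Server 2008 and later).
ALTER TABLE FactSales REBUILD WITH (DATA_COMPRESSION = PAGE);

-- 3. A physical, trusted FK so Analysis Services can assume the data
--    is valid and skip some of its internal consistency checks.
ALTER TABLE FactSales WITH CHECK
    ADD CONSTRAINT FK_FactSales_DimDate
    FOREIGN KEY (DateKey) REFERENCES DimDate (DateKey);
```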
Finally, here are my notes on fact table modeling.
- A fact table is the center of the star schema.
- It contains two primary types of data: key columns and data columns.
- The fact table holds the foreign keys that relate it to the needed dimension tables.
- Many OLAP tools, such as Analysis Services, look for a star schema model and are optimized to work with it, so the fact table is a critical piece of the puzzle.
Process of designing fact table:
- Decide on the data to analyze
- Pick the level of granularity. In practice, your choice of dimension tables has largely determined the granularity of your fact table.
- Decide how you will load the fact table. A transactional fact table loads transactions from the OLTP system at intervals. A snapshot fact table loads all the rows of inventory/snapshot-style data for the day, allowing users to see the current status based on the date the information was loaded.
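The two loading styles in the last step can be sketched in T-SQL. The table names here (FactSales, FactInventorySnapshot, and the staging tables) are hypothetical:

```sql
-- Transactional fact table: append only the new transactions each interval.
INSERT INTO FactSales (DateKey, ProductKey, SalesAmount, OrderQuantity)
SELECT DateKey, ProductKey, SalesAmount, OrderQuantity
FROM StagingSales;   -- staging holds only rows extracted since the last load

-- Snapshot fact table: load the full picture every day, stamped with the
-- load date, so users can see status as of any snapshot date.
INSERT INTO FactInventorySnapshot (SnapshotDateKey, ProductKey, QuantityOnHand)
SELECT 20240115, ProductKey, QuantityOnHand   -- today's date key
FROM StagingInventory;
```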
These are my last notes, on dimension modeling.
- Avoid unnecessary normalization. In a star schema, the practice of normalizing is called snowflaking. It is usually unnecessary, unless your dimension table is so large that it needs to be physically separated on the storage array.
- Use CamelCasing: capitalize each word, with no spaces or underscores. CamelCasing will help streamline your work in SSAS.
- Begin your dimension table names with Dim. This allows for better sorting of dimension and fact tables.
- Don't over-index. Until you know what queries you'll be running against the tables, don't assume you need indexes on all the columns.
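A dimension table following these conventions might look like the sketch below (all names are hypothetical): CamelCased columns, a Dim prefix, and the region denormalized into the table rather than snowflaked out into its own DimRegion:

```sql
-- Denormalized (non-snowflaked) store dimension.
CREATE TABLE DimStore (
    StoreKey    int         NOT NULL PRIMARY KEY,  -- surrogate key
    StoreName   varchar(50) NOT NULL,
    RegionName  varchar(30) NOT NULL,  -- denormalized: no separate DimRegion
    CountryName varchar(30) NOT NULL
);
-- No extra indexes yet: add them once the query workload is known.
```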