Archive

Posts Tagged ‘Cartesian Product’

MDX # 40 – CROSS JOIN does not return Cartesian product of two hierarchies from the same dimension

April 15, 2014 2 comments

Correction

4/25/2014

Thanks to Chris Webb’s comment (see the comment section). I am making a correction on this blog.

The title of this blog contains a few important words, “from the same dimension”. You might have guessed the opposite side of it is hierarchies “from different dimensions”.

But my correction is not about this. In trying to explain why CROSS JOIN in MDX does not return Cartesian product of two hierarchies from the same dimension, I originally stated:

With the CROSSJOIN operation in MDX, we get only the combinations that actually have fact data. We now know that out of the 10 colors of Bib-Shorts only the Multi color Bib-Shorts has sales data (fact data).

Unlike the SQL CROSS JOIN, where we had to join one fact table and two dimension tables to find out which color of bib-shorts have sales, in MDX, the CROSSJOIN function automatically took into the consideration of the fact table. I have to say that the twists and turns are for the best. 

I highlighted “fact table” above because that is what made my statement incorrect in the context of two hierarchies from the same dimension. When we crossjoin two sets of members from hierarchies from the same dimension, the fact table(s) does not come into play, rather, it’s the validate combinations from the dimension table(s) that will determine the resulting tuples.  In this case, the Multi color is the only color for the Bib-Shorts. This can be verified by joining the two dimension tables, DimProduct and DimProductSubcategory. Here is the SQL query and the result.

Multi-Bib-shorts-correction

 

 

 

 

 

Here is the original post.

Cartesian product can be illustrated by a Cartesian square

Most SQL developers are familiar with the concept of CROSS JOIN and Cartesian product. Cartesian product is the result that is produced from a CROSS JOIN.

In this Wikipedia article, Cartesian product, a Cartesian product is illustrated by a Cartesian square as shown below.

Cartesian Square

 

 

 

 

 

 

 

In SQL Server, CROSS JOIN returns the Cartesian product of rows from tables in the join

Both of the following articles claimed that the SQL CROSS JOIN produces a result set which is the number of rows in the first table multiplied by the number of rows in the second table, if no WHERE clause is used along with CROSS JOIN. This kind of result is called as Cartesian Product.

The above definition of the SQL CROSS JOIN can be illustrated with the following SQL script using the AdventureWorksDW database.

use AdventureWorksDW
go

with color as (
select      distinct Color
from  dbo.DimProduct
),

sub as (
select      distinct EnglishProductSubcategoryName
from  dbo.DimProductSubcategory
whereEnglishProductSubcategoryName = ‘Bib-Shorts’
)

select      *
from  sub CROSS JOIN color
order by
EnglishProductSubcategoryName, Color
–10

To simplify the result set, I am using only one product subcateogry ‘Bib-Shorts’. The above SQL script generates the results shown in the screenshot below. With 1 product category and 10 product colors, the result set is the every possible combination of these two lists, which is 10 (1 * 10) rows.

Multi-Bib-shorts

 

 

 

 

 

 

 

 

 

 

Keep in mind that the above 10 combinations of the colors and the subcategory do not necessarily have any sales. In order to find witch combinations have sales, we will need to do an INNER JOIN on these 3 tables:

dbo.FactResellerSales or dbo.FactInternetSales
dbo.DimProductSubcategory
dbo.DimProduct

The following 2 SQL queries can be used. It turned out that there is no internet sales for multi color bib-shorts, but there are reseller sales for them.

— There is no Internet Sales for Multi color Bib-Shorts
select      *
from  dbo.FactInternetSales fact join dbo.DimProduct prod
on          fact.ProductKey = prod.ProductKey
join  dbo.DimProductSubcategory sub
on          prod.ProductSubcategoryKey = sub.ProductSubcategoryKey
wheresub.EnglishProductSubcategoryName = ‘Bib-Shorts’
and         prod.Color = ‘Multi’
–0

— There is 756 Reseller Sales for Multi color Bib-Shorts
select      *
from  dbo.FactResellerSales fact join dbo.DimProduct prod
on          fact.ProductKey = prod.ProductKey
join  dbo.DimProductSubcategory sub
on          prod.ProductSubcategoryKey = sub.ProductSubcategoryKey
wheresub.EnglishProductSubcategoryName = ‘Bib-Shorts’
and         prod.Color = ‘Multi’
–756

In MDX, CROSS JOIN does not return the Cartesian product members from hierarchies

Enter the world of MDX. CROSS JOIN is an important operation in MDX, with some twists and turns.

The MDX query below is the equivalent of the SQL CROSS JOIN in the previous section. However, the result set contains only one row.

select
{ } on 0,
{ CROSSJOIN(
[Product].[Subcategory].[Subcategory].[Bib-Shorts],
[Product].[Color].[Color].members
)
} on 1
from
[Adventure Works]

Multi-Bib-shorts Crossjoin

 

 

 

 

 

With the CROSSJOIN operation in MDX, we get only the combinations that actually have fact data. We now know that out of the 10 colors of Bib-Shorts only the Multi color Bib-Shorts has sales data (fact data).

Unlike the SQL CROSS JOIN, where we had to join one fact table and two dimension tables to find out which color of bib-shorts have sales, in MDX, the CROSSJOIN function automatically took into the consideration of the fact table. I have to say that the twists and turns are for the best.

Please refer to the section ‘Finding related members in the same dimension“ in Chapter 5 of our book, MDX with SSAS 2012, to gain more knowledge on this topic and how it is related to the concept of auto-exists and the EXISTS and EXISTING function.

Three different ways to do crossjoining

In MDX, you can cross join different hierarchies using three different ways. Chris Webb has a blog here, The rather-too-many ways to crossjoin in MDX, which is better than anything I can write about CROSS JOIN.

 

T-SQL #46 – Generate a date series using a recursive CTE

April 20, 2011 Leave a comment

I recently had a need to use recursive Common Table Expression (CTE)  query to meet a real business requirement.

Before I get into my real business situation, I want to blog about a couple of simple examples of using recursive CTE I found on beyondrelational.com (Google tsql challenges). I don’t who those people are, and how they collaborate on their work of providing those T-SQL challenges and picking winners for best solutions, but they are doing a fantastic job of promoting SET based query writing skills. According to this article T-SQL Challenge #1 on SQLServerCentral.com, TSQL Challenge Committee is a group of SQL Server experts from around the world. Many of them are SQL Server MVPs, some are SQL Server consultants and others are developers working with SQL Server for a number of years.

Anyway, here is the one simple example from beyondrelational.com of generating a date series using a recursive CTE. I put my comments here to make a couple of points.

image   

1) Using UNION ALL and refer to the CTE makes the query recursive.

2) It’s important to know when to terminate the recursive query. The WHERE clause here serves the purpose of terminating the recursive part by setting a upper limit for the date.

If you are new to writing SQL queries, this might not make much sense to you. You might wonder why cannot you use a cursor or a WHILE loop. Well, the purpose here is to write SET based query rather than record-by-record sequential “program”.

Even a simple example like this has real life usage. You can use this date series to do a Cross Joint (Cartesian Product) with a dimension table, such as Product, to create a product catalog.

%d bloggers like this: