Mihai's BI blog: april 2013

onsdag den 24. april 2013

A non existing sale or a sale with a 0 revenue ?

Who does not want to distinguish between a non existing event - sale for instance - and an existing event with a 0 as a numeric result, a revenue due to an promotional discount, or an exceptional 0 cost, etc. ?

Your data path may look like :

Data Source -> Data Mart -> SSAS Cube -> Client Reporting tool (SSRS, Excel)

The absence of data has to be transferred on the whole path from the Data Source to the Client.
Simply put, do not make 0's or ' ' out of the absence of data neither in DataMart, SSAS cube or at the report layer.

Your 'false friends' in this process may be :

1) Lack of awareness of this issue for at least one person in the normal chain :
ETL developer -> SSAS developer -> Client developer.

2) Using functions such as isnull() or coalesce() in the ETL processes. I still found a misconception even among skilled database developers, that avoid null values at the datamart level, as they are 'difficult to tackle in t-sql calculations'.

3) The default SSAS Nullprocessing option at the measure level is set to Automatic. A better alternative is the Preserve in my opinion. You may recall, that Preserve will preserve the NULL value, the automatic is equivalent to ZeroorBlank option. It converts the null value to zero for numerical columns or to a blank string for text-based columns.

4) You are not safe at the client level. Your Excel olap pivot table may be set to show 0 for empty cells, or you may use SSRS expressions, even a simple division by 2 for fields values may accidentally create false 0's.

mandag den 15. april 2013

Exists, the MDX variant of 'Cogito ergo sum'

While reviewing or maintaning SSRS 2008/2012 cascading parameters reports based on SSAS olap cubes, I face quite often the lack of 'narrowing' of the values of param n, based on the values of the preceding 1...n-1 parameters.
Scenario : The reporting solution deals with customer sales on time perspective.
When selecting dates from a YMD hierarchy for instance, only customer with sales within the selected period of time must be retrieved and nothing else.
And if there are no customers, the user has to stop her instead of continuing to select values for the next parameters and experience an Empty dataset message at last.
Using the cascading parameters appropriately, can save your users a lot of frustrations.

The MDX template for defining the parameters may look like :

WITH MEMBER [Measures].[ParameterCaption] AS <<[Dimension].[attribute]>>.CURRENTMEMBER.MEMBER_CAPTION
MEMBER [Measures].[ParameterValue] AS <<[Dimension].[attribute]>>.CURRENTMEMBER.UNIQUENAME
MEMBER [Measures].[ParameterLevel] AS <<[Dimension].[attribute]>>.CURRENTMEMBER.LEVEL.ORDINAL
SELECT {[Measures].[ParameterCaption], [Measures].[ParameterValue], [Measures].[ParameterLevel]} ON COLUMNS ,
<<[Dimension].[attribute]>>.Members ON ROWS FROM
( SELECT ( STRTOSET(@param1, CONSTRAINED) ) ON COLUMNS FROM
( SELECT ( STRTOSET(@param2, CONSTRAINED) ) ON COLUMNS FROM
( SELECT ( STRTOSET(@param3, CONSTRAINED) ) ON COLUMNS FROM
......................
( SELECT ( STRTOSET(@param(n-1), CONSTRAINED) ) ON COLUMNS
FROM <<Cube>>))))... )

That will only work IRL, if @param1 ... @param(n) are sourced from attributes from the same dimension due to the 'autoexists' feature.
As soon as at least one paramter is based on another dimension, the 'narrowing' effect is gone.
To reinforce it, you may replace the line :
<<[Dimension].[attribute]>>.Members ON ROWS with an Exists construction, that at a basic level may look like :
Exists ( <<[Dimension].[attribute]>>.ALLMEMBERS , <<[Other Dimension].[Other attribute]>>.AllMembers , <<Measure Group related to the two dimensions>> )

The Members in the set { <<[Dimension].[attribute]>>.ALLMEMBERS } must be related to the members in the set { <<[Other Dimension].[Other attribute]>>.AllMembers } in the measure group <<Measure Group related to the two dimensions>> .