søndag den 16. oktober 2011

A new approach of building a dimensional data model

I was once presented an 'untraditional' approach of building a dimensional data model and I'll try to briefly introduce it to you.
I think it's easier to exemplify it with just a fact table Fact1 and a Date dimension but imagine the 'extension' applying for all your dimensions and fact tables.

1) Let's say you have a Date table with just one column [datetime] type [datetime] ( clustered index on it ) and a fact table Fact1 with two columns a DateKey and Measure1.

2) You create a Keys table with a statement like :
CREATE TABLE [dbo].[Keys](
[sk_key] [int]
IDENTITY(1,1) NOT NULL,
[Dimension] [nvarchar](50) NOT NULL,
[Attribute] [nvarchar] (50) NOT NULL,
[AttributeValue] [nvarchar] (250) NULL,
[ValidFrom] [int] NULL,
[ValidTo] [int] NULL
) ON [PRIMARY]


You create a clustered index covering all columns and an nonclusted index on sk_key.
The Keys table will contain all the attributes and their values for all dimensions and a track of their changes in order to support evt. SCD type2 dimensions.
You find a way - by writing a stored procedure -  to populate the keys table for all your dimensions and adapt your ETL flows to 'target'  this table.

3) Your dimensions will be based on views which basically join the 'real' table with the keys table or are entirely based on the 'real' table.
For the Date dimension the view may look like :

CREATE VIEW [dbo].[DateView] AS
SELECT cast (convert(nvarchar, [datetime], 112) as int) as DateKey,
convert (nvarchar, [datetime], 112) as DateName,
case when DATEPART(D, [datetime]) < 10 then '0' + DATENAME(D, [datetime])
else DATENAME(D, [datetime])
end
as DateShortName,
etc. , etc, etc......
from
dbo.Date

but for an Organisation dimension the statement may look like :
CREATE
view [dbo].[EmployeeView] as
select k1.sk_key as EmployeeKey, d.EmployeeName, d.EmployeeBnr, d.EmployeeTitle, k2.sk_key as EmployeeGroupKey, EmployeeGroup
from Employee d
inner join Keys k1
on d.EmployeeBnr=rtrim(k1.AttributeValue)
inner join Keys k2
on d.EmployeeGroup=rtrim(k2.AttributeValue)
where k1.Dimension='Employee' and k1.Attribute = 'EmployeeBnr'
and k1.Dimension = 'Employee' and k2.Attribute = 'EmployeeGroup'


4) You are almost done. You add all the views you created to the Data source view  - one for each 'dimension' table - DataView in our example- and the fact tables - Fact1 in our example - as well.

If you have any experience with this approach, balancing it's pros and cons, then I'd like to get
your feedback.

Ingen kommentarer:

Send en kommentar