SSIS Package Design


1

What is the best way to design an SSIS package? I'm loading multiple dimensions and facts as part of a project. Would it be better to:

  1. Have 1 package and 1 data flow with all data extract and load logic in 1 dataflow?
  2. Have 1 package and multiple data flows with each data flow taking on the logic for 1 dimension?
  3. Have 1 package per dimension and then a master package that calls them all?

After doing some research, options 2 and 3 appear to be the more viable ones. Do any experts out there want to share their experience or propose an alternative?

2009-06-16 20:14
by CTKeane


2

Microsoft's Project Real is an excellent example of many best practices:

  • Package Design and Config for Dimensional Modeling
  • Package logging
  • Partitioning

It's based on SQL Server 2005 but is very applicable to 2008. It supports your option #3.

2009-06-23 04:35
by Christian Loris
This is the design option we ended up using and the documentation is a great example of best practices - CTKeane 2009-06-23 20:28
I've found it the same. Best of luck - Christian Loris 2009-06-24 13:58


2

You could also consider having multiple packages called by a SQL Server Agent job.

2009-06-17 02:21
by John Saunders
+1 this way gives you much more flexibility (especially when it comes to errors/bugs) - dkarzon 2009-06-23 05:03


1

I would often go for option 3. This is the method used in the Kimball Microsoft Data Warehouse Toolkit book, which is worth a read.

http://www.amazon.co.uk/Microsoft-Data-Warehouse-Toolkit-Intelligence/dp/0471267155/ref=sr_1_1?ie=UTF8&s=books&qid=1245347732&sr=8-1

2009-06-18 17:56
by grapefruitmoon


0

I think the answer is not quite as clear cut ... In the same way that there is often no "best" design for a DWH, I think there is no one "best" package method.

It depends heavily on the number of dimensions, how many of them are related to each other, and the structure of the data in your staging area.

I quite like the Project Real approach (mentioned above); the package logging in particular was well done. I think I read somewhere that Denali (SQL 2011) will have SSIS logging/tracking built in, but I'm not sure of the details.

From a calling perspective, I would go for one SQL Agent job that calls a master package, which then calls all the child packages and manages the error handling, logic, emailing etc. between them, utilising log/error tables to track and manage the package flow. SSIS allows much more complex sets of logic than SQL Agent (e.g. call this child package if all of tasks A and B and C have finished and task D has not).
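To make the "call this child package if A, B and C have finished and D has not" idea concrete, here is a rough sketch of that control logic in Python. It is not SSIS code and does not use any SSIS API; `run_master`, `requires` and `excluded_by` are hypothetical names, each child package is stood in for by a plain function that returns True on success, and the returned list plays the role of the log table.

```python
from typing import Callable, Dict, List, Set, Tuple


def run_master(
    children: Dict[str, Callable[[], bool]],
    requires: Dict[str, Set[str]],
    excluded_by: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Run child 'packages' in dependency order and return a log of outcomes.

    requires[name]    -> children that must have succeeded before `name` runs
    excluded_by[name] -> children that must NOT have run for `name` to run
    (All names here are hypothetical; this only illustrates the flow logic.)
    """
    log: List[Tuple[str, str]] = []   # stands in for a log/error table
    succeeded: Set[str] = set()
    finished: Set[str] = set()
    pending = list(children)

    progressed = True
    while pending and progressed:
        progressed = False
        for name in list(pending):
            ready = requires.get(name, set()) <= succeeded
            blocked = bool(excluded_by.get(name, set()) & finished)
            if not ready or blocked:
                continue
            try:
                ok = children[name]()
            except Exception:
                ok = False            # treat an exception as a failed child
            log.append((name, "succeeded" if ok else "failed"))
            finished.add(name)
            if ok:
                succeeded.add(name)
            pending.remove(name)
            progressed = True

    # Anything still pending could never satisfy its conditions.
    log.extend((name, "skipped") for name in pending)
    return log


if __name__ == "__main__":
    packages = {name: (lambda: True) for name in ["A", "B", "C", "E"]}
    print(run_master(packages, {"E": {"A", "B", "C"}}, {"E": {"D"}}))
```

In a real master package the same conditions would be expressed with precedence constraints between Execute Package tasks, and the log would be written to a table rather than returned.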

Further, I would go for one package per snowflaked dimension, as usually one source table in the staging data will generate a number of snowflaked dimensions (e.g. DimProduct, DimProductCategory, DimProductSubCategory). It would make sense to have the data read in once in one data flow task (DFT) and written out to multiple tables. I would use one container per dimension for separation of logic.
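As a rough illustration of "read once, write to multiple tables", the sketch below makes a single pass over some staging rows and builds the three snowflaked dimension tables together. It is plain Python rather than an SSIS data flow, and the column names (`category`, `subcategory`, `product`) and the surrogate key scheme are assumptions made up for the example.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def split_product_dimensions(
    staging_rows: Iterable[Dict[str, str]],
) -> Tuple[List[Row], List[Row], List[Row]]:
    """One pass over staging data, three output tables.

    Returns (DimProductCategory, DimProductSubCategory, DimProduct) as lists
    of dicts. Keys are simple running integers assigned on first sight.
    """
    category_keys: Dict[str, int] = {}
    subcategory_keys: Dict[Tuple[str, str], int] = {}
    dim_category: List[Row] = []
    dim_subcategory: List[Row] = []
    dim_product: List[Row] = []

    for row in staging_rows:
        cat = row["category"]
        if cat not in category_keys:
            category_keys[cat] = len(category_keys) + 1
            dim_category.append(
                {"CategoryKey": category_keys[cat], "CategoryName": cat}
            )

        # A subcategory is identified by (category, subcategory) so that the
        # same subcategory name under two categories gets two keys.
        sub_id = (cat, row["subcategory"])
        if sub_id not in subcategory_keys:
            subcategory_keys[sub_id] = len(subcategory_keys) + 1
            dim_subcategory.append(
                {
                    "SubCategoryKey": subcategory_keys[sub_id],
                    "SubCategoryName": row["subcategory"],
                    "CategoryKey": category_keys[cat],
                }
            )

        dim_product.append(
            {
                "ProductKey": len(dim_product) + 1,
                "ProductName": row["product"],
                "SubCategoryKey": subcategory_keys[sub_id],
            }
        )

    return dim_category, dim_subcategory, dim_product
```

In SSIS the equivalent would typically be one source feeding a Multicast (or a chain of lookups) with one destination per dimension table, which is why keeping the related dimensions in one package avoids reading the staging table three times.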

2011-02-21 16:09
by Marcus D