Warehouse Design Tip / Partitioning

Introduction

This topic seems so simple, and so obvious to me, and do not mean to insult anyone’s intelligence.  However, it is shocking to me the number of people that I have talked to (running large databases) that are not taking advantage  of one of the best performance, and manageability enhancements out there!  Think about the classic phone-book example everyone uses for index’s…. but broaden the example a little.  Consider you have a phone book from NY City, LA, Miami, and Chicago all combined into a single book…. even well indexed it is going to a little bit of effort to find all of the records for a given name in a given city.  Now if you partition that giant book on city, you are not even going to go look in the books that you’re not interested in. …OR… lets say you wanted to find a given name in all four cities, if you have one giant book… you will start looking… if you have the book partitioned into separate books, you could hand one book to three of your friends (threads) and each of you start looking in parallel… clearly divide and concur will get you the answer much more quickly.  

 In a warehouse environment I suggest finding some sort of logical annual partition.  It may be a calendar year, it may be a fiscal year, it may be some annual business cycle.  Make sure what you pick will make sense to your users (to make use if this power they need to include it as filter/join criteria in their queries).  If you pick an annual partition based on calendar year, however the business data is always looked at based on a fiscal year ending in July, even if the queries are well written to filter on the year, you will always use 2 partitions to answer a question because the fiscal year splits 2 calendar years.  

Example

 You can use the wizard in SSMS, or scripts either way you will need to create a partition function  

CREATE PARTITION FUNCTION [year_PF1](INT)
   AS range LEFT
   FOR VALUES (2005, 2006, 2007, 2008,
               2009, 2010, 2011, 2012,
               2013, 2014)  

Once you have the partition function, you will need the partition scheme to be associated to the files groups using a partition scheme  

CREATE PARTITION SCHEME [year_PS1]
   AS PARTITION [year_PF1] TO
   ([data_2005], [data_2006],
    [data_2007], [data_2008],
    [data_2009], [data_2010],
    [data_2011], [data_2012],
    [data_2013], [data_2014],
    [data_2014])  

Now you can create a table on the partition scheme allowing all data to be grouped together by year. The key portion of this statement is the “on” at the end of the table create statement.  

CREATE TABLE partition_example
   (      
      example_year  INT,
       fname         VARCHAR(100),
       lname         VARCHAR(100),
       rec_create_dt DATETIME,
       rec_mod_dt    DATETIME,
       audit_key     INT
   )  ON year_ps1(example_year)  

   

Conclusion

This is an overly simple example, however, please do yourself the favor and understand how partitioning works.  See my prior discussions on disk management to see how you can simply move data files from one drive to another, as well as the great optimization, and parallelism performance improvements you will see.

Tags: , ,

One Comment on “Warehouse Design Tip / Partitioning”

  1. kmescha June 22, 2010 at 5:03 pm #

    Nice job making this very simple to understand.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: