SSIS provides a robust method for making packages portable between environments such as Development –> Testing –> Production. This post will hopefully get you started with the incredible number of things you can control, scale, and make your packages portable. I am focusing mainly on password management, however, you will also see a variety of other items that can be set in configuration files. There are two places you can store your configuration data, first being in the database, the second being an xml file. There are advantages to each, I have chosen to use an XML file to isolate all of the password info into the file, and not rely on either windows authentication, or keeping a password in the package to obtain the remaining information from the database.
I have chosen to have a config file for each package. When I began this with SSIS-2005 you could not have connections in an XML file that were not used, so a single all-encompasing file was simply not possible. With SSIS-2008 (sp1) *prior to sp1 there are still issues with this erroring at runtime* you are able to create a single file with all of the connections, even if they are not used in that particular package. You will get a warning that there are connections that are not used, but the package will execute. Even though this functionality is now supported, I would still recommend a file for each package, this will allow for better control over some of the other items you will be setting in a config. For example, if you want to change the “defaultbuffermaxrows” for a dataflow (which you will), you need to be sure that all of your dataflows have unique names if you use a common xml file, additionally, you can impact other packages if you make a syntax error when you make changes. One xml file for each package allows for flexibility, isolation, and risk reduction. That is a good trade for the VERY small amount of extra work to create the individual files.
Make it all work (Connections)
If you have read any of my other posts, you will see that I am big on naming conventions, this is no exception. Do yourself a HUGE favor, and come up with a standard, and PLEASE don’t include the server names in the name of the connection. You (might) have different server names in development –> test –> production. You (might) in development host a series of source databases in a single instance that in production are on different machines etc, please keep this generic.
I have used:
type = Ado/Native/ODBC/File whatever your connecting to
server = logical server name such as dw_db (Data Warehouse Database) dw_etl (Data Warehouse ETL server) these might equate to DB links that you should keep generic as well.
Database = What your connecting to
user = who am I connecting as
This will allow you to understand everything about a connection without making assumptions. This is good for both the connection manager, but more importantly when you are editing your XML file later.
At the package level, because I have chosen to host ALL of my password information in the xml file, I set the protectionlevel to ‘DontSaveSensitive’. The default of ‘EncryptSensitivewithuserkey’ gets people into trouble…. if you imbeded the password into the connection with that setting… YOUR windows domain account is the ONLY account that can access that password information. When you migrate to production, and execute the package from a Job (unless your account is running the database service) the job will fail because the password information is encrypted and cannot be accessed by that account.
Right-click in the empty space on the control-flow surface, and select package configurations…. you will get a wizard to walk through the rest of the steps to build a config file.
This screen you select the XML (or database) configuration, and the file-name (or variable) for the configuration file. I have selected “N” as I have a local drive on my ETL server that the configurations are stored on. This allows the configurations to be stored on the server with the production passwords, and the ONLY access to those files would be the network-admin that has access to the server. Since my environments are consistent, the N drive is available I chose this method. You can setup an environmental variable on the server that you could set to “N” on one machine and “X” on another, for my environment, the option shown works best for me.
Next, you can select what items are included in the configuration file, I simply select everything for the connection managers, this will allow you to configure your connection strings, passwords, user-id’s, server-names, everything you need for your connections.
Spend some time looking through the list of things you can customize (there is A LOT) and you can get pretty creative if you need to. The other thing that I do is for each data-flow, I select the “buffertempstoragepath”, “Defaultbuffersmaxrows”, “defaultbuffersize”. (There are a few others that may make sense….but for now lets keep it simple). As you move from Development (which in my case is a desktop machine), to a testing environment (which is not sized for speed/performance), to production these values will change. My default buffer settings are MUCH smaller on my dev machine than they are in production. Another post, for another day I will talk about the tuning opportunities and how to calculate all this stuff…. but if you include it here, your package can automatically scale to take advantage of the hardware you are deploying it to.
Once you are done with this, you can name the config if you want to support environments for package generation, environment builds etc, you can leave it as configuration 1 to start with. This will depend on how you deploy packages, and if you need to use the package deployment utilities, or if you manually deploy packages to each environment (which is what I do).
Now that you have an XML file…
- Open it in BIDS, select Edit/Advanced/Format Document
- Find the configuration strings for your connections (you will see sections that contain the names you have in the connection manager)
- Add Password=xxxxxxxx; (whatever your password is for the environment you are going to move this configuration to) for each of the connection strings
- Set the User ID=XXXXX; to the appropriate user-id (I have different users in dev/test/prod)
- Set the Data Source=xxxxxx; to the appropriate server
- In the same section, you will see additional properties
- Properties[Password] (set the configured value)
- Properties[ServerName] (set the configured value)
- Properties[UserName] (set the configured value)
- Set the values you wish for Buffertempstoragepath (This is where sorts happen when you run out of memory space)
- Set DefaultBufferMaxRows (according to which environment you’re moving this file to)
- Set DefaultBufferSize (according to which environment you’re moving this file to)
Save the XML file… and you are ready to go…. the location you picked in the second step of the wizard is where this file needs to be, and will be picked up at runtime to obtain the password info (and anything else that you override).