**EDIT** I don't normally go back and edit things in a prior post… but… in this case the additional link to Werner Vogels' blog, which came out after I wrote this, is worthwhile to the topic at hand. **EDIT**
As I alluded to in the prior post, I have been busy… I have actually been busy writing, and now that all of the first-draft copies have been delivered to the publisher and I am in the mode of writing things, I figured I would grab a small topic for a short post. We are not far enough along in the publishing process for me to make formal book announcements yet, but it will become clear relatively soon.
OK, so you are creating a Redshift cluster and you want to enable encryption… not a bad idea, but what does that mean in Redshift, you ask? Not a bad question, as the documentation is relatively sparse on what is going on in this part of the engine. Normally, when you enable encryption on something, the first thing you need to do is provide some sort of strong key… so here you have a checkbox? Well yes, that is in fact what you have. It is really not a bad thing, as the key would have to be stored someplace at Amazon anyway (more on that in a second).
Basically, your root key (private key) is generated for you, so it is likely something better than you are going to type in anyway. That generated key is then stored on your Amazon control plane network (as are your other AMI credentials, keys, SSH keys, etc.). That control plane is on the other side of a firewall from your physical instance of whatever you are running (in this case Redshift). Authentication through the firewall is done at startup of your instance: the instance has to be authenticated before the key is released to it. That key is then kept in memory (never written to disk) on the cluster that is running Redshift, which allows for the decryption of your data. The data is encrypted (using hardware-accelerated AES-256 encryption) before it is put back on disk, so any data at rest will be encrypted (much like Microsoft's Transparent Data Encryption, TDE). Thus, if one of those drives, backups or other data were to somehow become compromised, there is nothing anyone could do with it, as they do not have the root key, which can only be obtained by your valid Redshift cluster through the firewall to the control plane network.
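For those who would rather script it than click the checkbox, here is a minimal sketch of what the same request might look like with the AWS SDK for Python (boto3). The cluster identifier, user name, password and node type are placeholders I made up for illustration, and the two helper function names are hypothetical; `Encrypted` is the `create_cluster` request parameter that corresponds to the checkbox. Note that the sketch passes no key material at all, which fits the point above that the key is generated for you.

```python
def build_encrypted_cluster_request(cluster_id, master_user, master_password,
                                    node_type="dc2.large"):
    """Return keyword arguments for boto3's Redshift create_cluster call.

    All values are illustrative placeholders, not recommendations.
    """
    return {
        "ClusterIdentifier": cluster_id,
        "ClusterType": "single-node",
        "NodeType": node_type,
        "MasterUsername": master_user,
        "MasterUserPassword": master_password,
        # The API equivalent of ticking the encryption checkbox.
        "Encrypted": True,
    }


def create_encrypted_cluster(request):
    """Send the request to AWS. Needs boto3 installed and AWS credentials."""
    import boto3  # third-party; imported here so the builder runs without it

    client = boto3.client("redshift")
    return client.create_cluster(**request)


if __name__ == "__main__":
    request = build_encrypted_cluster_request(
        "example-cluster", "exampleadmin", "Example-Passw0rd")
    print(request["Encrypted"])  # the only encryption-related setting we send
```

Building the request separately from sending it means you can inspect (or log) exactly what you are asking for before anything is created or billed.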
So, the reason you would have to store the key at Amazon (either way) is quite simple: otherwise you would have to be physically involved in the starting of the cluster. After maintenance, after a reboot, after a crash… no matter the reason… the cluster cannot start without first obtaining the key. If that key were not stored in a way that allows for automatic authentication and access to it, quite simply, the cluster could not start on its own. It would require you to access a console, command line, or other interface to provide the key externally for each start-up of the cluster.
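To make that chain of events concrete, here is a toy model of the start-up handshake in Python. Everything in it is hypothetical (the `ControlPlane` and `Cluster` classes, the token-based authentication, the method names); it is not how Amazon implements it, just a sketch of the idea that the key lives on the far side of the firewall, is released only to an authenticated cluster, and is held only in memory, so it has to be fetched again on every start.

```python
import hmac
import secrets


class ControlPlane:
    """Hypothetical stand-in for the key store behind the firewall."""

    def __init__(self):
        self._keys = {}    # cluster_id -> generated 256-bit root key
        self._tokens = {}  # cluster_id -> start-up credential (an assumption)

    def register_cluster(self, cluster_id):
        """Generate a root key and a start-up credential for a new cluster."""
        self._keys[cluster_id] = secrets.token_bytes(32)  # 32 bytes = 256 bits
        token = secrets.token_hex(16)
        self._tokens[cluster_id] = token
        return token

    def release_key(self, cluster_id, token):
        """Hand the key over only if the cluster authenticates."""
        expected = self._tokens.get(cluster_id)
        if expected is None or not hmac.compare_digest(expected, token):
            raise PermissionError("cluster failed authentication")
        return self._keys[cluster_id]


class Cluster:
    """Hypothetical cluster that keeps its root key in memory only."""

    def __init__(self, cluster_id, token):
        self.cluster_id = cluster_id
        self._token = token
        self._root_key = None  # never written to disk in this model

    def start(self, control_plane):
        # Every start (reboot, crash recovery, maintenance) repeats this step.
        self._root_key = control_plane.release_key(self.cluster_id, self._token)

    def stop(self):
        self._root_key = None  # the in-memory key is gone once the cluster stops

    @property
    def can_read_data(self):
        return self._root_key is not None


if __name__ == "__main__":
    control_plane = ControlPlane()
    token = control_plane.register_cluster("example-cluster")
    cluster = Cluster("example-cluster", token)
    cluster.start(control_plane)
    cluster.stop()                # simulate a crash or reboot
    cluster.start(control_plane)  # recovers with nobody typing in a key
    print(cluster.can_read_data)
```

If you took the stored key out of `ControlPlane`, the second `start` call would have nothing to fetch, and somebody would have to supply the key by hand after every reboot… which is exactly the situation described above.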