Backup Encryption
Originally published by SysAdmin Magazine, March 2007
Contents
Introduction
Backup scripts with encryption
Encryption with Amanda
Which algorithm?
Performance
Key Management
Conclusion
References
Introduction
Hardly a week goes by without a news story about a company leaking important data by losing its backup tapes. Whether through malicious theft, opportunistic snatching, or accidental misplacement, there is a huge cost to business when data is lost. When the data contains sensitive information about members of the public, possibly including bank account and credit card numbers, the cost can be severe indeed. The stigma alone of having to notify clients that you've lost their data is extremely damaging.
For example, in June 2005 Citigroup was forced to issue a media statement admitting that tapes containing personal information on 3.9 million customers had been lost while they were being carried to another site. In July 2006 Chase staff accidentally discarded five computer tapes containing personal information and financial records of 2.6 million Circuit City credit card account holders. Even though the tapes were in a locked box, the company had to release a public statement and to notify and compensate all of those customers.
This threat isn't new. In 1977 a disgruntled system administrator at Imperial Chemical Industries (ICI) in the UK stole a set of backup tapes and attempted to extort a large amount of money from the company for their return. He was eventually captured.
Despite the obvious and well-known risks, I've found that few companies actually use encryption to protect their backups. This is surprising considering how easy it is to do and how costly it is to lose data. In this article I'll show you how to add encryption to your backups, and some of the pitfalls to watch out for.
Backup scripts with encryption
There are a couple of good open source tools which handle encryption nicely. Probably the best one is OpenSSL, which supports a wide range of ciphers and is very easy to add to existing backup scripts. Quite simply, the command:
openssl enc -aes-256-cbc -salt -pass pass:123456
will encrypt standard input with the key "123456" and write the result to standard output. Note that the output stream will be 8-bit binary unless you specify the -a option for base64 output encoding. The -salt option should always be used so that a salt is mixed into the key derivation function. You really shouldn't specify the key, or password, on the command line as in the example above, because any user can see it by running the 'ps' command. For automated scripts it is better to put the key into a file which is readable only by root and use the file:filename construct, as in the following backup example:
tar cvf - / | \
openssl enc -aes-256-cbc -salt -pass file:/root/.backup_key >/dev/rmt0
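The key file has to exist before the backup script can use it. A minimal sketch of creating one, assuming a random 48-byte key is acceptable. The KEY_FILE variable and its default location under a temporary directory are illustrative assumptions; for the example above you would set it to /root/.backup_key and run as root.

```shell
#!/bin/sh
# Sketch: create a key file that only its owner can read.
# KEY_FILE is a hypothetical variable; set it to /root/.backup_key
# (as root) to match the backup example.
KEY_FILE="${KEY_FILE:-${TMPDIR:-/tmp}/backup_key.$$}"

# Make sure the file is never group- or world-readable, even briefly.
umask 077

# 48 random bytes, base64-encoded so the key can be printed or typed.
# This produces a single line, which is what -pass file: reads.
openssl rand -base64 48 > "$KEY_FILE"

# Owner read-only from here on.
chmod 400 "$KEY_FILE"
ls -l "$KEY_FILE"
```

Because the key is plain printable text, it can also be printed on paper for safe storage.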
To decrypt with OpenSSL, use the -d option:
openssl enc -d -aes-256-cbc -pass file:/root/.backup_key </dev/rmt0 | \
tar xvf -
GPG is also a good tool for encryption, and a number of encrypting backup systems prefer to use it. GPG is best known as a public-key, or asymmetric, encryption tool, which allows data to be encrypted with one key and requires a separate key for decryption. Unless you have very specific requirements for your backups you probably won't need this functionality; for most backups symmetric encryption is best. To encrypt with GPG in symmetric mode, use commands similar to the following example:
GPG_KEY=/root/.backup_key
tar cvf - / | \
gpg --batch --disable-mdc --symmetric --cipher-algo AES256 \
--passphrase-fd 3 3<$GPG_KEY >/dev/rmt0
Decrypting is similar, but with the --decrypt option:
GPG_KEY=/root/.backup_key
gpg --batch --quiet --no-mdc-warning --decrypt --passphrase-fd 3 \
3<$GPG_KEY </dev/rmt0 | tar xvf -
Note how the location of the key file is kept off the running command line,
which users would be able to see with the ps command, by reading the key in
through file descriptor 3.
Encryption with Amanda
Amanda is one of the best open-source backup management packages available. It is flexible, scalable, and very easy to customise and extend. The latest version (I used Amanda 2.5.1) has extensions and plugins to perform encryption, with either the client or the server doing the hard work of encrypting the data. Assuming you've already got Amanda installed and set up to run your backups, adding encryption for backups (and decryption for restores) is very easy to do.
On the Amanda server, configure a dumptype in /etc/amanda/DailySet1/amanda.conf (DailySet1 is the name of the backup set) to include an encryption plugin:
define dumptype client-encrypt-ossl {
global
program "GNUTAR"
comment "no compression and client symmetric encryption with OpenSSL"
compress none
encrypt client
client_encrypt "/usr/sbin/amcrypt-ossl"
client_decrypt_option "-d"
}
The options are quite self-explanatory. This specifies that the backup set will
use encryption and that encryption will be performed on the client system using
/usr/sbin/amcrypt-ossl, which is a simple shell script that calls openssl to
perform the encryption. Decryption is also performed on the client using the
same script but with the "-d" option added.
Also on the Amanda server, change the dumptype for the directory on the client
to be backed up, in /etc/amanda/DailySet1/disklist:
gorgon.crypt.gen.nz /home client-encrypt-ossl
This tells Amanda to use the client-encrypt-ossl dumptype when backing up /home on the backup client system gorgon.crypt.gen.nz.
On the client to be backed up, check that /usr/sbin/amcrypt-ossl is installed (it comes with Amanda v2.5 releases), and then configure an encryption key, or passphrase, for the backup:
echo "amanda_encryption_key_86299993456" >/var/lib/amanda/.am_passphrase
chown amandabackup:disk /var/lib/amanda/.am_passphrase
chmod 700 /var/lib/amanda/.am_passphrase
Naturally, you should make up your own key to put into the .am_passphrase file,
and remember to store a copy in a safe place. Now you're ready to run the
backup. Check that everything is fine on the server:
$ amcheck DailySet1
Amanda Tape Server Host Check
-----------------------------
Holding disk /var/dumps/amanda: 2090172 KB disk space available, using 1987772 KB
slot 3: read label `DailySet1-03', date `X'
NOTE: skipping tape-writable test
Tape DailySet1-03 label ok
Server check took 0.316 seconds
Amanda Backup Client Hosts Check
--------------------------------
Client check: 1 host checked in 0.449 seconds, 0 problems found
(brought to you by Amanda 2.5.1p2)
$
and then run
amdump DailySet1
to perform the backup with encryption. The results will be emailed to the administrator specified in /etc/amanda/DailySet1/amanda.conf.
Whenever you change your backups, always test that they are working. When setting up encryption you need to test that the data on tape is actually encrypted, and make sure that you can read it back as well! You can change the cipher used for encryption by editing the encryption script /usr/sbin/amcrypt-ossl on the Amanda client system; the standard release uses aes-256-cbc, which will be fine for most purposes.
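The two checks recommended above (the stored data is really encrypted, and it can be read back) can be rehearsed with plain OpenSSL before trusting a real tape. This is only a sketch under stated assumptions: WORK, KEY, SRC and the file tape.enc are hypothetical stand-ins, and an ordinary file takes the place of the tape device. It does not exercise Amanda itself.

```shell
#!/bin/sh
# Sketch: verify that an encrypted backup really is encrypted and restores.
# WORK, KEY, SRC and tape.enc are hypothetical stand-ins for this test;
# a real check would read back from the tape device instead.
WORK=$(mktemp -d)
KEY="$WORK/key"
SRC="$WORK/src"
mkdir "$SRC"
echo "account 12345678" > "$SRC/customers.txt"
( umask 077; openssl rand -base64 48 > "$KEY" )

# Back up: tar to stdout, encrypt, write to the stand-in "tape".
( cd "$SRC" && tar cf - . ) |
  openssl enc -aes-256-cbc -salt -pass "file:$KEY" > "$WORK/tape.enc"

# 1. Salted "openssl enc" output starts with the 8-byte marker "Salted__".
magic=$(dd if="$WORK/tape.enc" bs=8 count=1 2>/dev/null)

# 2. The plaintext must not be visible in the encrypted stream.
if grep -q "account 12345678" "$WORK/tape.enc"; then leaked=yes; else leaked=no; fi

# 3. Restore into a fresh directory and compare with the original.
mkdir "$WORK/restore"
openssl enc -d -aes-256-cbc -pass "file:$KEY" < "$WORK/tape.enc" |
  ( cd "$WORK/restore" && tar xf - )
if diff -r "$SRC" "$WORK/restore" > /dev/null; then restored=yes; else restored=no; fi

echo "magic=$magic leaked=$leaked restored=$restored"
```

A healthy setup shows the Salted__ marker, no leaked plaintext, and a restore identical to the source. Recent OpenSSL releases may also print a key-derivation warning on standard error, which does not affect the result.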
Which algorithm?
It is important to choose the encryption algorithm, or cipher, carefully. Some commercial backup systems may not let you choose the cipher, but OpenSSL offers a wide choice of algorithms. The version of OpenSSL I used for testing (v0.9.8a) supports all of these:
aes-128-cbc aes-128-ecb aes-192-cbc aes-192-ecb aes-256-cbc aes-256-ecb
base64 bf bf-cbc bf-cfb bf-ecb bf-ofb
cast cast-cbc cast5-cbc cast5-cfb cast5-ecb cast5-ofb
des des-cbc des-cfb des-ecb des-ede des-ede-cbc des-ede-cfb des-ede-ofb
des-ede3 des-ede3-cbc des-ede3-cfb des-ede3-ofb des-ofb des3 desx
rc2 rc2-40-cbc rc2-64-cbc rc2-cbc rc2-cfb rc2-ecb rc2-ofb rc4 rc4-40
Of course, some of these, like base64, really aren't for encryption. GPG supports a smaller range of ciphers; version 1.4.2 supports the following:
3des, cast5, blowfish, aes, aes192, aes256, twofish
I recommend AES-256-CBC as a good option. AES is a fast algorithm, and the 256-bit key is long enough to be secure against attacks for a good few years while still being small enough to be fast. CBC designates the mode used to apply AES to a data stream (remember, AES is a block cipher and only processes discrete blocks of information). Don't use the ECB block mode: it encrypts identical plaintext blocks to identical ciphertext blocks, so it isn't very strong in situations such as backups, where the blocks of data at the start of the unencrypted stream are fairly predictable. You should avoid plain DES because it has a relatively short 56-bit key length. Triple-DES is suitable for use in backups, but it is quite slow, as will be shown in the next section.
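The weakness of ECB is easy to see for yourself. The following sketch encrypts 4096 zero bytes with AES-256 in both modes and counts how many 16-byte ciphertext block values occur more than once. The function name, the all-zero input and the throwaway passphrase are illustrative assumptions, and the passphrase is given on the command line only because nothing here is secret.

```shell
#!/bin/sh
# Sketch: count repeated 16-byte ciphertext blocks under ECB and CBC.
# count_repeated_blocks and the passphrase "demo-only" are illustrative.
# Newer OpenSSL releases may print a key-derivation warning on stderr.
#
# Pipeline: 4096 zero bytes -> encrypt with the cipher named in $1
# -> od prints one 16-byte block per line -> sort | uniq -d keeps
# only block values that occur more than once -> wc counts them.
count_repeated_blocks() {
  dd if=/dev/zero bs=4096 count=1 2>/dev/null |
    openssl enc "-$1" -salt -pass pass:demo-only |
    od -An -v -tx1 |
    sort | uniq -d |
    wc -l | tr -d ' '
}

ecb_dups=$(count_repeated_blocks aes-256-ecb)
cbc_dups=$(count_repeated_blocks aes-256-cbc)
echo "repeated block values: ecb=$ecb_dups cbc=$cbc_dups"
```

With ECB the identical input blocks show up as a repeated ciphertext block, while CBC should report none, because each block is mixed with the previous ciphertext block before encryption.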
Performance
Of course, you don't get anything for free. Encrypting large volumes of data can have quite a high impact on CPU usage, depending on which algorithm you use, and the performance impact can be one of the major issues preventing companies from implementing encryption. I performed some basic tests with OpenSSL, using a number of ciphers to encrypt 8 gigabytes of data on a 1.0 GHz Intel server with the output just going to /dev/null. The encryption times are given in the following chart.
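A test of this kind is easy to repeat on your own hardware. The sketch below is not the script used for the measurements above. It assumes a much smaller input (SIZE_MB, default 16), a short cipher list chosen for illustration, and a throwaway passphrase. Timing uses whole seconds from date, so raise SIZE_MB to get meaningful differences.

```shell
#!/bin/sh
# Sketch: time several ciphers on the same input, discarding the output.
# SIZE_MB, the cipher list and the throwaway passphrase are assumptions.
SIZE_MB="${SIZE_MB:-16}"
results=""
for cipher in aes-128-cbc aes-256-cbc des-ede3-cbc; do
  start=$(date +%s)
  # Feed SIZE_MB megabytes of zeros through the cipher into /dev/null.
  dd if=/dev/zero bs=1048576 count="$SIZE_MB" 2>/dev/null |
    openssl enc "-$cipher" -salt -pass pass:benchmark-only > /dev/null
  end=$(date +%s)
  # One result line per cipher: name and elapsed whole seconds.
  results="$results$cipher $((end - start))s
"
done
printf '%s' "$results"
```

For example, running it with SIZE_MB=8192 in the environment would approximate the 8 gigabyte test size. OpenSSL's built-in "openssl speed" command is an alternative for raw cipher throughput.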
Key Management
You must bear in mind that with encryption the key is everything. If you write encrypted backups for a month or so and then lose the key you were using, all of that backup data is useless. Don't even think about trying to decode it without the key. In this respect you must keep your key data just as secure as you would your backup media. Print out the key data on paper and store it in the company safe - and maybe in a secure offsite location as well. If you have a very long key file, save it to a number of different types of media (CD-ROM, USB drive, floppy disk) and store them somewhere safe.
Keys should be changed on a regular basis, and whenever employees or contractors leave the company. But be aware that losing a key means losing the backups made with that key, so always keep old keys in safe storage. Always make plans for what you're going to do if the worst happens, such as the data centre burning down. Will you be able to quickly recover your backups?
Conclusion
Encrypting backups is a very simple thing to do, and should be mandatory if your organisation is sending tapes offsite. If you are using commercial backup tools, like NetBackup, there should also be encryption options which you should investigate and make use of. In this article, I've shown how to configure encrypted backups using the OpenSSL and GPG encryption tools - if your backups are driven by plain shell scripts you should have no problems incorporating encryption. I've also shown you how to configure Amanda to use encryption, which is very straightforward in the latest versions of Amanda.
Be aware that some other types of backups, for example those which create a bootable recovery system, like mksysb on AIX and Ignite on HP-UX, may not be able to encrypt their data - so take special care when using these to make backups. On Linux, mkcdrec is a very useful backup system which can create a bootable recovery CD and encrypt the filesystem data that is written onto the CD image - it prompts the user for the key when it performs recovery functions.
References
- OpenSSL http://www.openssl.org
- GPG http://www.gnupg.org
- Quick and Secure Network Backup Setup using Amanda http://amanda.zmanda.com/quick-backup-setup.html