Data encryption performs two purposes: it protects data against internal prying eyes, and it protects data against external threats (hacking, theft of backup tapes, etc.) Encryption in the database tier offers the advantage of database-caliber performance and protection of encryption keys without incurring the overhead and additional cost of using a third-party encryption tool in an application tier.
There are a variety of design considerations that affect storage and performance. This article explores best practices and design considerations in column-level encryption as practiced with Sybase Adaptive Server Enterprise 15 (ASE 15) and provides information for designing an encryption scheme for a secure server. There are many decisions to make, from encryption options to storage of encryption keys.
There are two basic components to data encryption: encryption, which stores and protects the data, and decryption, which retrieves and unscrambles the data.
For encryption, an encryption key must be created and managed, and permissions need to be set for logins who should have access. For decryption, transparent selection from the appropriate columns is critical.
The server may contain one encryption key for the server, one per database, one per column or any combination thereof. ASE uses a symmetric encryption algorithm, which means that the same encryption key is used for both encryption and decryption.
When the client process accesses an encrypted column, the server knows which encryption key was associated with the column. At insert or update time, the column is transparently encrypted immediately before writing the row. At select time, the server decrypts the data immediately after reading the data. This is all done transparently to the user, if the user has both select and decrypt permissions on the column.
There are a variety of decisions to make before embarking upon an encryption solution. CEKs can be stored in each database, in a separate database or some combination (potential dump / load issues here).
You have to choose the number of active CEKs to keep and the size of the encryption key. Choices include 128-192-256 bits. The bigger the key, the better the encryption, but the higher CPU cost of encryption.
There is also the use of an init vector or pad the CEKs as well as column usage: is it searchable / joinable / are there any relational integrity issues?
Finally, how often (and/or) will encryption keys change? This requires the reencryption of all data that uses the key.
Column encryption in ASE uses Advanced Encryption Standard (AES) with 128, 192 or 256-bit encryption key sizes. The longer the bit string, the more difficult it is for a hacker to decrypt. On the other hand, the more complicated the encryption, the more CPU resources will be taken up by the encryption / decryption algorithm.
Most public, commercial products use 128-bit encryption, and the government uses that up to the SECRET level; TOP SECRET requires 192 or 256-bit encryption. It's been calculated that cracking a 128-bit algorithm requires 2120 operations, which is currently not considered feasible (Source: Wikipedia).
ASE uses the system encryption password (set up by the Key Custodian) in conjunction with random values to generate a 128-bit key-encryption key (KEK). The KEK is in turn used to encrypt (prior to storage) all of the CEKs you create.
Initialization vectors (init_vector) and column padding (pad) might be a reason that you have multiple CEKs. You might have one which uses the default, random vector, and another which you use for searchable strings.
The encryption key is referenced at table creation time (or via the alter table, which will dynamically encrypt all of the target data). The CEK may exist in the local database or a remote database. After the CEK is created, select permission must be granted on the key to the group (role) that will be creating / altering the tables.
Encrypting in a database other than the secure database provides an additional layer of security. If the database dump file is ever stolen, the encryption key is nowhere to be found. In addition, the administrator or operator can password protect the database dump, making things that much harder for a hacker. Storing keys in a different database mandates synchronization of database dumps.
Database dumps and the dumps of the key databases should be stored in separate physical locations. This prevents loss in case an archive is stolen.
Encryption keys should be changed on a periodic basis. Note that table scan plus encryption time can add up to a significant maintenance window.
Encrypted columns take up more storage space because of the data column changes which add 16-byte encryption information + offset lengths. If encrypted data has a significant number of rows, this must be taken into account at capacity planning time.
Encryption is CPU-intensive, which will vary based upon the number of CPUs, ASE engines, system load, concurrent sessions, encryption per session, encryption key size and length of data. The larger the key size and the wider the data, the higher CPU utilization will be. Therefore only encrypt columns that require the extra security.
Index lookups are efficient because they look up and compare ciphertext values. This is very efficient for random row retrieval (or Referential integrity checks), not so much for range searches or sorts.
Protection exceptions are sloppy, unless your business requirement specifically states that it wants an exception thrown. In order to avoid protection errors, you simply set up a default at table creation time.
Columns characteristics | Recommended encryption key properties |
Low cardinality data | Key with Initialization Vector or random padding |
High cardinality data (SSN, Phone#, Credit Card#) | Key without Initialization Vector and random padding |
Primary key columns and indexed columns | Key without Initialization Vector and random padding |
Foreign key columns | Same key as referenced primary key (fully qualified name of the key should match) |
Columns used in joins | Same key without Initialization Vector and random padding |
In CEK selection, it is a common practice to use a single key for a single "type" of data. For example, a social security number might get the same key regardless of the table it's in. Keys should be kept in separate databases, so that a stolen database doesn't contain the decryption key. Make sure dumps of CEK databases and the data itself are synchronized.
Change keys periodically to makes key attacks harder. Encrypt data during movement; alternatively, use replication. Also, use SSL to send data.
If you must perform a range search, consider using a (noncompromising) prefix, rather than the entire key, as an unencrypted, indexed, searchable column. As the output may contain secure data, do not use capture medatata; do audit access to tables with encrypted data.
Consider using central key management, and replicating keys out to CEK database on the target servers. Protect keys using explicit passwords and do not give key custodians select permission on encrypted data.
Use the decrypt default feature to preserve application transparency with stored procedures which may start returning scrambled results from suddenly-encrypted columns
Database encryption is in and of itself fairly cookbook. You install the encryption option, enable it, create a system encryption password, a system encryption key and then you create or alter a table column to use the encryption.
It is also fairly flexible; you can create a system-wide encryption key, or a separate encryption key for every column you encrypt.
Using 128-bit encryption is probably more than adequate, but if you're going to use 256-bit encryption, make sure your CPU capacity is up to it prior to rolling this set of decisions out to production.