Every good DBA understands that backing up their database data is a non-optional part of assuring data availability and integrity. As a DBA, you need to know the difference between a full image copy backup and an incremental image copy backup and implement the proper image copy backup strategy based on application needs and database activity.
A full image copy backup is a complete copy of all the data in the database object at the time the image copy was run. An incremental image copy backup, sometimes referred to as a differential backup, contains only the data that has changed since the last full or incremental image copy was made. The advantage of taking an incremental backup rather than a full backup is that it can usually be made more quickly, and it occupies less storage space (either on disk or tape). The disadvantage is that recovery based on incremental copies can take longer because, in some cases, the same row will need to be updated several times before the last changes are restored.
For example, suppose you took a full image copy of a database object early at 2:00 a.m. on Monday and then took an incremental image copy at the same time the following three mornings. The full image copy and all three incremental image copies need to be applied to recover the data. If a full image copy were taken each night, the recovery process would need to apply only to the latest image copy backup, which would contain the correct value.
Some DBMSs provide the capability to analyze a database object to determine if a full or incremental backup is recommended or required. This is typically accomplished using an option of the copy utility. If such an option exists, the DBA can run the copy utility to examine the amount of data that has changed since the last image copy backup was taken. Furthermore, the DBA can set a threshold such that a full image copy is taken when more than a specified amount of data has changed; an incremental image copy is taken when the amount of data that has changed is less than the threshold. When this option is not available, the DBA will need to set up the type of image copy backups to be taken based on his knowledge of the applications and their use of the databases.
In general, full image copy backups are recommended for database objects where 30%–40% or more of the data blocks are modified between backups. The DBA makes this determination based not just on data volatility but also on factors such as criticality of the data, availability requirements, and the functionality of the DBMS.
Favor full image copies for small database objects. The definition of “small” will vary from site to site and DBMS to DBMS. Consider using incremental image copies to reduce the batch processing window for very large database objects that are minimally modified between image copy backups. The DBA must base the full-versus-incremental decision on the percentage of blocks of data that have been modified, not on the number of rows that have been modified. In general, consider incremental copying as the table space becomes larger and the batch window becomes smaller.
Additionally, some scenarios are not compatible with incremental image copy backups. Some DBMSs permit the user to disable logging during some operations and utilities. Whenever an action is taken that adds or changes data without logging, a full image copy is required.
If your DBMS supports incremental image copy backups, it may also support merging of incremental copies. A merge utility can be used to combine multiple incremental image copy backups into a single incremental copy backup, or to combine a full image copy backup with one or more incremental image copy backups, to create a new full backup. If your DBMS supports merging incremental copies, use the merge utility to create a new full image copy directly after the creation of an incremental copy. Merge can be run with no impact on concurrent data access or the batch window. And it will minimize recovery time by removing intermediate modifications, as previously discussed.
Wise DBAs will understand all of the backup and recovery options available with their DBMS and match the backup plan to the business requirements to minimize downtime during recovery from an outage.
Craig S. Mullins is president of Mullins Consulting, Inc. He’s an IBM Gold Consultant and the author of two best-selling books, DB2 Developer’s Guide and Database Administration: The Complete Guide to DBA Practices & Procedures. Website: www.mullinsconsulting.com