As a DBA, establishing a reasonable backup schedule for your databases can be a challenging project. It requires you to balance two competing demands: the need to take image copy backups frequently enough to assure reasonable recovery time, and the need to not interrupt daily business. The DBA must be capable of balancing these two objectives based on usage criteria and the capabilities of the DBMS.
Not all data is created equal. Some of your databases and tables contain data that is necessary for the core of your business. Other database objects contain data that is less critical or easily derived from other sources. Before you can set up a viable backup strategy and schedule, you will need to analyze your databases and data to determine their nature and value to the business. To do so, answer the following questions for each database object.
• How much daily activity occurs against the data?
• How often does the data change?
• How critical is the data to the business?
• Can the data be re-created easily?
• What kind of access do the users need? Is 24/7 access required?
• What is the cost of not having the data available during a recovery? What is the dollar value associated with each minute of downtime?
It can be helpful to grade each database object in terms of its criticality and volatility. This can be accomplished using the grid shown in Figure 1. The vertical axis represents a criticality continuum that ranges from easily replaceable data to data that cannot be easily replaced. The horizontal axis represents a volatility continuum that ranges from static data that changes infrequently to volatile data that changes frequently. Use this grid to diagram each database object by estimating its relative volatility and importance to the organization. Remember, these terms are somewhat vague; you will need to analyze your data and define it along the axes based on your knowledge of the data and your organization.
Once you have charted your database objects, you can use the diagram as a general indicator of how frequently each database object should be backed up. The DBA in charge of each application must develop the backup thresholds for each different type of data, as suggested by the grid. In general, critical data should be backed up more frequently than noncritical data, and volatile data should be backed up more frequently than static data. The key, however, is how you define the term “frequently.” For example, 1,000 updates per day might be frequent at some shops, whereas 50,000 updates per day might be infrequent at other shops. The DBA uses the grid to determine an appropriate backup schedule for each database object. The method of backup is also affected by user access needs.
Quadrant 1 on the grid identifies the critical/dynamic data in the organization. This data is crucial to your business and it changes rapidly. As such, you must be able to recover it quickly, so you should copy it frequently. As a rule of thumb, the data should be backed up at least on a daily basis. If more than 20% of the data changes daily, be sure to make full rather than incremental backups.
Quadrant 2 represents critical but static data. Even though the data changes little from day to day, you will need to recover the data promptly in the event of an error because it is critical to the business. Be sure to back up this data at least weekly. Consider using incremental backups that are merged immediately upon completion to minimize the work required during a recovery.
Quadrant 3 represents volatile data that is not as vital to your business. You may be able to recreate the data if it becomes corrupted. Depending on the amount of data and the volume of change, you might not even back it up at all. For small amounts of data, a printed report may suffice as a backup. If the data fails, you could simply re-enter it from the printed report. Alternatively, if data is re-created nightly in a batch job, you could simply run the batch job to refresh the data. As a DBA, you will need to ensure that the data can be re-created or copied on a regular basis. In general, more than a weekly backup for quadrant 3 data is likely to be overkill.
Quadrant 4 represents static, noncritical data. Such data does not change much and can be replaced easily. It is the least important data and should be addressed only when data in the other three quadrants have been adequately backed up. In fact, quadrant 4 data may never need to be backed up—the DBA could take a similar approach to that described for quadrant 3.