Deploying native code artifacts – binaries of Java, C#, C++ – is done by copying the new binaries and overwriting the existing ones; the current state of the environment has no effect on the binary content. Deploying database code changes, in contrast, is done by altering the structure of the database or schema from a given state (the current state) to the target state (the end point). When executing a script in the database, the current state must be the same as it was when the script was generated; otherwise the outcome is not predictable.
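Because the outcome depends on the starting state, a generated script can be guarded by a precondition check that refuses to run against an unexpected state. A minimal sketch, using Python's standard sqlite3 module and a hypothetical schema_version bookkeeping table purely for illustration:

    import sqlite3

    EXPECTED_STATE = "rev-42"  # hypothetical revision recorded when the script was generated

    def deploy(conn: sqlite3.Connection, script: str) -> None:
        # Refuse to run unless the database is in the exact state
        # the script was generated against.
        current = conn.execute("SELECT version FROM schema_version").fetchone()[0]
        if current != EXPECTED_STATE:
            raise RuntimeError(
                f"Expected state {EXPECTED_STATE!r}, found {current!r}; "
                "regenerate the script before deploying."
            )
        conn.executescript(script)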
Continuous Delivery for Databases
Implementing Continuous Delivery for databases requires addressing the unique challenges of database change management while applying the same Continuous Delivery principles practiced for native code.
Continuous Delivery is all about automation: automating everything. In order to automate, we must have confidence in the automation process, whether it is implemented with scripts, a CI tool, or an ARA (Application Release Automation) tool. If we cannot trust the automation to raise a red flag on an error (or a suspected error), then after the first or second failure people will stop using it.
What can be automated in database continuous delivery? Almost all of the steps: the build (generating the SQL scripts), the deploy (executing the SQL scripts), and the test (verifying that the scripts executed correctly and that the resulting structure produces the desired outcome). A sketch of such a pipeline follows the process flow below.
[Process flow: Database Code Development -> Build -> Deploy]
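A minimal sketch of such a pipeline, assuming hypothetical generate-sql, run-sql, and verify-schema command-line tools; the point is that every phase raises a red flag and halts the pipeline on failure:

    import logging
    import subprocess
    import sys

    logging.basicConfig(level=logging.INFO)
    log = logging.getLogger("db-pipeline")

    def run_phase(name: str, cmd: list[str]) -> None:
        # Run one phase via a (hypothetical) external tool; stop everything on failure.
        log.info("Starting phase: %s", name)
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            log.error("Phase %s failed:\n%s", name, result.stderr)
            sys.exit(1)  # the red flag: never continue past a failed phase

    def main() -> None:
        run_phase("build", ["generate-sql", "--source", "repo", "--target", "qa"])
        run_phase("deploy", ["run-sql", "upgrade.sql", "--env", "qa"])
        run_phase("test", ["verify-schema", "--env", "qa"])

    if __name__ == "__main__":
        main()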
Automated Database Changes Build
Having an automated process means that the SQL scripts generated in the Build phase are executed in the Deploy phase automatically, without any human intervention. If the scripts are generated incorrectly (for example, if the order of the commands ignores database dependencies, or if changes from one development environment are promoted while changes made in a different source environment, such as a different branch, an emergency fix, pre-prod, or UAT, are discarded), the end result can cause downtime for the organization.
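For instance, a generated script must create a table before any table that references it. A small sketch, using Python's standard graphlib module and illustrative DDL, of emitting statements in dependency order:

    from graphlib import TopologicalSorter

    # Hypothetical object-level dependencies discovered during the build phase:
    # orders references customers, order_items references orders.
    deps = {
        "orders": {"customers"},
        "order_items": {"orders"},
        "customers": set(),
    }

    ddl = {
        "customers": "CREATE TABLE customers (id INT PRIMARY KEY)",
        "orders": "CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT REFERENCES customers(id))",
        "order_items": "CREATE TABLE order_items (id INT PRIMARY KEY, order_id INT REFERENCES orders(id))",
    }

    # Emit the statements in dependency order so the script never references
    # an object that has not been created yet.
    for obj in TopologicalSorter(deps).static_order():
        print(ddl[obj] + ";")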
Building the script is done by comparing the object structure (or content) in the source control repository, which holds the desired structure, with the current state of the object in the higher environment of the process (QA, SIT, DIT, Pre-Production, etc.). During this analysis, several questions should be asked to decide whether a change should be included in the script (a sketch of the decision logic follows the list):
- Should the entire schema/database structure be analyzed, or only the objects that were changed as part of specific tasks, user stories, requirements, change requests, etc.?
- Does the difference between the environments originate in the development environment, and should it therefore be promoted?
- Does the difference originate in a different environment, and should it therefore be preserved (or should it be skipped and ignored)?
- Does the difference create a conflict between the object in the target and the code in source control? If so, how should it be merged? (For example, line 2 was changed in the target environment and line 4 was changed in the source environment.)
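These questions are exactly what a baseline-aware analysis answers. A minimal sketch, assuming each object's definition is available in three versions: the baseline (what both sides started from), the source (source control), and the target environment:

    def classify_change(baseline: str, source: str, target: str) -> str:
        # Nothing changed on either side: nothing to script.
        if source == baseline and target == baseline:
            return "no-op"
        # Only the source changed: the difference came from development, promote it.
        if target == baseline:
            return "promote"
        # Only the target changed (e.g. an emergency fix): preserve it, do not revert.
        if source == baseline:
            return "preserve"
        # Both sides changed to the same definition: already in sync.
        if source == target:
            return "no-op"
        # Both sides changed differently: a human must merge.
        return "conflict"

Only the last case requires a human decision; a simple two-way compare cannot tell a promote apart from a preserve, because it never sees the baseline.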
Simple Compare & Sync vs. Baseline Aware Analysis
The last thing we want is a script containing commands that incorrectly change an object. For example, if the script was generated against the DIT environment, the object structure in DIT may already be the correct state while the developer's private environment holds an old version (see question #2); a simple compare & sync would wrongly revert DIT to the old version. Or, if the script was generated against the UAT environment, the object may have been changed from the emergency-fix branch as well as from the development branch, and there is now a conflict (see question #3).
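At the line level, the baseline is also what allows the conflict example above to be resolved (or flagged) safely. A naive per-line three-way merge sketch, with an illustrative stored-procedure body, assuming all three versions have the same line count:

    def merge_lines(baseline, source, target):
        # Naive line-by-line three-way merge against a common baseline.
        merged = []
        for base, src, tgt in zip(baseline, source, target):
            if src == tgt:        # both sides agree (changed or not)
                merged.append(src)
            elif tgt == base:     # only the source changed this line: promote it
                merged.append(src)
            elif src == base:     # only the target changed this line: preserve it
                merged.append(tgt)
            else:                 # both changed the same line differently: conflict
                raise ValueError(f"Conflict on line: {base!r}")
        return merged

    baseline = ["CREATE PROCEDURE p AS", "  SET a = 1;", "  SET b = 2;", "  SET c = 3;"]
    target   = ["CREATE PROCEDURE p AS", "  SET a = 9;", "  SET b = 2;", "  SET c = 3;"]  # line 2 changed in target
    source   = ["CREATE PROCEDURE p AS", "  SET a = 1;", "  SET b = 2;", "  SET c = 7;"]  # line 4 changed in source

    print(merge_lines(baseline, source, target))

Here line 2 keeps the target's change and line 4 receives the source's change; had the same line changed on both sides, the merge would stop and flag a conflict for a human to resolve.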