In this jotting, I describe how I completely changed my way of maintaining my database scripts. In my humble opinion, this approach is superior to anything I’ve seen so far. (OK, in my not-so-humble opinion …)
The usual Way
In previous projects I usually approached evolution of the required database in this quite common way:
- Initial script to create version 1.0 database
- Add patches to add/remove/correct its structure in later releases.
Of course, all scripts and patches were version-controlled. I even started to name the patches consistently by embedding the issue-tracking id in the file name for ease of reference (each patch required an issue, also very good practice). Others have been doing the same, sometimes more sophisticated like Ruby-on-Rails’ Migration tool. Essentially, you’re building up your database structure by deltas.
However, in my latest project I have to come doubt the usefulness of this approach. Of course, it does work but is it the best approach?
One day, my project manager asked me to install the current DB structure in a new instance (we decided to give developers their DB instances). I didn’t like the fact that I had to run the initial script plus all those patches from about five releases. Possible, but also very tedious. And it would grow ever more tedious with each release. I also didn’t much fancy of gathering all patches and create a new initial script. It’s a lot of work and error-prone; how easy to forget something …
No, I needed a better approach. I think, here it was an advantage that I started progamming code initially & stayed relatively far away from DB management in the past. Of course, as programmers when we install code we (usually) don’t patch it. We just uninstall the old version and replace it with the new one (in our case, it’s uses Java web-start). Could I do something similar also with the database?
The new Approach
Here’s what I came up with and, so far, it has worked very well:
- Write scripts to define all features needed for current release,
- Organise scripts in some useful (hierarchical) structure, e.g., in folders for tables, views, packages, etc.
- Each structural unit has a setup.sql and teardown.sql script
- Some (bash) script to run the complete set of scripts
- Scripts are idempotent
- For next release, update/add/remove scripts as needed
- Write minimal patches to prepare current version for next release
- Run patches
- Run full script
It is important to note that the full script only adds missing bits, never removes features. Removing, renaming or similar changes (often irreducible) are left to the patches only. (How I do this in detail will be left to a future jotting.)
The idempotency of the scripts is an important characteristic since it ensures that I can run the scripts several times and the end results are always the same. Not sure you added the latest additions? Just run the script (again).
What are the advantages of this approach?
The biggest one for me is that I can create a DB instance from scratch at any time very easily: just run the full script. I don’t need to care about patches, don’t need to re-run the DB’s evolution to its current version each time. I don’t care about its history when I install a version for a developer.
It feels more like programming code. Of course, the patches are important and sometimes difficult but they are exactly the same as in the previous approach.
It scales better since I only install what is currently defined.
It’s more robust since I can ignore the interplay of all those patches.
I never have to gather all patches into a new baseline script. I am there already at each release; actually at every commit.
I can easily view each feature in its script; I don’t need to parse through all the patches to gather where each bit was added to arrive the current version. This is a big gain in simplicity & transparency.
I have to carefully write my scripts and patches, a bit more carefully than usual but there is little overhead once you get used to it. Actually, it forces me to write more robust scripts.
Hmmhh, can’t think of any others.
One more; the initial set-up took some time but that work can be used again in other projects. It’s a long-term amortisation but well worth it, in my view.
So far I have found little reason to have to roll back a DB instance to a previous version; if I need a full former DB instance including data I go to the backups; if it’s just a clean version, maybe including some test data, I can always check my scripts out of version control. So I can’t see the need for this feature as advertised in Rails Migration.
Ah, the answer to my initial question: No, don’t delta your DB scripts. At least, I won’t.
- 2008-02-05: Found this blog on idempotent scripts
- 2008-02-05: via this blog series on version-controlled databases