How Does DevOps Help You Make Database Changes?Tweet Wed 08 August 2018
Editorial note: I originally wrote this post for the ASPE Training blog. You can check out the original here, at their site. While you’re there, take a look at their DevOps eBook to understand how your role fits into your organization's DevOps transformation.
Databases have always been treated with a certain level of respect---sometimes to the point that when we want to make a change to the database, we have to create scripts that contain all the commands that need to be executed in the destination environment. Of course, these sets of commands need to be reviewed first by someone else (usually a DBA team).
We need to be this careful because if something goes wrong, data could be lost. The life of a company could go down the drain. Sure, there are always backups, but let's be honest: how many of us have tried to recover from those backups recently? And how many of us do it periodically? Not that many of us can proudly say that we do.
I know I'm not alone in thinking that working with databases can turn complicated, especially as they grow. We've all suffered this pain, or at least have heard of or seen someone else deal with it. But what about if we could apply some DevOps principles to databases? It's worked so well with application code. Why don't we give it a try?
Why Database Changes Are So Hard
Usually, database changes are not an automated task because each change is different. And tools can help you to manage them by creating scripts, but there are some tools that will include sentences that will first delete a table and then create it, as if it's always about a greenfield project. This is why DBAs like to take a final review before proceeding with the deployment to production.
Another reason database changes are hard is that databases tend to be really big, especially now that we're storing all sorts of information just for the sake of it. Everyone knows big databases are problematic. When we do a change in these databases, it could take hours to finish. And what about if we don't get the results we expected? Rollbacks could take twice the time.
And it's not just that. Because of the nature of how changes are promoted, it's probable that the different database environments are not alike. This causes problems for the DBAs when running the scripts in production. Developers get informed of it, but they have no idea what could have happened because it always worked. So they spend some more time troubleshooting.
We all get it. Changes in databases are really hard to promote...but what if they weren't?
Treat Your Database Like Code
There must be a better way, right? I mean, how are companies able to do several deployments a day without considering databases? Everyone has to interact with a database---no exceptions. It doesn't matter if it has good architecture or not. Systems are there to collect information, process it, and then produce a result. That result can't just be sporadic. It needs to be preserved.
The good news is that there is a better way. What we need is to start treating our databases like code. You might have heard this before with infrastructure as code, where we define infrastructure by using a descriptive language. The scripts we've been talking about could go under version control. But it's more than that---it's about changing your mindset. If we want this to succeed, we should avoid promoting database changes manually.
All database changes need to be scripted, pushed into version control, and introduced in your current delivery pipeline. These scripts need to have a special characteristic. They should be idempotent, meaning that it doesn't matter how many times that script is run---it should always produce the same result. It's as simple as adding an "if," just to check if the field the script is about to add already exists.
Treat Every Database Change Like a Migration
There's a wonderful project that I love in the .NET stack called Entity Framework Core (EFC). When you use this framework, it doesn't really matter what database vendor you're using. You interact with the database by using objects, because it's an object-relational mapper (ORM). Although this is incredibly powerful, that's not the point of this post.
Still, what I like about EFC is that you can use migrations to promote changes. It's as easy as running a set of commands. For example, say you need to start with the following to create the initial migration script, including the initial table structure like columns, indexes, and keys:
dotnet ef migrations add InitialCreate
It will translate all the objects you've defined in the code to a SQL format. When you need to promote changes to another environment, you just run the following command:
dotnet ef database update
That command will check if the change was already applied to the database. If it wasn't, it applies the change.
When you keep iterating and doing changes, you can continue doing them in the code. Then it's just a matter of creating a new migration by running the first command you ran when you started. To promote, you just run the update command again.
Those commands are repeatable, which makes it easy to introduce them into your delivery pipeline. EFC will take care of making sure that those migrations are idempotent. Does this mean that DBAs will be out of the equation? No, there are sentences that DBAs might want to review before proceeding. The difference now is that they'll be happy because there will be a standard format in scripts. And they'll be involved early in the process, not just before a release.
But...What About Rollbacks?
It's really important to be able to manage rollbacks in databases easily. Let's keep using the EFC example. EFC creates a migration history table when you start using it. So every time you run a migration, it adds a tuple to that table to keep track of what migrations it applied to the database. Having this process is really helpful because you can also roll back the last migration. You can keep rolling back until the database is in the desired state.
Another important aspect for rollbacks is that your application needs to be able to live with new and previous versions of the database. Changes need to be backward and forward compatible. This will help you to decouple database changes from code changes. The idea is that it shouldn't matter which goes first, the code or database changes---you shouldn't have any downtime. There might be occasions when this won't be possible because a change is too complex and existing data is too big. But try to make that the exception and not the rule.
It's Not Just Automation
Tools are important. Honestly, they make our jobs easier. Automation will be easier to implement. But it's about more than tools; it's about a change in mindset for developers, testers, DBAs, managers, and everyone else involved.
It's possible to introduce database changes in DevOps. It won't be as easy as with code, but it's definitely doable. My recommendation will be, as always, to start small. Pick a non-critical or new project where you can apply these principles. It will help your team to get used to this new way of working. As one manager once told me, "I'm not sure if what you're suggesting will work, but I'm willing to be proven wrong."
So prove to your boss or the blockers that they're wrong about how DevOps principles can help you make database changes. Treat your changes as if they were code and use migrations as a way to promote them. It will amplify feedback and visibility for the team. The result will be that you can iterate your product faster, and you can go back and forth as needed.