Introducing Different Testing Strategies Into Your Delivery Pipeline

Editorial note: I originally wrote this post for the Rollout blog. You can check out the original here, at their site.


With the rise of DevOps, everyone’s talking about delivering quickly. But while speed is important, so is confidence. Testing will help you increase confidence, and without tests, you can’t trust the deployment pipeline. Just like divers won’t explore the ocean without oxygen tanks they can trust, ops folks shouldn’t dive into the deployment process without having tests of good quality. You need something to protect you from creating a mess when you go live.

That’s why it’s important to include a good testing strategy in your delivery pipeline. There’s no need to introduce another silo by testing your code changes at the end of the workflow. Shift tests to the left because you need to test early, often, and with confidence. You need a way to certify that what you’re shipping satisfies the needs of the users.

But what’s a good way of introducing tests into your delivery pipeline without increasing the delivery time? Let’s explore some ideas.

Creating Tests That Run Quickly Even If the System Grows

How fast should the tests be? Well, how long are developers willing to wait after pushing code changes? If tests take more than an hour, developers won’t be willing to test early and often, especially when we’re talking about the first set of tests that run against your application.

Tests should be integrated into your continuous integration (CI) workflow, but not just after a developer pushes the code changes into the master branch. One of the critical benefits of CI is that developers have rapid feedback before integrating their code with the rest of the changes from the team. That means that these tests should run fast—in a matter of minutes—and they should run locally. They also need to have a constant running time, even if the application keeps growing.

One way of keeping a steady time for tests is to parallelize. But don’t parallelize in code. Parallelization should be something external. Tests should be able to scale by running multiple groups at the same time—say, for example, a Jenkins job that triggers tests simultaneously by calling different JAR files at the same time. Or now that we’re in the era of containers, you can spin up a group of containers that will run several tests at the same time, optimizing resources and reducing run time.
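To make the idea of external parallelization a bit more concrete, here's a minimal sketch of how a CI job or a group of containers could each pick its own slice of the test suite. The function names (shard_for, select_tests) and the test names are made up for illustration, and the hashing scheme is just one possible way to split the work.

```python
import zlib


def shard_for(test_name, total_shards):
    """Map a test name to a stable shard index in [0, total_shards)."""
    # crc32 gives the same value on every machine, unlike the built-in
    # hash(), which is randomized per process for strings.
    return zlib.crc32(test_name.encode("utf-8")) % total_shards


def select_tests(test_names, shard_index, total_shards):
    """Return the tests that the worker with this shard index should run."""
    return [
        name for name in test_names
        if shard_for(name, total_shards) == shard_index
    ]


if __name__ == "__main__":
    # Hypothetical test names; a real job would discover them from the suite.
    all_tests = ["test_login", "test_checkout", "test_search", "test_profile"]
    for index in range(3):
        print(index, select_tests(all_tests, index, 3))
```

Each worker only needs its own index and the total number of workers, so the test code itself stays free of threads or process pools. When the suite grows, you add workers rather than waiting longer.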

Including a Minimum Set of Use Cases

A strategy that will help you be more confident is to include smoke tests. These cover the minimum number of use cases you need to approve a change.

As you progress, tests can take longer per phase, and that’s alright. The closer you are to deploying to production, the more confident you need to be that a change isn’t breaking something. In the final production phase, you’d like to do more—add more real data—so it’s not a bad thing to take longer.

If you find a significant bug, make sure you translate that issue into a test that runs fast. That way you can check to make sure you’ve fixed the bug. Also, make sure it’s included at the very beginning of the pipeline. It could be something like converting a manual test to a unit test that will run fast. This doesn’t just apply to unit tests—it can be a more complex integrated test that runs quickly using real data. These complex tests can run while the rest of the unit tests are running.
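As a rough sketch of what translating a bug into a fast test could look like, imagine a bug where prices with a thousands separator weren't parsed correctly and someone used to check that by hand. The parse_price function and the bug itself are hypothetical; the point is only that the manual check becomes a test that runs in milliseconds.

```python
import unittest
from decimal import Decimal


def parse_price(text):
    """Hypothetical function under test: parse a price such as '1,299.50'."""
    # The imagined bug: thousands separators used to make parsing fail.
    return Decimal(text.replace(",", "").strip())


class ParsePriceRegressionTest(unittest.TestCase):
    def test_accepts_thousands_separator(self):
        # Encodes the exact input that used to be checked manually.
        self.assertEqual(parse_price("1,299.50"), Decimal("1299.50"))

    def test_plain_price_still_works(self):
        self.assertEqual(parse_price("42.00"), Decimal("42.00"))


if __name__ == "__main__":
    unittest.main()
```

Because a test like this needs no external services, it can sit in the very first stage of the pipeline and run locally before a push.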

You also may have heard about the testing pyramid, where you include a certain number of different test types in the pipeline. The idea is to have more unit tests than integration tests and to use unit tests to test locally. While that works, there are times when mocking an object (mocks are the reason unit tests run so fast) is more difficult than just using real data objects. So you can also include some integration tests at the beginning of the delivery pipeline. They won’t depend on certain data existing. Instead, they’ll create it and then remove it once they’re done. Because they work with real objects, these integration tests could also be testing other parts of the system, increasing test coverage.
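Here's a small sketch of an integration test that creates its own data and cleans up afterward. It assumes a SQLite database as a stand-in for whatever store your application really uses, and the users table is invented for the example.

```python
import os
import sqlite3
import tempfile
import unittest


class UserStoreIntegrationTest(unittest.TestCase):
    def setUp(self):
        # Create a throwaway database so the test needs no pre-existing data.
        handle, self.db_path = tempfile.mkstemp(suffix=".db")
        os.close(handle)
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE)"
        )
        self.conn.execute(
            "INSERT INTO users (email) VALUES (?)", ("ada@example.com",)
        )
        self.conn.commit()

    def tearDown(self):
        # Remove everything the test created, whether it passed or failed.
        self.conn.close()
        os.remove(self.db_path)

    def test_lookup_by_email(self):
        row = self.conn.execute(
            "SELECT id FROM users WHERE email = ?", ("ada@example.com",)
        ).fetchone()
        self.assertIsNotNone(row)

    def test_duplicate_email_is_rejected(self):
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute(
                "INSERT INTO users (email) VALUES (?)", ("ada@example.com",)
            )


if __name__ == "__main__":
    unittest.main()
```

Since every test builds and removes its own data, these tests can run in any order and alongside the unit tests without stepping on each other.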

So it’s not about which type of tests should go in each step. Rather, it’s a combination of speed and the importance of tests.

Using Acceptance Criteria, Not Just Code Coverage

I’ve heard that people sometimes add tests that simply assert a static “true” value just for the sake of increasing code coverage. But is that the real purpose of the code coverage metric? No. The idea is that you have a test for each portion of the code. It’s really costly to have 100% test coverage, and it’s not worth it. Why? Because every time you need to make a change, most likely you’ll also need to change something in the tests.

Why don’t we focus on how many use cases we’re covering rather than just code coverage? Taking this approach will help you avoid the temptation to create silly and useless tests. Practices like ATDD or BDD could be helpful because you’re focusing on the use of the system. These practices will also help you be more confident when doing refactors because you’ll be changing code, not behavior. So the code is being adapted to the tests, not the other way around.
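To illustrate the difference, here's a sketch of tests written around a use case in a Given/When/Then style, using nothing more than the standard unittest module. The shipping rule and the shipping_cost function are assumptions made up for the example, not something from a real system.

```python
import unittest


def shipping_cost(order_total):
    """Hypothetical rule: orders of 100 or more ship free, others pay 5."""
    return 0.0 if order_total >= 100 else 5.0


class FreeShippingAcceptanceTest(unittest.TestCase):
    def test_order_at_threshold_ships_free(self):
        # Given an order worth exactly 100
        order_total = 100.0
        # When the shipping cost is calculated
        cost = shipping_cost(order_total)
        # Then the customer pays nothing for shipping
        self.assertEqual(cost, 0.0)

    def test_order_below_threshold_pays_flat_fee(self):
        # Given an order just below the threshold
        order_total = 99.99
        # When the shipping cost is calculated
        cost = shipping_cost(order_total)
        # Then the flat fee applies
        self.assertEqual(cost, 5.0)


if __name__ == "__main__":
    unittest.main()
```

Tests like these describe behavior, so you can rewrite the inside of shipping_cost during a refactor and they should keep passing as long as the rule itself hasn't changed.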

Make sure the tests you’re including in the pipeline are providing value. When they stop doing that, evolve them or get rid of them. A good sign that a test needs to be retired is how much time you invest in fixing or adapting it to new changes.

Constructing Tests In Parallel With Code

Tests should be treated the same way as the application code. You might keep them in a different codebase or even write them in a different language. You can also benefit from versioning and packaging them (e.g., as JARs or containers).

What else should you do with the application code? Put it in version control. Don’t use long-lived feature branches. Push frequently to master and use feature flags. The same principles that apply to the code apply to constructing tests.

Is a test ready before the application code, and now it’s causing failures in CI? No problem. Just put the test behind the same feature flag you’ll use in the application code and turn it on when the changes are pushed.
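One way this could look in practice is a test that skips itself while the flag is off. This is only a sketch: the flag_enabled helper, the FEATURE_NEW_CHECKOUT environment variable, and the checkout test are all hypothetical, and a real project would read the flag from whatever feature flag system the application already uses.

```python
import os
import unittest


def flag_enabled(name):
    """Hypothetical flag lookup: reads FEATURE_<NAME> from the environment."""
    value = os.environ.get("FEATURE_" + name.upper(), "off")
    return value.lower() in {"1", "true", "on"}


class NewCheckoutTest(unittest.TestCase):
    def setUp(self):
        # Checked at run time, so turning the flag on needs no code change.
        if not flag_enabled("new_checkout"):
            self.skipTest("new_checkout flag is off")

    def test_new_checkout_total(self):
        # Placeholder assertion standing in for the real checkout behavior.
        self.assertEqual(sum([10, 15]), 25)


if __name__ == "__main__":
    unittest.main()
```

With the flag off, CI reports the test as skipped instead of failed. Once the application change lands and the flag is turned on, the same test starts running without anyone touching the test code.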

It’s not necessary to add another silo, like a separate team that constructs the tests. Testing should be a developer’s job. That doesn’t mean that the same developer who wrote the code writes all the tests. It’s about having different roles during development sprints. If Developer A writes the code and some unit tests, then Developer B should write another set of tests that will increase confidence in the code. Next time, those developers will exchange roles. And if for some reason your organization has a testing team, then it’s important that testers and developers sometimes pair program together.

Testing shouldn’t be an afterthought when a developer finishes coding. It must be integrated into your delivery pipeline, in the same way you integrate code. You might not run all tests in all stages, but it’s important that they’re built together.

Continuously Adjust Based On Results

Avoid keeping your testing strategy static. It must be evolving continuously.

You’ll always have bugs caused by missing tests. Get used to it and expect it. Your main job after this happens is to reproduce the bugs by including a new set of tests in the pipeline. These tests will fail at first, but that’s the idea. It should be evident that the problem exists before you even start coding the fix.

In the end, it doesn’t matter how many tests you have before going live. There’s nothing like real users for testing, and that’s why you need to continuously adjust, adapt, and evolve your testing strategy. But with the ideas above, you’re amplifying feedback and becoming aware of any bugs early.
