What is site reliability engineering (SRE) and how is it different from DevOps?

Fri 05 April 2019

Site reliability engineering (SRE) is Google’s model of service management where software engineers run production systems using a software engineering approach. It’s clear that Google is unique, and they usually need to tackle software bugs and errors in different and non-conventional ways. But having software engineers doing a …

Why Transformational Leadership Is Needed in Every DevOps Initiative

Mon 04 February 2019

Every DevOps initiative needs support from leadership. Without DevOps leadership support, teams won’t be able to move forward smoothly. Leaders have a significant impact on results—not just with DevOps, but with every change initiative in an organization.

This time we won’t talk about DevOps as a tool …

Pitfalls with DevOps at Scale

Mon 28 January 2019

Let's get started by defining what DevOps is.

I know, I know; there are tons of definitions. But the one I like most is from Gene Kim:

DevOps is those set of cultural norms and technology practices that enable the fast flow of planned work from, among others, development, through …

Getting Started Quickly With Go Logging

Mon 21 January 2019

It's time to talk about how to get started with logging again. The languages we've covered so far are C#, Java, Python, Ruby, Node.js, and JavaScript. Today, we're going to be talking about the Go programming language, also known as Golang. Go is a statically typed, compiled, open-source programming language …

Deployment Smells: The 5 Most Common Deployment Mistakes

Mon 14 January 2019

Deployment shouldn't be the most nerve-racking task for sysadmins when releasing software, but many times it is.

If we’re deploying application changes to a beta system, we don’t care that much if the system goes down. Neither will our users; they know that beta means “expect problems from …

A Detailed Guide to Canary Deployments

Mon 22 October 2018

Every time we need to deploy to production, we worry about how changes will affect the user experience. No matter what technique or strategy you use to make deployments, there are going to be times when the things that can go wrong will go wrong. It’s Murphy’s law …

Amplify Feedback with Continuous Performance

Tue 16 October 2018

As the cloud becomes the norm, we’re letting others manage much of our infrastructure for us. Cloud providers offer common metrics like CPU, memory, storage, and networking so that you can stay up to date on the health of your system. And we worry less and less about those metrics …

Why You Need an Error Budget and How to Make It Work

Wed 26 September 2018

How many times have you seen Google go down? Not many, I bet. You might not even notice it if it happened. If you did, you’d probably assume it’s an internet connection problem.

But Google isn’t perfect. As Werner Vogels says, “Everything fails, all the time.” If …

Choosing a Deployment Strategy: A Manager’s Guide

Mon 27 August 2018

Some time ago, I was in charge of deploying to the main system at the company where I worked. Back in those days, I had the bad habit of doing deployments manually by copying/pasting assemblies and then RDPing into the servers to update them with the latest changes …

Which DevOps Metrics Matter?

Mon 20 August 2018

Some time ago, I decided to start dieting for the millionth time. All previous tries were a complete failure. But this last time was different.

It was different because not only was I going to a nutritionist, but also every time I visited her, she took my measurements. That way …
