2016-02-01

There is no Root Cause, Only Contributing Factors

A week or two ago I participated in the DevOpsJRS meetup at Cisco Jerusalem. Our guest speaker was Avishai Ish-Shalom. I always enjoy Avishai's talks. He is a great speaker and a down-to-earth guy, and I have had the opportunity and pleasure to work with him several times in the past.

One of the slides that he posted included the following:

I am currently involved in a Scrum product team, where we (try and) do retrospectives after each sprint.

For those of you who are not familiar with the Agile methodologies, here is a short overview and my view of the process.

Making long-term plans is quite difficult, and sometimes even impossible, in our ever-changing world, where things move so fast. Scrum groups work in sprints. A sprint is a short burst of work whose length is defined by the team, but usually we are talking about 1-2 week bursts.

The team plans the work for each sprint and concentrates only on the tasks at hand for that specific sprint. They produce in small increments but continuously produce something that adds value.

After the sprint there is a retrospective. The team looks at what went well, what went badly, and how to improve. A huge amount of trust is needed within the team for this to be productive, and one of the most important things is that retrospectives are conducted in a blameless manner.

The point of such an exercise is to learn and to improve and not to point fingers.

Back to the root cause. In my previous IT positions whenever there was an outage, we did a root cause analysis to see what caused the problem. We always wanted to pinpoint that one thing that caused the problem.

I completely agree with what Avishai said. There is no such thing as a root cause, there are only contributing factors. But this seems to go completely against what you might know and have been accustomed to.

Let me try to demonstrate with an example.

A critical application stopped responding.
The outage caused downtime for 1 hour in your organization.

In a regular post-mortem and root cause analysis, you would have gone through the motions until you thought you had found the reason the app went down for an hour.

Why did it go down for an hour?
Because the host it was running on was disconnected from the network.

Why was it disconnected?
Because John disconnected the wrong cable when working in the datacenter.

There we found the root cause. It was John's fault.

If we are looking only for a root cause, that would be it.
But remember, there is no root cause, only contributing factors.

Digging down a little deeper will uncover a lot more.

Why did John disconnect the wrong cable?
Because he was already at work for more than 24 hours fighting fires and running from crisis to crisis.
He was tired (a contributing factor).
And the cables were not marked correctly (another factor).

So it was not John's fault. There were contributing factors.

The idea of this exercise is to improve and to understand the possible things that we can learn from this event so that it does not occur again.

Possible answers could be:

Make sure that all cables are marked clearly. It would have helped here.
John was tired and overworked. Why? Because he had too much on his plate, he was overloaded.
Perhaps increase automated processes that will free up more time for John and the team.
Invest in more staff, better equipment, additional training so that John would have a better balance and have time to invest in improvement.
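
One way to keep a post-mortem from collapsing into a single cause is to write the answers down as a tree of factors rather than a single chain of "whys". Here is a minimal Python sketch using the example above. The `Factor` class, its fields, and the `leaves` helper are hypothetical names made up for illustration, not part of any tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Factor:
    """One contributing factor; `causes` holds the answers to "why?"."""
    description: str
    causes: List["Factor"] = field(default_factory=list)


def leaves(factor: Factor) -> List[str]:
    """Return the deepest factors found so far, in the order they were recorded."""
    if not factor.causes:
        return [factor.description]
    found: List[str] = []
    for cause in factor.causes:
        found.extend(leaves(cause))
    return found


# The outage from the example, written as a tree instead of a chain.
outage = Factor("Application down for one hour", [
    Factor("Host disconnected from the network", [
        Factor("Wrong cable disconnected", [
            Factor("Engineer had worked more than 24 hours"),
            Factor("Cables were not marked correctly"),
        ]),
    ]),
])

print(leaves(outage))
```

Because a node can have more than one cause, the tiredness and the unmarked cables sit side by side, and each of them can get its own follow-up action.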

We must embrace outages, because they are the best learning opportunities, and the best way to improve.

I would highly recommend using this method in your next retrospective or post-mortem. I can guarantee you that this will improve your team, yourself and the way you work.

2016-01-05

Native Mac OSX virtualization - with Veertu

I was contacted today by Izik Eidus, an old acquaintance from Ravello. I was really impressed with their technology and introduced them in this post.

I assume that not many of you know that Apple released native hypervisor functionality with their OSX Yosemite release, their Hypervisor.framework.

This allows you to run a VM natively on OSX, without the need for a client hypervisor (such as VMware Fusion or VirtualBox).
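
If you want to check whether your Mac can use Hypervisor.framework at all, macOS exposes a `kern.hv_support` sysctl key that reports 1 when it is supported. The sketch below wraps that check in Python. The function names are my own, and treating every failure (including running on a non-Mac machine) as "not supported" is an assumption of this sketch.

```python
import subprocess


def parse_hv_support(output: str) -> bool:
    """Interpret the output of `sysctl -n kern.hv_support` ("1" means supported)."""
    return output.strip() == "1"


def host_supports_hypervisor_framework() -> bool:
    """Ask sysctl; return False if the key or sysctl itself is unavailable."""
    try:
        result = subprocess.run(
            ["sysctl", "-n", "kern.hv_support"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Not macOS, no sysctl binary, or the key does not exist.
        return False
    return parse_hv_support(result.stdout)


if __name__ == "__main__":
    print(host_supports_hypervisor_framework())
```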

Two of the main brains behind the Ravello hypervisor have now released a Native Mac OSX virtualization tool.

Say hello to Veertu.

It is light (20MB), supports Windows and Linux operating systems, and has extensive usability features such as copy/paste between host and guest, full-screen mode, and shared folders.

It is the only virtualization tool that is actually available in the Apple store, because it does not make any changes to the kernel.

Getting started was really simple. I downloaded the tool and started it up.

You are presented with 2 choices: create your own VMs from ISOs (which is a paid feature) or deploy from Veertu's servers, which host several Linux flavors.

Veertu - splash

I chose CentOS 7 Minimal.

Available OS

The client then downloads the appropriate ISO image, from which you can install the relevant OS.

Download

(I think that the wording above could be improved, because it is not actually downloading a VM but rather an ISO image.)

Launch VM

Once downloaded you can change the various settings of the VM.

VM Settings

For example CPU, RAM, Disk, Network etc.

VM boot

Power it on, and your VM goes through the installation process. (This is how I realized that the client is not downloading a full VM, but rather the installation ISO.)

Management interface

Here is the management interface.

Installation Complete

And here it is after the CentOS 7 installation is complete.

Veertu running VM

And here you have a VM running natively on my Mac.

Now, the software is not perfect, and there are things that need to be improved, such as:

  • Each time you create a VM, it downloads the ISO again, which seems a waste of bandwidth to me (it will be changed in a future version).
  • The download was slow for me, and downloading an ISO could be faster from a local mirror, but the only way to point to a different ISO is to pay for the full product.

Of course, what was the first thing I tried to do? Build an ESXi VM.

ESXi attempt

But that did not work, because Apple has not enabled support for nested VMs (yet).

I liked the native interface and the smooth integration, and I will definitely keep an eye on this product. We all know that Ravello has an amazing solution which allows you to run your VMs on any cloud, and I think that this will be an interesting way to do things in the future.

And if Applefarm is a hint as to where they are going, then this will definitely be interesting.

Applefarm

Disclaimer:

I was approached by Izik to look at the tool. I exchanged a few emails with him, with some questions and suggestions, and I also received a development build of Veertu to test. It is similar to the full version, which is worth $39.99, but does not have full feature parity with it.

I was not asked to write a review.