aws-vault on a Chromebook

I have moved almost exclusively to a Chromebook for my day-to-day work
(a whole other set of blog posts - on the journey and outcome are planned), and I was missing one of the tools in belt and that was aws-vault.

If you look at the releases you will see that there is no binary available for arm.

I opened up an issue on the repository - and the answer that I got was - that it is not likely to have any binary released for ARM in the near future, I should go and compile it for myself.

I did, here are the steps.

Hope it is useful for someone in the future.


The #AWS World Shook and Nobody Noticed

A few days ago at the AWS Summit in New York there was an announcement which in my honest opinion went very noticeably under the radar and i don't think many people understand exactly what it means.
The announcement i'm talking about is this one EC2 Compute Instances for Snowball Edge
Let's dig into the announcement. There are new instance types released the sbe1 family which can been on AWS Snowball Edge device which essentially a computer with a lot of disks inside.

The Snowball is a service that AWS provides to enable you to upload large amounts of data from your datacenter up to S3. Since its inception it is actually a very interesting concept and to me it has always been as a one off way enticing you to bring more of your workloads and your data in a much easier way to AWS.

I also posted this on Twitter

Since its inception AWS has always beaten the drum and pushed the message that everything will run in the cloud - and only there. That was the premise they build a large part of their business model upon. You don't need to run anything on-premises because everything that you would ever want or ever need is available on the cloud, consume as a service, through an API.

During the course of my career a number a number of times the question came up asking, "Does AWS deploy on-prem?" Of course the answer was always "No, never gonna happen."

Most environments out there are datacenter snowflakes, built differently, none of them look the same, have the same capabilities, features or functionality. They are unique and integrating a system into different datacenters is not easy. Adapting to so many different snoflakes is really hard job, and something we have been trying to solve for many years - trying to build layers of abstraction, automation and standards across the industry. In some way we as an idustry have suceeded, and in others we have failed dismally.

In June 2017 AWS announced general availability of GreenGrass. A service that allows you to run Lambda functions on Connected devices wherever they are in the world (and more importantly - they are not part of the AWS cloud).

This is the first leg in the door - to allow AWS into your datacenter. The first step of the transformation.

Back to the announcement.

It seems that each Snowball is a server with approximately 16 CPUS's and 32GB of RAM (I assume a bit more to manage the overhead for the background processes). So essentially a small hypervisor - most of us have servers which are much beefeir than this little box - as our home labs or our laptops even. It is not a strong machine - not by any means.

But now you have a the option to run Pre-provisioned EC2 instances on this box. Of course it is locked down and you have a limited set of functionality availble to you (the same way that you have a set of pre-defined option availble in AWS itself. Yes there are literraly tens of thousands of operations you can perform - but it is not a free for all).

Here is what stopped me in my tracks

Connecting and Configuring the Device
After I create the job, I wait until my Snowball Edge device arrives. I connect it to my network, power it on, and then unlock it using my manifest and device code, as detailed in Unlock the Snowball Edge. Then I configure my EC2 CLI to use the EC2 endpoint on the device and launch an instance. Since I configured my AMI for SSH access, I can connect to it as if it were an EC2 instance in the cloud.

Did you notice what Jeff wrote ?
"Then I configure my EC2 CLI to use the EC2 endpoint on the device and launch an instance"

Also this little tidbit..

"S3 Access – Each Snowball Edge device includes an S3-compatible endpoint that you can access from your on-device code. You can also make use of existing S3 tools and applications"
That means AWS just brought the functionality of the public cloud - right into your datacenter.

Is it all the bells and whistles? Infinitely scalable, can run complex map reduce jobs? Hell no - this is not what this is for.  (Honestly - I cannot actually think of any use case that I personally would want to run a EC2 instance on a Snowball - at least not yet).

Now if you ask me - this is a trial balloon that they are putting out there to see if the solution is viable - and something that their customers are interested in using.

If this works - for me it is obvious what the next step is. Snowmobile 


Imagine being able to run significantly more workloads on prem - same AWS experience, same API - and seamlessly connected to the public cloud.

Ladies and gentlemen. AWS has just brought the public cloud smack bang right into your datacenter.

They are no longer only a public cloud only company - they provide hybrid cloud solutions as well.

If you have any ideas for a use case to run workloads on Snowball - or if you have any thoughts or comments - please feel free to leave below.


Comparing CloudFormation, Terraform and Ansible - Part #2

The feedback I received from the first comparison was great – thank you all.

Obviously the example I used was not really something that you would use in the real world – because no-one actually creates a only a VPC – and does not create anything inside it, that is pretty futile.

So let’s go to the next example.

The scenario is to create a VPC, with a public presences and a private presence. This will be deployed across two availability zones. Public subnets should be able to route to the internet through an Internet Gateway, private subnets should be able to access the internet through a NAT Gateway.

This is slightly more complicated than just creating a simple VPC with a one-liner

So to summarize - the end state I expect to have is:

  • 1x VPC (
  • 4x Subnets
    • 2x Public
      • (AZ1)
      • (AZ2)
    • 2x Private
      • (AZ1)
      • (AZ2)
  • 1x Internet Gateway
  • 2x NAT Gateway (I really could do with one – but since the subnets and resources are supposed to be deployed in more than a single AZ – there will be two – and here I minimize the risk impact of loss of service if a single AZ fails)
  • 1x Public Route Table
  • 2x Private Route Table (1 for each AZ)

And all of these should have simple tags to identify them.

(The code for all of these scenarios is located here  https://github.com/maishsk/automation-standoff/tree/master/intermediate)

First lets have a look at CloudFormation

So this is a bit more complicated than the previous example. I still used the native resources in CloudFormation, and set defaults for the my parameters. You will see some built in functions that are available in CloudFormation – namely !Ref which is a reference function to lookup a value that has previously been created/defined in the template and !Sub that will substitute a value in the template with an environment variable.

So there are a few nifty things that are going here.

  1. You do not have remember resource names – CloudFormation keeps all the references in check and allows you to address them by name in other places in the template.
  2. CloudFormation manages the order in which the resources are created and takes of care of all of that for – and it will take care of the order what resources are created.

    For example – the route table for the private subnets will only be created after the NAT gateways have been created.
  3. More importantly – when you tear everything down – then CloudFormation takes care of the ordering for you, i.e. you cannot tear down a VPC – while the NAT gateways and Internet gateway are still there – so you need to delete those first and then you can go ahead and rip the everything else up.

Lets look at Ansible. There are built-in modules for this ec2_vpc_net, ec2_vpc_subnet, ec2_vpc_igw, ec2_vpc_nat_gateway, ec2_vpc_route_table.

As you can see this is bit more complicated than the previous example – because the subnets have to be assigned to the correct availability zones.

There are are a few extra variables that needed to be defined in order for this to work.

Last but not least – Terraform.

And a new set of variables

First Score - # lines of Code (Including all nested files)

Terraform – 164

CloudFormation - 172

Ansible – 204

(Interesting to see here how the order has changed)

Second Score - Easy of deployment / teardown.

I will not give a numerical score here - just to mention a basic difference between the three options.

Each of the tools use a  simple command line syntax to deploy

  1. CloudFormation

    aws cloudformation create-stack --stack-name testvpc --template-body file://vpc_cloudformation_template.yml

  2. Ansible

    ansible-playbook create-vpc.yml

  3. Terraform

    terraform apply -auto-approve

The teardown is a bit different

  1. CloudFormation stores the information as a stack - and all you need to do to remove the stack and all of its resources is to run a simple command of:

    aws cloudformation delete-stack --stack-name <STACKNAME>

  2. Ansible - you will need to create an additional playbook for tearing down the environment - it does not store the state locally. This is a significant drawback – you have to make sure that you have the order correct – otherwise the teardown will fail. this means you need to understand as well how exactly the resources are created.

    ansible-playbook remove-vpc.yml

  3. Terraform - stores the state of the deployment - so a simple run will destroy all the resources

    terraform destroy -auto-approve

You will see below that the duration of the runs are much longer than the previous example – the main reason being that the amount of time it takes to create a NAT gateway is long – really long (at least 1 minute per NAT GW) because AWS does a lot of grunt work in the background to provision this “magical” resource for you.

You can find the full output here of the runs below:


create: 2m33s
destroy: 1m24s

create: 3m56s
destroy: 2m12s

create: 3m26s
destroy: 2m14s

Some interesting observations. It seems that terraform was the fastest one of the three – at least in this case.

  1. The times are all over the place – and I cannot say one of the tools is faster than the other because the process is something that happens in the background and you have to wait for it complete. SO I am not sure how reliable the timings are.
  2. The code for the Ansible playbook is by far the largest – mainly because in order to tear everything down – it requires going through the deployed pieces and ripping them out – which requires a complete set of code.
  3. I decided to compare how much more code (you could compare increase in the amount of code to increased complexity) was added from the previous create step to this one

    Ansible: 14 –> 117 (~8x increase)
    CloudFormation: 24 –> 172 (~x7 Increase)
    Terraform: 7 –> 105 (~x15 increase)
  4. It is clear to me that allowing the provisioning tool to manage the dependencies on its own – is a lot simpler to handle – especially for large and complex environments.

This is by no means a recommendation to use one tool or the other - or to say that one tool is better than the other - just a simple side by side comparison between the three options that I have used in the past.

Thoughts and comments are always welcome, please feel free to leave them below.


Getting Hit by a Boat - Defensive Design

In a group discussion last week – I heard a story (I could not find the origin – if you know where it comes from – please let me know) – which I would like to share with you.
John was floating out in the ocean, on his back, with his shades, just enjoying the sun, the quiet, the time to himself, not a care in the world.
When all of a sudden he got bumped on the head (not hard enough to cause any serious damage) with a small rowing boat.
John was pissed…. All sorts of thoughts running through his head.
  • Who gave the driver their license?
  • Why are they not more careful?
  • I could have been killed?
  • Why are they sailing out here – this is not even a place for boats.
And with all that anger and emotion he pulled himself over the side of the boat, ready to give the owner/driver one hell of a mouthful.
When he pulls himself over the side, he sees an empty boat. No–one there, no-one to scream at.
And at that moment all the anger and rage that was building up inside – slowly went away.
We encounter things every day – many of them we think are directly aimed at us – deliberately or not – but we immediately become all defensive, build up a bias against the other and are ready to go ballistic. Until we understand that there is no-one to direct all this emotion and energy at.
And then we understand that sometimes thing just happen, things beyond our control and we cannot or should not put our fate into some else’s hands.
That was the original story – which I really can relate to.
(Source: Flickr – Steenaire)
But before I heard the last part of the story – my mind took this to a totally different place – which is (of course) architecture related.
John was enjoying a great day in the sun – and all of a sudden he got hit in the head by a boat.
Where did that boat come from?
No-one knows.. I assume the owner had tied it up properly on the dock.
  • Maybe the rope was cut.
  • Maybe someone stole it and dumped it when they were done.
  • Maybe there was a storm that set the boat loose.
  • Or maybe there was a bloopers company that was following the boat all along to see who would get hit in the head.
There are endless options as to how the boat got there. But they all have something in common. The boat was never supposed to end up hitting John in the head.. John expected to be able to bake nicely in the sun and not be hit in the head by a boat
But what if John had taken additional precautionary measures?
  • Set up a fence / guardrail around where he was floating
  • Put someone as a lookout to warn him about floating boats
  • Have a drone above his head hooked into a heads-up-display in his sunglasses that he can see what is going around him
There are endless possibilities and you can really let your imagination take you to where you want to go as to how John could have prevented this accident.
What does this have to do with Defensive Design?
When we design an application – we think that we are going to be ok – because we expect to be able to do what we want to do without interference.
For example.
My web server is suppose to serve web requests of a certain type. I did not plan for someone crafting a specific request that would crash my server or bombarding the webserver with such an influx of traffic that would bring the application to its knees.
But then something unexpected happens.
When you design your application you will never be able to predict every possibility of attack or some esoteric ways  people are going to use your software. There is always something new that comes up – or someone thinks of a different way to use your idea that you did not even think of.
What you can do, is put some basic guardrails into your software that will protect you from what you do know or think can happen.
  • Throttling the number of connections or request – to prevent DDOS attacks.
  • Introducing circuit breakers to prevent cascading failures
  • Open only specific ports / sockets
  • Sufficient authentication to verify that you should be doing what you are supposed to
  • Monitoring for weird or suspicious behavior.
Again the options are practically endless. And you will not think of it all. You should address the issues as they happen, iterate, rinse, repeat.
That was a 4 minute read into thing that I think about during the day.
What kind of things do you think about when during your daily work? I would be interested in hearing. Please feel free to leave comments down below.


Encounters in the Cloud - Interview

This is a translation of an interview I gave to IsraelClouds (a meet the architect session).

Hello, my name is Maish Saidel-Keesing. I am a Cloud and DevOps architect at CyberArk in Petach Tikva. I have over 19 years experience in the compute industry. In the past I was a system administrator, managing Active Directory, Exchange and Windows servers. I have a lot of past experience with VMware systems - I wrote the first version of VMware vSphere Design and I have extensive knowledge of OpenStack (where I also participated in the OpenStack Architecture Design Guide). In recent years I have been working in the public cloud area (AWS and Azure) and I am also in the process of writing another book called “The Cloud Walkabout”, which was written following my experience with AWS.

What was the catalyst that made you interested in cloud computing?

My interest in technology has been ingrained in me since I was a child. I am always interested in trying new things all the time, and the cloud was for me a tool that enabled me as an IT infrastructure professional to push the organization to run faster and bring value to the entire company. 

The pace at which the company wanted to run with the standard ("old fashioned") tools was not fast enough and we headed toward the cloud (private and public) to help us meet our goals.

What difficulties did you encounter when you wanted to learn about cloud computing?

First of all organizational buy-in. At first I encountered difficulties when I tried to explain to upper management why the cloud is important to the organization, it was not obvious and required a lot of persuasion and date to back up the statements.

Second, the level of local education(courses, lecturers) was not very high at the time, which required a lot of hours of self-study and practical experience to learn one topic or another. I have never done a frontal course here in Israel - only self-study at my own pace, including 5 AWS certifications and additional certifications.

What do you predict for the future in cloud computing vs. on-prem?

I predict that the day is near where the number of Workloads run on-prem will be minimal, and the vast majority or our software will run in the public cloud. There will always be some applications that are not financially viable to be moved to the cloud or because of security restrictions cannot live in the cloud, so we will have to live in a hybrid world for many years. The world has become a cloud world, but the distance is so long that we can seamlessly transfer our applications between cloud and cloud.

‍Did the solutions you were looking for have alternatives among the various public cloud providers you worked with? If so, what were the considerations in choosing the supplier? What support did you receive from the cloud provider?

Similar to the situation today, the market leader was AWS. However, Google Cloud and Microsoft Azure have narrowed a huge gap in recent years. When I started the journey to the cloud, I worked only with AWS - they helped us with both individual and technical advice - about the existing ways to move the applications to the cloud, optimization and improvement, and in addition to the business aspect of streamlining and reducing costs. 

What are the conclusions after your transition to the cloud compared to on-premises?

It is clear to me that it is impossible to compete with a public cloud. The many possibilities offered by each of the cloud providers are a difference of heaven and earth compared to the capabilities in your datacenter. Building services that can be consumed from the cloud by simply calling a rich API can take thousands of hours, and even then, as private organizations, we can not keep up with the tremendous pace of development in the industry.

In the public cloud we have no "limitations" of resources. (Of course there are certain limitations - but no organization has a chance to match the scale that cloud providers are working with)

How does an organization know that it is ready for the cloud? What are the important points (time, staff) that will indicate readiness?

When you start to see that business processes are moving too slowly, competitors are faster than you and take too much time to respond to business requests. If the services you are asked to provide are not within your grasp, or to meet the load and demands - it will take you months or years. In these situations you need to recognize that you have reached your limit, and now it is time to ask for the "help of the cloud" and move on to the next stage in the growth of the company.

It is important to switch to the approach of becoming an enabler, a partner of the organization and accompany the business on the journey. Work together and think about how to advance the organizational agenda and not be the obstacle. If you insist on being gatekeepers and the "no, you cannot" person, you find yourself irrelevant to your organization and customers.

Finally, is there something important that you would like to see happening (meetings, studies or anything else) in the cloud computing community and / or the cloud computing portal?

Today, cloud vendors are trying to sell a story of success in every customer transition to the cloud. It is clear to me that this is not always a reflection of reality. For every success story I am sure there is at least the same amount - and even more - of failures.

I would like all of us to learn from these failures - they are an invaluable resource and can serve us in our attempts to go through the same process. I wish we were sharing many more of these stories. It's true that it's embarrassing, it's true that it does not bring the glamor and glory of a successful story - but it is very important that we learn from each other's mistakes.‍

Thank you for coming to Israel's cloud computing portal, we were very happy to host you.