2018-07-09

Comparing CloudFormation, Terraform and Ansible - Part #2

The feedback I received from the first comparison was great – thank you all.

Obviously the example I used was not really something that you would use in the real world – because no-one actually creates only a VPC and then puts nothing inside it; that would be pretty futile.

So let’s go to the next example.

The scenario is to create a VPC, with a public presence and a private presence. This will be deployed across two availability zones. Public subnets should be able to route to the internet through an Internet Gateway, and private subnets should be able to access the internet through a NAT Gateway.

This is slightly more complicated than just creating a simple VPC with a one-liner.

So to summarize - the end state I expect to have is:

  • 1x VPC (192.168.90.0/24)
  • 4x Subnets
    • 2x Public
      • 192.168.90.0/26 (AZ1)
      • 192.168.90.64/26 (AZ2)
    • 2x Private
      • 192.168.90.128/26 (AZ1)
      • 192.168.90.192/26 (AZ2)
  • 1x Internet Gateway
  • 2x NAT Gateway (I could really get by with one – but since the subnets and resources are supposed to be deployed in more than a single AZ, there will be two – this minimizes the impact of a loss of service if a single AZ fails)
  • 1x Public Route Table
  • 2x Private Route Table (1 for each AZ)

And all of these should have simple tags to identify them.

(The code for all of these scenarios is located here: https://github.com/maishsk/automation-standoff/tree/master/intermediate)

First let's have a look at CloudFormation.


So this is a bit more complicated than the previous example. I still used the native resources in CloudFormation, and set defaults for my parameters. You will see some built-in functions that are available in CloudFormation – namely !Ref, which is a reference function that looks up a value previously created/defined in the template, and !Sub, which substitutes variables (parameters or pseudo parameters such as ${AWS::Region}) into a string.
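As a hedged illustration (the parameter and resource names here are mine for the example, not necessarily the ones in the repository), this is roughly what those functions look like in use:

    Parameters:
      EnvironmentName:
        Type: String
        Default: testvpc
    Resources:
      VPC:
        Type: AWS::EC2::VPC
        Properties:
          CidrBlock: 192.168.90.0/24
          Tags:
            - Key: Name
              Value: !Sub "${EnvironmentName}-vpc"
      PublicSubnet1:
        Type: AWS::EC2::Subnet
        Properties:
          VpcId: !Ref VPC        # !Ref points at the VPC defined above by its logical name
          CidrBlock: 192.168.90.0/26
          Tags:
            - Key: Name
              Value: !Sub "${EnvironmentName}-public-az1"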

So there are a few nifty things going on here.

  1. You do not have to remember resource names – CloudFormation keeps all the references in check and allows you to address them by name in other places in the template.
  2. CloudFormation manages the order in which the resources are created and takes care of all of that for you.

    For example – the route table for the private subnets will only be created after the NAT gateways have been created (see the snippet after this list).
  3. More importantly – when you tear everything down – CloudFormation takes care of the ordering for you, i.e. you cannot tear down a VPC while the NAT gateways and Internet gateway are still there – so you need to delete those first, and only then can you go ahead and rip everything else up.
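To make the ordering point concrete – a hedged fragment (the resource names are assumed, not copied from the repo): the !Ref to the NAT gateway is itself the dependency, so CloudFormation will not create this route until the NAT gateway exists.

    PrivateRoute1:
      Type: AWS::EC2::Route
      Properties:
        RouteTableId: !Ref PrivateRouteTable1
        DestinationCidrBlock: 0.0.0.0/0
        NatGatewayId: !Ref NATGateway1   # implicit dependency on the NAT gateway resource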


Let's look at Ansible. There are built-in modules for this: ec2_vpc_net, ec2_vpc_subnet, ec2_vpc_igw, ec2_vpc_nat_gateway and ec2_vpc_route_table.
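A hedged sketch of how those modules chain together (variable names such as vpc_name, vpc_cidr and public_subnets are my own assumptions, not necessarily the ones used in the repository) – the output of one task is registered and fed into the next:

    - hosts: localhost
      connection: local
      gather_facts: false
      tasks:
        - name: Create the VPC
          ec2_vpc_net:
            name: "{{ vpc_name }}"
            cidr_block: "{{ vpc_cidr }}"
            region: "{{ region }}"
            state: present
          register: vpc

        - name: Create the public subnets - one per availability zone
          ec2_vpc_subnet:
            vpc_id: "{{ vpc.vpc.id }}"
            cidr: "{{ item.cidr }}"
            az: "{{ item.az }}"
            region: "{{ region }}"
            tags:
              Name: "{{ vpc_name }}-public-{{ item.az }}"
          with_items: "{{ public_subnets }}"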


As you can see, this is a bit more complicated than the previous example – because the subnets have to be assigned to the correct availability zones.

There are a few extra variables that needed to be defined in order for this to work.


Last but not least – Terraform.

And a new set of variables
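A hedged sketch of what the variables and the AZ-aware subnets look like in Terraform (the names, defaults and region below are illustrative assumptions, not the exact code in the repository):

    variable "vpc_cidr" {
      default = "192.168.90.0/24"
    }

    variable "availability_zones" {
      default = ["us-east-1a", "us-east-1b"]
    }

    variable "public_subnet_cidrs" {
      default = ["192.168.90.0/26", "192.168.90.64/26"]
    }

    resource "aws_vpc" "vpc" {
      cidr_block = "${var.vpc_cidr}"

      tags = {
        Name = "testvpc"
      }
    }

    # one subnet per CIDR, each placed in the matching availability zone
    resource "aws_subnet" "public" {
      count             = "${length(var.public_subnet_cidrs)}"
      vpc_id            = "${aws_vpc.vpc.id}"
      cidr_block        = "${element(var.public_subnet_cidrs, count.index)}"
      availability_zone = "${element(var.availability_zones, count.index)}"

      tags = {
        Name = "public-${count.index}"
      }
    }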



First Score - # lines of Code (Including all nested files)

Terraform – 164

CloudFormation - 172

Ansible – 204

(Interesting to see here how the order has changed)

Second Score - Ease of deployment / teardown.

I will not give a numerical score here - just to mention a basic difference between the three options.

Each of the tools uses a simple command-line syntax to deploy:

  1. CloudFormation

    aws cloudformation create-stack --stack-name testvpc --template-body file://vpc_cloudformation_template.yml

  2. Ansible

    ansible-playbook create-vpc.yml

  3. Terraform

    terraform apply -auto-approve

The teardown is a bit different

  1. CloudFormation stores the information as a stack - and all you need to do to remove the stack and all of its resources is to run a simple command of:

    aws cloudformation delete-stack --stack-name <STACKNAME>

  2. Ansible - you will need to create an additional playbook for tearing down the environment - it does not store the state locally. This is a significant drawback – you have to make sure that you have the order correct, otherwise the teardown will fail. This means you also need to understand exactly how the resources are created.

    ansible-playbook remove-vpc.yml

  3. Terraform - stores the state of the deployment - so a simple run will destroy all the resources

    terraform destroy -auto-approve

You will see below that the duration of the runs are much longer than the previous example – the main reason being that the amount of time it takes to create a NAT gateway is long – really long (at least 1 minute per NAT GW) because AWS does a lot of grunt work in the background to provision this “magical” resource for you.

You can find the full output of the runs below:

Results

Terraform
create: 2m33s
destroy: 1m24s

Ansible:
create: 3m56s
destroy: 2m12s

CloudFormation:
create: 3m26s
destroy: 2m14s

Some interesting observations. It seems that Terraform was the fastest of the three – at least in this case.

  1. The times are all over the place – and I cannot say that one of the tools is definitively faster than the others, because the provisioning happens in the background and you simply have to wait for it to complete. So I am not sure how reliable the timings are.
  2. The code for the Ansible playbook is by far the largest – mainly because in order to tear everything down, it has to go through the deployed pieces and rip them out – which requires a complete additional set of code.
  3. I decided to compare how much more code (you could equate an increase in the amount of code with increased complexity) was added from the previous scenario to this one:

    Ansible: 14 –> 117 (~8x increase)
    CloudFormation: 24 –> 172 (~7x increase)
    Terraform: 7 –> 105 (~15x increase)
  4. It is clear to me that allowing the provisioning tool to manage the dependencies on its own is a lot simpler to handle – especially for large and complex environments.


This is by no means a recommendation to use one tool or the other - or to say that one tool is better than the other - just a simple side by side comparison between the three options that I have used in the past.

Thoughts and comments are always welcome, please feel free to leave them below.

2018-07-05

Getting Hit by a Boat - Defensive Design

In a group discussion last week – I heard a story (I could not find the origin – if you know where it comes from – please let me know) – which I would like to share with you.
John was floating out in the ocean, on his back, with his shades, just enjoying the sun, the quiet, the time to himself, not a care in the world.
When all of a sudden he got bumped on the head (not hard enough to cause any serious damage) with a small rowing boat.
John was pissed…. All sorts of thoughts running through his head.
  • Who gave the driver their license?
  • Why are they not more careful?
  • I could have been killed?
  • Why are they sailing out here – this is not even a place for boats.
And with all that anger and emotion he pulled himself over the side of the boat, ready to give the owner/driver one hell of a mouthful.
When he pulls himself over the side, he sees an empty boat. No-one there, no-one to scream at.
And at that moment all the anger and rage that was building up inside – slowly went away.
We encounter things every day – many of them we think are directly aimed at us – deliberately or not – but we immediately become all defensive, build up a bias against the other and are ready to go ballistic. Until we understand that there is no-one to direct all this emotion and energy at.
And then we understand that sometimes things just happen, things beyond our control, and we cannot or should not put our fate into someone else's hands.
That was the original story – which I really can relate to.
(Source: Flickr – Steenaire)
But before I heard the last part of the story – my mind took this to a totally different place – which is (of course) architecture related.
John was enjoying a great day in the sun – and all of a sudden he got hit in the head by a boat.
Where did that boat come from?
No-one knows.. I assume the owner had tied it up properly on the dock.
  • Maybe the rope was cut.
  • Maybe someone stole it and dumped it when they were done.
  • Maybe there was a storm that set the boat loose.
  • Or maybe there was a bloopers company that was following the boat all along to see who would get hit in the head.
There are endless options as to how the boat got there. But they all have something in common. The boat was never supposed to end up hitting John in the head. John expected to be able to bake nicely in the sun and not be hit in the head by a boat.
But what if John had taken additional precautionary measures?
  • Set up a fence / guardrail around where he was floating
  • Put someone as a lookout to warn him about floating boats
  • Have a drone above his head hooked into a heads-up display in his sunglasses so that he can see what is going on around him
There are endless possibilities and you can really let your imagination take you to where you want to go as to how John could have prevented this accident.
What does this have to do with Defensive Design?
When we design an application – we think that we are going to be ok – because we expect to be able to do what we want to do without interference.
For example.
My web server is supposed to serve web requests of a certain type. I did not plan for someone crafting a specific request that would crash my server, or bombarding the web server with such an influx of traffic that it would bring the application to its knees.
But then something unexpected happens.
When you design your application you will never be able to predict every possible attack or every esoteric way people are going to use your software. There is always something new that comes up – or someone thinks of a different way to use your idea that you did not even think of.
What you can do, is put some basic guardrails into your software that will protect you from what you do know or think can happen.
  • Throttling the number of connections or requests – to prevent DDoS attacks.
  • Introducing circuit breakers to prevent cascading failures
  • Open only specific ports / sockets
  • Sufficient authentication to verify that you should be doing what you are supposed to
  • Monitoring for weird or suspicious behavior.
Again the options are practically endless. And you will not think of it all. You should address the issues as they happen, iterate, rinse, repeat.
That was a 4 minute read into the things that I think about during the day.
What kind of things do you think about during your daily work? I would be interested in hearing. Please feel free to leave comments down below.

2018-07-03

Encounters in the Cloud - Interview

This is a translation of an interview I gave to IsraelClouds (a meet the architect session).

Hello, my name is Maish Saidel-Keesing. I am a Cloud and DevOps architect at CyberArk in Petach Tikva. I have over 19 years experience in the compute industry. In the past I was a system administrator, managing Active Directory, Exchange and Windows servers. I have a lot of past experience with VMware systems - I wrote the first version of VMware vSphere Design and I have extensive knowledge of OpenStack (where I also participated in the OpenStack Architecture Design Guide). In recent years I have been working in the public cloud area (AWS and Azure) and I am also in the process of writing another book called “The Cloud Walkabout”, which was written following my experience with AWS.

What was the catalyst that made you interested in cloud computing?

My interest in technology has been ingrained in me since I was a child. I am always interested in trying new things all the time, and the cloud was for me a tool that enabled me as an IT infrastructure professional to push the organization to run faster and bring value to the entire company. 

The pace at which the company wanted to run with the standard ("old fashioned") tools was not fast enough and we headed toward the cloud (private and public) to help us meet our goals.

What difficulties did you encounter when you wanted to learn about cloud computing?

First of all, organizational buy-in. At first I encountered difficulties when I tried to explain to upper management why the cloud is important to the organization; it was not obvious, and it required a lot of persuasion and data to back up the statements.

Second, the level of local education (courses, lecturers) was not very high at the time, which required a lot of hours of self-study and practical experience to learn each topic. I have never taken an instructor-led course here in Israel - only self-study at my own pace, including 5 AWS certifications and additional certifications.

What do you predict for the future in cloud computing vs. on-prem?

I predict that the day is near where the number of workloads running on-prem will be minimal, and the vast majority of our software will run in the public cloud. There will always be some applications that are not financially viable to move to the cloud, or that cannot live in the cloud because of security restrictions, so we will have to live in a hybrid world for many years. The world has become a cloud world, but we are still a long way from being able to seamlessly move our applications from cloud to cloud.

Did the solutions you were looking for have alternatives among the various public cloud providers you worked with? If so, what were the considerations in choosing the supplier? What support did you receive from the cloud provider?

Similar to the situation today, the market leader was AWS. However, Google Cloud and Microsoft Azure have narrowed a huge gap in recent years. When I started the journey to the cloud, I worked only with AWS - they helped us with both individual and technical advice about the existing ways to move applications to the cloud, optimization and improvement, in addition to the business aspect of streamlining and reducing costs.

What are the conclusions after your transition to the cloud compared to on-premises?

It is clear to me that it is impossible to compete with a public cloud. The many possibilities offered by each of the cloud providers are worlds apart from the capabilities in your own datacenter. Building services that can be consumed by simply calling a rich API can take thousands of hours, and even then, as private organizations, we cannot keep up with the tremendous pace of development in the industry.

In the public cloud we have no "limitations" on resources. (Of course there are certain limitations - but no organization has a chance of matching the scale that the cloud providers are working at.)

How does an organization know that it is ready for the cloud? What are the important points (time, staff) that will indicate readiness?

You know when you start to see that business processes are moving too slowly, competitors are faster than you, and it takes too much time to respond to business requests. If the services you are asked to provide are not within your grasp, or meeting the load and demands would take you months or years - in these situations you need to recognize that you have reached your limit, and that now is the time to ask for the "help of the cloud" and move on to the next stage in the growth of the company.

It is important to switch to the approach of being an enabler, a partner of the organization, and to accompany the business on the journey. Work together and think about how to advance the organizational agenda, and do not be the obstacle. If you insist on being the gatekeeper and the "no, you cannot" person, you will find yourself irrelevant to your organization and customers.

Finally, is there something important that you would like to see happening (meetings, studies or anything else) in the cloud computing community and / or the cloud computing portal?

Today, cloud vendors are trying to sell a success story for every customer transition to the cloud. It is clear to me that this is not always a reflection of reality. For every success story I am sure there is at least the same number of failures - and probably more.

I would like all of us to learn from these failures - they are an invaluable resource and can serve us in our attempts to go through the same process. I wish we were sharing many more of these stories. It's true that it's embarrassing, it's true that it does not bring the glamor and glory of a successful story - but it is very important that we learn from each other's mistakes.

Thank you for coming to Israel's cloud computing portal, we were very happy to host you.

2018-06-27

Comparing CloudFormation, Terraform and Ansible - Simple example


Whenever someone asks me what tools I use to provision infrastructure within AWS - the answer is that it can be done with a variety of tools - but people usually use one of the following three: CloudFormation, Terraform or Ansible.
The next question that comes up, of course, is which one is easier/better to use? The answer (as always) is - "It depends". There are really good reasons to use each and every one of the tools. It could be ease of use, support, extensibility, flexibility, community support (or lack thereof).

I have worked with all three tools, and each of them has its ups and downs. There are periods when I prefer Ansible, other days Terraform, and sometimes CloudFormation is the only way to get things done.

I wanted to compare all three in a set of scenarios - from the really simple, to moderate, to complicated. Firstly - to see how this can be accomplished in each of the tools, evaluating complexity, time to completion and anything else that came up along the way.

Let's start by diving straight into the first example.

I want to create a VPC. A plain simple VPC, nothing else. No subnets, no NAT gateways, no routes - as simple as can be. Essentially this is a single AWS API call, which would be:
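Something along the lines of the following CLI call (the CIDR value here is just an example):

    aws ec2 create-vpc --cidr-block 10.0.0.0/16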



(The code for all of these scenarios is located here - https://github.com/maishsk/automation-standoff/tree/master/simple)

First let's have a look at CloudFormation.

Looks pretty simple. I used the native resources in CloudFormation, and set defaults for the name and the CIDR block.
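A hedged sketch of such a template (the parameter names and defaults here are illustrative; the real one lives in the repository):

    AWSTemplateFormatVersion: '2010-09-09'
    Description: Create a plain VPC - nothing else
    Parameters:
      VpcName:
        Type: String
        Default: testvpc
      VpcCidr:
        Type: String
        Default: 10.0.0.0/16
    Resources:
      VPC:
        Type: AWS::EC2::VPC
        Properties:
          CidrBlock: !Ref VpcCidr
          Tags:
            - Key: Name
              Value: !Ref VpcName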

Let's look at Ansible. There is a built-in module for this: ec2_vpc_net.

The only difference here is that the variables are split out into a separate file (as per Ansible best practices).
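Roughly what that looks like, collapsed into one listing (the file name and variable names are my assumptions - the real playbook is in the repository):

    # create-vpc.yml
    - hosts: localhost
      connection: local
      gather_facts: false
      vars_files:
        - vars/vpc_vars.yml
      tasks:
        - name: Create the VPC
          ec2_vpc_net:
            name: "{{ vpc_name }}"
            cidr_block: "{{ vpc_cidr }}"
            region: "{{ region }}"
            state: present

    # vars/vpc_vars.yml
    vpc_name: testvpc
    vpc_cidr: 10.0.0.0/16
    region: us-east-1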

Last but not least - Terraform.

Here the provider is split out into a separate file and the variables into another file (as per Terraform best practices).
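A hedged sketch of that layout, collapsed into one listing (file names, region and CIDR are illustrative assumptions):

    # provider.tf
    provider "aws" {
      region = "${var.region}"
    }

    # variables.tf
    variable "region" {
      default = "us-east-1"
    }

    variable "vpc_cidr" {
      default = "10.0.0.0/16"
    }

    # main.tf
    resource "aws_vpc" "vpc" {
      cidr_block = "${var.vpc_cidr}"

      tags = {
        Name = "testvpc"
      }
    }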

First Score - # lines of Code (Including all nested files)


Ansible - 19
CloudFormation - 28
Terraform - 29

Second Score - Ease of deployment / teardown.


I will not give a numerical score here - just to mention a basic difference between the three options.

Each of the tools uses a simple command-line syntax to deploy:

  1. CloudFormation

    aws cloudformation create-stack --stack-name testvpc --template-body file://vpc_cloudformation_template.yml

  2. Ansible

    ansible-playbook create-vpc.yml

  3. Terraform

    terraform apply -auto-approve

The teardown is a bit different
  1. CloudFormation stores the information as a stack - and all you need to do to remove the stack and all of its resources is to run a simple command of:

    aws cloudformation delete-stack --stack-name <STACKNAME>

  2. Ansible - you will need to create an additional playbook for tearing down the environment - it does not store the state locally. This is a drawback

    ansible-playbook remove-vpc.yml

  3. Terraform - stores the state of the deployment - so a simple run will destroy all the resources

    terraform destroy -auto-approve
The last one I wanted to address was the time it took to deploy/tear down the resources for each tool - and I think that the differences here are quite interesting.

I ran a for loop through 3 iterations to bring up the VPC and tear it down, and timed the duration for each run.
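Something along these lines (shown here for Terraform; the equivalent commands were used for the other two tools) - a sketch, not the exact script:

    for i in 1 2 3; do
      time terraform apply -auto-approve
      time terraform destroy -auto-approve
    done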

You can find the full output of the runs below:

Results


CloudFormation
create: 31.987s
destroy: 31.879s

Ansible

create: 8.144s
destroy: 2.554s


Terraform

create: 17.452s
destroy: 12.652s

So to summarize - it seems that Ansible is the fastest of them all - there are a number of reasons why this is the case (and I will go into more detail on this in a future post).

This is by no means a recommendation to use one tool or the other - or to say that one tool is better than the other - just a simple side by side comparison between the three options that I have used in the past.

Next blog post will go into a slightly more complicated scenario.

Thoughts and comments are always welcome, please feel free to leave them below.

2018-06-04

Microsoft to acquire Github??

Microsoft is currently in negotiations to acquire GitHub. GitHub.com. GitHub - the place where we all store our code, all our open source code.

I was actually quite shocked. There is this article. The first thing that surprised me was that Microsoft has been in negotiations with GitHub for quite some time. If they do buy GitHub then it could possibly change the world of open source. Almost everybody I know stores their code on GitHub. There are a few other places where you can store your code, for example Bitbucket, but the main code repository in the world is definitely GitHub.

If this acquisition actually goes through - I was trying to understand what it would actually mean. Microsoft would now have access to every single line of code - which, if you come to think of it, is actually quite a frightening thought. Bloody scary!! All the insights into the code, everything - the options are pretty much endless. Yes, of course there will be terms stating what exactly they can do with all this data, what data they will have access to and what they will keep private. We are wary of big brother and our privacy - but entrusting all our code to a potential competitor?

Microsoft has traditionally been perceived as the arch-villain of open source. But that has changed. Microsoft has become one of the biggest open source contributors in the world, largely because of Visual Studio Code, but they also contribute to a good number of other open source projects. There is a culture change within Microsoft, where the direction has become open source first, and if you don't do open source you have to justify why that is not the case. I was personally exposed to this transformation during a few days I spent at the Microsoft mothership a couple of weeks ago. I participated in a number of briefings from several leading architects, project managers and product managers within the company and was pleasantly surprised that they are becoming an open source company themselves.

So the consequences of such an acquisition are not yet clear to me. To the GitHub people I have to say "Good for you - a huge exit, enjoy the fame and the glory that comes with being bought out by Microsoft". Whatever the number may be (two to five billion dollars), it is not a small sum. For the rest of the people in the world who are using GitHub this might be a difficult situation. There are not many neutral places like Switzerland left in the world, and definitely not many neutral places like GitHub left in the software world any more.

Everybody has an edge. They might not say that they have ulterior motives, but it is all about providing revenue for your company. Not to mention what this edge will give Microsoft as a cloud provider that now has access to the biggest code repository in the world and a huge developer base which can now tie in conveniently to Azure. The conspiracy theories and reactions on social media are really amusing...

Something to think about..

Let me ask you, readers of my blog. If Microsoft were to acquire GitHub, would you continue storing your code in a Microsoft-owned repository? Yes or no?

Feel free to leave your comments and thoughts below.

2018-05-29

My commentary on Gartner’s Cloud MQ - 2018

As a true technologist – I am not a fan of analyst reports, and in some circles Gartner is a dirty word – but since most of the industry swears by Gartner – I went over the report.

Here are my highlights… (emphasis is mine – not from the source).

Most customers have a multicloud strategy. Most customers choose a primary strategic cloud IaaS provider, and some will choose a secondary strategic provider as well. They may also use other providers on a tactical basis for narrow use cases. While it is relatively straightforward to move VM images from one cloud to another, cloud IaaS is not a commodity. <- No shit Sherlock! Each and every cloud wants you using their services and ditching the competition which is why there will never be a standard across clouds.

Customers choose to adopt multiple providers in order to have a broader array of solutions to choose from. Relatively few customers use multicloud architectures (where a single application or workload runs on multiple cloud providers), as these architectures are complex and difficult to implement. <- Damn straight! 

Managing multiple cloud IaaS providers is challenging. <- Really????? 
Many organizations are facing the challenge of creating standardized policies and procedures, repeatable processes, governance, and cost optimization across multiple cloud providers. "Single pane of glass" management, seamless movement across infrastructure platforms and "cloudbursting" are unlikely to become reality, even between providers using the same underlying CIF or with use of portable application container technology. <- This is an interesting one … Everyone in the Kubernetes community will probably not agree – because that is exactly what so many organizations are hoping that Kubernetes will give them, their holy grail of Cloud Agnostic..

Note that the claim that an ecosystem is "open" has nothing to do with actual portability. Due to the high degree of differentiation between providers, the organizations that use cloud IaaS most effectively will embrace cloud-native management, rather than allow the legacy enterprise environment to dictate their choices.

"Lift and shift" migrations rarely achieve the desired business outcomes. Most customers who simply treat cloud IaaS like "rented virtualization" do not achieve significant cost savings, increased operational efficiency or greater agility. It is possible to achieve these outcomes with a "lift and optimize" approach — cloud-enabled virtual automation — in which the applications do not change, but the IT operations management approach changes to be more automated and cloud-optimized. Customers who execute a lift-and-shift migration often recognize, after a year, that optimization is needed. Gartner believes it is more efficient to optimize during the migration rather than afterward, and that customers typically achieve the best outcomes by adopting the full range of relevant capabilities from a hyperscale integrated IaaS+PaaS provider. <- This is exactly what all the cloud vendors are selling, Start with lift and shift – and then go all in

What Key Market Aspects Should Buyers Be Aware Of?
The global market remains consolidated around two clear leaders. The market consolidated dramatically over the course of 2015. Since 2016, just two providers — AWS and Microsoft Azure — have accounted for the overwhelming majority of the IaaS-related infrastructure consumption in the market, and their dominance is even more thorough if their PaaS-related infrastructure consumption is included as well. Furthermore, AWS is many times the size of Microsoft Azure, further skewing the market structure. Most customers will choose one of these leaders as their strategic cloud IaaS provider.

Chinese cloud providers have gone global, but still have limited success outside of the domestic Chinese market. The sheer potential size of the market in mainland China has motivated multiple Chinese cloud providers to build a broad range of capabilities; such providers are often trying to imitate the global leaders feature-for-feature. While this is a major technological accomplishment, these providers are primarily succeeding in their domestic market, rather than becoming global leaders. Their customers are currently China-based companies, international companies that are doing business in China and some Asia/Pacific entities that are strongly influenced by China. <- The Chinese market is just that – a Chinese market

AWS

Provider maturity: Tier 1. AWS has been the market pioneer and leader in cloud IaaS for over 10 years.

Recommended mode: AWS strongly appeals to Mode 2 buyers, but is also frequently chosen for Mode 1 needs. AWS is the provider most commonly chosen for strategic, organization wide adoption. Transformation efforts are best undertaken in conjunction with an SI. <- This does not mean that you can’t do it on your own – you definitely can – but you will sweat blood and tears getting there.

Recommended uses: All use cases that run well in a virtualized environment.

Strengths

AWS has been the dominant market leader and an IT thought leader for more than 10 years, not only in IaaS, but also in integrated IaaS+PaaS, with an end-of-2017 revenue run rate of more than $20 billion. It continues to aggressively expand into new IT markets via new services as well as acquisitions, adding to an already rich portfolio of services. It also continues to enhance existing services with new capabilities, with a particular emphasis on management and integration.

AWS is the provider most commonly chosen for strategic adoption; many enterprise customers now spend over $5 million annually, and some spend over $100 million.<- No-one said that cloud is cheap – on the contrary. 
While not the ideal fit for every need, it has become the "safe choice" in this market, appealing to customers that desire the broadest range of capabilities and long-term market leadership.

AWS is the most mature, enterprise-ready provider, with the strongest track record of customer success and the most useful partner ecosystem. Thus, it is the provider not only chosen by customers that value innovation and are implementing digital business projects, but also preferred by customers that are migrating traditional data centers to cloud IaaS. It can readily support mission-critical production applications, as well as the implementation of highly secure and compliant solutions. Implementation, migration and management are significantly eased by AWS's ecosystem of more than 2,000 consulting partners that offer managed and professional services. AWS has the broadest cloud IaaS provider ecosystem of ISVs, which ensures that customers are able to obtain support and licenses for most commercial software, as well as obtain software and SaaS solutions that are preintegrated with AWS. <- which is exactly why they have been, are and will probably stay the market leader
                     

Google

Provider maturity: Tier 1. GCP benefits, to some extent, from Google's massive investments in infrastructure for Google as a whole.

Recommended mode: GCP primarily appeals to Mode 2 buyers. <- Google is for the new stuff, don’t want no stinking old legacy

Recommended uses: Big data and other analytics applications, machine learning projects, cloud-native applications, or other applications optimized for cloud-native operations.

Strengths

Google's strategy for GCP centers on commercializing the internal innovative technology capabilities that Google has developed to run its consumer business at scale, and making them available as services that other companies can purchase. Google's roadmap of capabilities increasingly targets customers with traditional workloads and IT processes, as well as with cloud-native applications. Google has positioned itself as an "open" provider, with a portability emphasis that is centered on open-source ecosystems.<- hell yeah – they are the ones that gave the world Kubernetes 
Like its competitors, though, Google delivers value through operations automation at scale, and it does not open-source these proprietary advantages.

GCP has a well-implemented, reliable and performant core of fundamental IaaS and PaaS capabilities — including an increasing number of unique and innovative capabilities — even though its scope of services is not as broad as that of the other market leaders. Google has been most differentiated on the forward edge of IT, with deep investments in analytics and ML, and many customers who choose Google for strategic adoption have applications that are anchored by BigQuery.

Google can potentially assist customers with the process of operations transformation via its Customer Reliability Engineering program (currently offered directly to a limited number of customers, as well as in conjunction with Pivotal and Rackspace). The program uses a shared-operations approach to teach customers to run operations the way that Google's site reliability engineers do.

Microsoft

Provider maturity: Tier 1. Microsoft's strong commitment to cloud services has been rewarded with significant market success.

Recommended mode: Microsoft Azure appeals to both Mode 1 and Mode 2 customers, but for different reasons. Mode 1 customers tend to value the ability to use Azure to extend their infrastructure-oriented Microsoft relationship and investment in Microsoft technologies. Mode 2 customers tend to value Azure's ability to integrate with Microsoft's application development tools and technologies, or are interested in integrated specialized PaaS capabilities, such as the Azure Data Lake, Azure Machine Learning or the Azure IoT Suite.

Recommended uses: All use cases that run well in a virtualized environment, particularly for Microsoft-centric organizations.

Strengths

Microsoft Azure's core strength is its Microsoft heritage — its integrations (both current and future) with other Microsoft products and services, its leverage of the existing Microsoft ISV ecosystem, and its overall strategic importance to Microsoft's future. Azure has a very broad range of services, and Microsoft has steadily executed on an ambitious roadmap. Customers that are strategically committed to Microsoft technology generally choose Azure as their primary cloud provider. <- This! This is the primary reason why Microsoft has come up in the Cloud in the past few years and why they will continue to push hard on Amazon’s heels. They are the one and only Cloud provider with a complete and true hybrid story

Microsoft Azure's capabilities have become increasingly innovative and open, with improved support for Linux and open-source application stacks. Furthermore, many customers that are pursuing a multicloud strategy will use Azure for some of their workloads, and Microsoft's on-premises Azure Stack software may potentially attract customers seeking hybrid solutions. <- Having spent a few days a week or two ago on the Microsoft campus – this is unbelievably true. They are a completely different company – and for the better.


So there were significant changes in the number of participants from last year – many vendors were left out. Here are the changes in the Inclusion and Exclusion criteria – which probably caused the shift.

2018

Market traction and momentum. They must be among the top global providers for the relevant segments (public and industrialized private cloud IaaS, excluding small deployments of two or fewer VMs). They must have ISO 27001-audited (or equivalent) data centers on at least three continents. They must have at least one public cloud IaaS offering that meets the following criteria:

- If the offering has been generally available for more than three years: A minimum of $250 million in 2017 revenue, excluding all managed and professional services; or more than 1,000 customers with at least 100 VMs.
- If the offering has been generally available for less than three years: A minimum of $10 million in 2017 revenue, excluding all managed and professional services, as well as a growth rate of at least 50% exiting 2017.

2017

Market traction and momentum. They must be among the top 15 global providers for the relevant segments (public and industrialized private cloud IaaS, excluding small deployments of one or two VMs), based on Gartner-estimated market share and mind share.

2018

Technical capabilities relevant to Gartner clients. They must have a public cloud IaaS service that is suitable for supporting mission-critical, large-scale production workloads, whether enterprise or cloud-native. Specific generally available service features must include:

- Software-defined compute, storage and networking, with access to a web services API for these capabilities.
- Cloud software infrastructure services facilitating automated management, including, at minimum, monitoring, autoscaling services and database services.
- A distributed, continuously available control plane supporting a hyperscale architecture.

- Real-time provisioning for compute instances (small Linux VM in five minutes, 1,000 Linux VMs in one hour) and a container service that can provision Docker containers in seconds.
- An allowable VM size of at least 16 vCPUs and 128GB of RAM.
- An SLA for compute, with a minimum of 99.9% availability.
- The ability to securely extend the customer's data center network into the cloud environment.
- The ability to support multiple users and API keys, with role-based access control.
                                                                       
The 2018 inclusion criteria were chosen to reflect the key traits that Gartner clients are seeking for strategic cloud IaaS providers, and thus reflect minimum requirements across a range of bimodal use cases.

2017

Technical capabilities relevant to Gartner clients. The public cloud IaaS service must be suitable for supporting production workloads, whether enterprise or cloud-native. Specific service features must include:

- Data centers in at least two metropolitan areas, separated by a minimum of 250 miles, on separate power grids, with SSAE 16, ISO 27001 or equivalent audits
- Real-time provisioning (small Linux VM in five minutes)
- The ability to scale an application beyond the capacity of a single physical server
- An allowable VM size of at least eight vCPUs and 64GB of RAM
- An SLA for compute, with a minimum of 99.9% availability
- The ability to securely extend the customer's data center network into the cloud environment
- The ability to support multiple users and API keys, with role-based access control
- Access to a web services API

So I think it is really sneaky that Gartner changed their criteria – and probably what pulled most of the players out of the game was the part about containers and the rest of the managed services. Some were also probably pushed out because of the revenue requirement. They are very much still alive – they are very much still making money – but it is clear who is the boss, who is hot on their heels, and who is going to be picking up the scraps.

Here are the quadrants – year after year.

Magic Quadrant for Cloud Infrastructure as a Service, Worldwide 

(The original post embeds the Magic Quadrant charts from the last few years here.)

2018-04-30

To Be a 10x Engineer, or Not to Be

I am sure you are all familiar with those select few in your company who supposedly have super powers or hidden technological gifts.

Yes, I mean those co-workers who know exactly how to fix the most esoteric issues that no one has come across, ever. Perhaps you share a cubicle with the guy who’s able to conceive of breakthroughs time after time after time. Or maybe you were hired the same month as the woman who can code like no one has coded before, and it’s like reading poetry — smooth, with meaning, and plays on your most inner emotions. Or you’re the manager who hired the weird dude that sits in the basement, who can be woken up at 2:43 am on a Sunday morning after partying endless hours the night before, and still be able to drone off the precise sequence of events that you need in order to prevent your production NoSQL database from exploding … because someone forgot to run maintenance.

Continue reading the rest of the article

2018-04-09

Time for a New Chapter - Hello CyberArk!


A bit of history

After 13 years at Cisco - I have decided to challenge myself and embark on a new adventure.

I first would like to express my gratitude to those who have helped me grow over the years.

Starting out 13 years ago, I was part of the helpdesk at a company called NDS (which was acquired by Cisco about 5 years ago), supporting users over the phone and servicing desktops and laptops.

From there I moved to the systems group, managing Active Directory, and assumed additional responsibilities. Over the years I architected and deployed one of the largest VMware deployments in Israel, continued to grow with the technologies within the company, and grew professionally to where I am today.

The people I have had the honor to work with over the years are the greatest resource I will take with me for the future, and the one I will miss the most. The crazy projects we pulled off, and the outrageous ones that sometimes we did not - these are things that I will always cherish.

From every experience over the years, I have learned something new, and it has allowed me to grow. For that I am eternally thankful.

Why am I leaving Cisco?

13 years is a long time to stay at one company and it is time for a change, time for a bigger challenge. Cisco has allowed me to grow immensely, pivot to new technologies over the years, and play with new stuff day in and day out. Leaving was a hard decision, because change is a scary thing - scary for me, scary for anyone. I know the people, I know the company, I know the ropes.

In spite of all this - I needed a change, an opportunity to explore new technologies and new areas of interest.



Hello CyberArk!

Starting from Monday, April 16th, I will assume the position of DevOps & Cloud Architect at CyberArk. I am really excited to start this new journey.

CyberArk is the only security company laser-focused on striking down targeted cyber threats, those that make their way inside to attack the heart of the enterprise. Dedicated to stopping attacks before they stop business, CyberArk is trusted by the world’s leading companies — including more than 50% of the Fortune 100 — to protect their highest-value information assets, infrastructure and applications.

I will continue to be involved in AWS, branching out into additional cloud providers as well, and will focus on helping CyberArk expand its offerings - giving customers the choice of running a best-of-breed solution in the location of their choice, along with some new and innovative ways of securing their organization and resources in the cloud.

(I know that I have neglected this blog for a good part of a year (for a number of reasons) - something that I am going to rectify starting next week)

I am so excited, and hyped to start this new chapter !!!

2018-01-30

5 #AWS Certifications in 237 days

Today I completed my 5th AWS Certification. Something which I had hoped to complete before the end of 2017, but life got in the way.

I started dabbling with AWS a while ago - I signed up for a free account way back at the end of 2014 and started to play with it, but not too much.

It was not until the beginning of 2017 that I really went in full force. When I declared my goals for Q1 2017, one of them was to work on AWS. I am happy to say that I have accomplished this goal - this is what I do all day, every day.

So during the past year I decided to also pursue the AWS certification track, mostly to prove to myself that I could, but also to learn more about AWS, their products and solutions.

Here is my journey.

Solutions Architect - Associate

This was the first one I did. It took me almost six months since starting to use AWS to feel comfortable with my knowledge to go for the exam.

I used the A Cloud Guru course - which was great for this certification, just the right amount of content - of course you need to know what you are doing in AWS (at least a little bit). I went over the lessons, did the quizzes, took the practice exam on the AWS certification site (and failed). I read the whitepapers (yes – all of the suggested whitepapers). I read the FAQs for SNS, SQS, S3 and EC2. Whatever I did not feel comfortable with – I went over in the actual AWS console – and learned how to use it.

Trying to remember all the options is not realistic – but you will be able to eliminate the really stupid options that are in the test.
 
Taking a test on a new technology is always scary - this was as well.

80 minutes. 60 questions.

Read the questions properly – even if you are not sure – there are some really obvious answers that are blatantly incorrect – so that will help you eliminate the noise from the question.

This is an entry level exam - which expects you to understand the AWS concepts - and in some cases - how to use them.

SysOps Administrator - Associate

Two weeks later I sat the next exam. The amount of overlap with the previous exam is astounding. Again I used the A Cloud Guru course - but here I skimmed over the videos - as they were repetitive of the previous exam. There was a bit more emphasis on the CloudWatch and logging aspects of the certification - but nothing too deep.

80 Minutes. less

Onwards and upwards.

Solutions Architect - Professional

Everything I had heard and read about this exam was the same - it is a beast. Damn, bloody hard. Not only do you have to understand the AWS services, how they work and when you should use them - but more importantly - you have to know when they will not be a good fit for the particular scenario.

It took me almost 6 weeks to prepare for this one.

The A Cloud Guru course was a waste of time. Not enough depth – the quizzes and practice questions are superficial – and it is mostly a re-hash of the previous SysOps Associate and Solutions Architect Associate courses – with maybe two or three lectures added in. I went through the whole thing until I realized that it was not enough.
 
I took a practice test available from AWS (everyone that passes a previous exam gets a free voucher). I found it good practice (40 questions – with a time limit), although the questions on the practice exam were MUCH harder than the real thing, and I failed the practice exam.
 
I received recommendations to do the CSA-Pro course from Linuxacademy.com – the first 7 days are free – and then it is $29/month – I used the subscription for one month to go through the whole course.
 
The lecturer on the course speaks SO slowly that it can be really annoying – but luckily you can speed the lectures up to x1.5, which makes it a lot better. The course is long – the walkthroughs are excellent! And the labs are also really good and give you some hands-on time with the features discussed - if you have not used them before. They have a good practice exam – 3 hours, 80 questions – you can take it multiple times – but the pool of questions is almost exactly the same.
 
The most important part of this is the option to practice sitting on your butt for 3 hours concentrating on the exam – it is crucial to prepare yourself mentally for the ordeal – this is one of the biggest challenges in the exam.
 
The blueprint says you have to be an architect with at least 1-2 years of experience on AWS (well, we all know that I had no more than 6-8 months).
 
The test is a different level completely from the associate exam. A lot more detail. The questions are scenarios about how you combine multiple products within AWS and create a robust/cost effective/quick solution for each scenario. Not only do you need to know what each product can do – but more importantly – you need to know what each CANNOT do. The answers are similar enough to cause problems and you need to pay attention to the details.

So a little bit about the exam.
 
170 minutes (yes almost 3 hours)
77 questions
 
I went through all the questions within about 140 minutes. The two biggest hurdles (IMHO) are time and being able to concentrate for almost 3 hours straight. You do not have a lot of time to spend on each question (just over 2 minutes per question) and some of them are long. So you have to be able to read the question – filter out the nonsense and noise (and there is enough of it) – and zoom in on what they are looking for.
 
Most questions had at least 1-2 answers that I could disqualify off the bat – which makes it easier to focus on the ones left.
 
There were 3-4 questions where I had absolutely no idea what the correct answers were. I chose one of the answers and marked the question so I could return to it later. If there were questions where I was uncertain between two of the choices – I marked those down as well – and jotted down my possibilities on the paper provided by the testing center. Most questions were single-answer – there were a few choose-X questions in the test as well – but not many.
 
For the last 30 minutes – I went back to the questions I had marked as "no idea" – re-read them – and made an educated guess by eliminating the obviously wrong answers. For the rest of the questions that I had marked as unsure – I re-checked the options I had noted down – and confirmed my best choice.
Questions that I was sure I knew the answer to – I did not even go over.
 
I finished my review with 1 minute to spare… (169 minutes)

All in all – a fair but difficult exam – grueling but fair… Glad to be past it.

And then I went on a summer vacation with the family. Time to clear my head, chill and forget about AWS for a while.

Developer - Associate

I was a bit scared about this one - I must say. I am not a developer. Never have aspired to be. I dabble in code and can write a script with the best of them - but a development centric certification - I did not look forward to.

I used the A Cloud Guru course here. There was a lot of overlap with the previous 3 exams, but also a lot of new stuff that I was not acquainted with - such as CloudFormation, DynamoDB, RDS, Elastic Beanstalk and such.

The exam was not difficult - it is an entry level exam.

Then came the Jewish Holidays, Re:Invent and the first 3 chapters of my book The Cloud Walkabout.

DevOps Engineer - Professional

I must have re-scheduled this exam at least 6 times, really six times - because I felt I was not ready.

As a professional level certification - I was expecting hell like the Solutions Architect. Expecting that you have to know things in detail - a lot more detail than the Associate exam (and I was right).

The A Cloud Guru course was again too shallow. There are things in the blueprint that I do not use in my daily work - some of them I have never even touched before - and I found the content not deep enough for what I was expecting to see in a Pro exam.

The Linux Academy course was much better - again the instructor was a bit too slow for my taste (super speed helps though) and the practice exam was quite good, although 70% of the questions on my exam I had never seen before.

You need to know how a development pipeline works - I mean really works: blue/green deployments, rolling upgrades, CloudWatch, Auto Scaling in and out, CloudFormation, Elastic Beanstalk and OpsWorks are some of the in-depth topics you need to know.

The exam was not completely scenario based - but more about technical details on some of the products. It is 80 questions in 3 hours - so you have to manage your time - but nowhere close to the pressure on the Solutions Architect - Professional exam.

And lo and behold…

My Closing Thoughts

  1. There is a lot of information in the AWS exams that is really outdated; all the new shiny stuff, like Lambda, ECS, Kubernetes etc., is barely there - maybe a small reference here and there - but no real coverage of the new stuff. And there is stuff in there that no-one, or hardly anyone, uses.
  2. I have learned a huge amount over these last 6 months, both by reading, listening, watching videos and lectures, and by doing - mostly by doing, more than anything else.
  3. These are not paper certs, you cannot pass with only reading or going over braindumps or cheatsheets. You need to actually use the products, understand how they work, and where they fit in the overall picture.
  4. I did not take a single AWS course - I learned everything on my own. I have always been a self-learner and prefer to play with the tech myself rather than spend the money on an official course.
  5. The cost of the certification exams is not cheap, in total $1,050 for the 5 exams (PRO exams are $300 a pop) and luckily I had the costs covered by my employer - which made it easier - but also more pressure to pass - ROI you know.

What’s next - I don’t know.. Specialty exams (Networking, Big Data, Security)? Perhaps - I am not sure.

I do know that there are not many people who have the AWS Professional certifications in Israel, and I am pretty sure that I can count the number of people in Israel with all 5 on one or two hands.

If you are looking to prove your knowledge and expertise in AWS - then go for it. It is possible - it takes time, commitment and support from your surroundings, but it can be done.

I am proud of my achievement and hope this post will give you the motivation to go out and learn something new.

As always, feel free to leave your thoughts and comments below.

2018-01-27

Kubernetes Is Not the Silver Bullet

Does the following sound familiar to you?

The industry latches on to a new technology and everyone falls under its spell, a spell that makes them think this latest technology will solve any and all of the problems we have suffered from in the past.


The Evolution of Illusion
I experienced this phenomenon when our IT department first discovered blades. It would solve all our problems, everyone said: cabling, cooling, power, and real estate. And, at first, that seemed true; that is, until it brought with it a whole new set of problems, such as insufficient bandwidth, network contention, and congestion.

Then came virtualization and VMware. Better utilization! Faster time to delivery! Consolidation! But… it soon revealed a whole new set of problems, like insufficient disk throughput, a greater blast radius when a single server goes down, not to mention VM sprawl and increased licensing costs.

Read the rest of the blog at the source..

2018-01-25

The #AWS PowerShell Docker Container

I cannot believe it is over 3 years since I created the openstack-git-env container. At the time I was really frustrated at how hard it was to get started with setting up an environment  to start contributing to OpenStack.

Well I have now moved on - focused primarily on AWS - and I have a good amount of PowerShell experience under my belt - but since I moved off a Windows laptop 3 years ago - I hardly use PowerShell anymore. Which is a shame.

Luckily Microsoft have released a version of PowerShell that will work on Mac and Linux - so I can start getting back on the horse.

I looked at the instructions for setting up the PowerShell commands for AWS - which led me to the AWS documentation page. But the missing link there is how you install PowerShell on your Mac/Linux machine - there is no documentation for that, and doing it by hand is complicated and error prone.

So I was thinking - there must be a container already available for PowerShell - it can’t be that everyone goes through the hoops of installing everything locally.

And lo and behold - there is one - https://hub.docker.com/r/microsoft/powershell/

So I built on top of this - the AWS PowerShell container.

All you need to do is set an alias on your machine, add a script that will launch the container - and Bob’s your uncle - you are ready to go.
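As a hedged example of what that looks like (the script name, mount and image name below are my assumptions - the actual ones are in the GitHub repository):

    # ~/.bashrc (or equivalent) - point an alias at a small launch script
    alias awsps='~/bin/aws-powershell.sh'

    # ~/bin/aws-powershell.sh - run the container with your AWS credentials mounted in
    docker run -it --rm \
      -v "$HOME/.aws":/root/.aws \
      maishsk/aws-powershell   # image name is an assumption - see the repo for the real one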

All the information is located on the repository.


Please let me know if you think this is useful - and if there are any improvements you would like to see.

The code is on Github - feel free to contribute or raise any issues when/if you find them.