2019-06-12

Starting a new Journey #AWS

Simon Sinek has a great talk - about how great leaders inspire great action. I learned something really important from this talk even though it is almost 10 years old.






By explaining things in the wrong way - we miss the opportunity to make a great impact, to change the world.
  1. We usually start with the What.
  2. Then the How..
  3. And only at the end - we get into the Why...

It should be the reverse.

Following Simon's advice I will start with the why..

How?  Why?  What?

Why?

I firmly believe that the future is the public cloud. I believe that we can accomplish so much more, so much faster, when we leave heavy lifting for others. This allows all of us to focus on providing the actual value to our customers without having to worry about the underlying infrastructure.

I know that I have a huge amount still to learn, but I also have a huge amount of knowledge, experience and insight that I can share with others. I have been doing this for many years, and see this not only as a way to put bread on the table, but also a way to make a real change in the world.

I want other people to benefit from what I have to give.


How?

I work with teams on how to start their journey to the cloud, how to make use of the technologies available to them. This includes, writing code, continuously learning (myself included), gaining more knowledge, and ultimately sharing that knowledge with others. I have built pipelines, migrated workloads into the cloud, failed miserably in some cases, continuously improved and iterated to get better the whole time.

Working on a regular basis with customers to help them on their journey, through their challenges along the way, celebrate their success stories with them, experience the pain and anguish with their failures / disasters - but above all - to be an advisor for my clients - with their best interests in mind.


What?

The change I have decided to embark on (and the challenge I have decided to accept) is moving my skills and energy in a direction where I feel I can make even more of an impact, help more people, help even more organizations, and not only focus on a single company, but make even a bigger impact.

Starting July 15th I will be joining Amazon (Web Services) as a Senior Solutions Architect.

I will be working with an amazing team of solutions architects and talented people in a company that I really believe can change the way we use technology, make it better, more efficient, and do amazing things.

My last day at CyberArk will be June 30th, and then I go on a long-deserved and well-earned vacation for two weeks.

I have learned a great deal during my time here at CyberArk, worked with amazing people, and learned a lot about the security space, its challenges, its fears, its constraints. None of it is easy. It is not a cloud-native world and the problems this industry faces are not easy ones to solve, especially in what could be termed as "legacy" environments. For all this knowledge, insight and experience over the last year - I am extremely grateful.

I cannot wait for day 1 on July 15th!!!

2019-06-10

Book Review: Mastering AWS Cost Optimization

I dabble in AWS every now and again :) and a new book just came out - so obviously I wanted to go through it and give it a read.

Mastering AWS Cost Optimization: Real-world technical and operational cost-saving best practices
(Eli Mansoor and Yair Green)



So first, some disclosure - I have met with Eli a few times throughout my career - we had some business discussions during his Rackspace days. Eli reached out to me and asked me to read the book and post a review.

I received a free paperback copy. 

I finished the book in two days (it was chag and I had a lot of time to read). It is quite clear that a lot of knowledge and detail went into the writing of the book.

Eli and Yair took a methodological approach throughout the book. They focused on the three main aspects of your AWS cost - Compute, Storage and Networking (with a strong emphasis on the first two).

They used a methodology which they name KAO (Knowledge, Architecture, Operation) which in my honest opinion provided a logical and clear flow for the book and made it an easy read. 

KAO Methodology
They go into detail on how each of the services is used - sometimes in really great detail - and there are a number of examples in the book that explain exactly how things work.

The last section of the book is focused on Operations, with different suggestions and recommendations on how to adjust your current practices to become more "cost aware/optimized". I for one would have preferred that this section had been more of the focus of the entire book - but that could just be me. There are a great number of gems in this section - with something new for everyone (me included!!)

As I said, this was an easy read, well structured and very informative. Eli and Yair have done a great job, diving deep on a topic that is important to us all (but has no real good source of information - besides experience) - but have also left enough space to expand on this book and to provide a more detailed deep dive and focus on more specific subjects in the future.

I would definitely give it a read!

2019-05-27

(Not) Real Scientific Proof that AMI has #3syllables

AWS has 26 (yes, I counted them) different products with exactly 3 letters in their names (or derivatives thereof) - let's go through them one at a time.


  • A-C-M AWS Certificate Manager - Is not pronounced ac-em (also not hack-em) 
  • D-M-S Database Migration Service - Is not pronounced dems nor dee-miss (and also not dimms)
  • E-B-S Elastic Block Store - Is not pronounced ebbs (and we are not being washed back out to sea), nor ee-bzz (people might be allergic to bees) 
  • E-C-2 (Well it should actually be E-C-C - but EC2 sounds so much sexier) Elastic Compute Cloud - Is not pronounced ek-2 (or even eck - otherwise people might get confused with "what the heck2")
  • E-C-R - Elastic Container Registry - Is not pronounced Ecker-R (sounds too much like pecker) 
  • E-C-S - Elastic Container Service - Is not pronounced eh-ckes neither ee-cees nor Ex (People would be wary to use a product named Amazon X - they might think that AWS is taking after Google with their Alphabet) 
  • E-F-S - Elastic File System - Is not pronounced ef-s neither ee-fees nor eefs
  • E-K-S - Elastic Container Service for Kubernetes - pronouncing this x-kay (ECS-K) would sound too much like Xray (another AWS product). Also see above about E-C-S 
  • E-M-R Elastic MapReduce - We don't call it ee-mer - nor emmer (otherwise all the Dutch people might think that this is an S3 look-alike) 
  • F-S-X - I can't find what this stands for - except for FSx :) - not ef-sex (that is not politically correct..) 
  • I-A-M - Identity and Access Management - no-one uses I-AM - (Dr. Seuss would be happy with I-AM-SAM - SAM-I-AM)
  • I-O-T - Internet Of Things - Not eye-ot (people might think there are more than 7 dwarfs in the service - eye-o, eye-o it's off to work we go..) 
  • K-M-S Key Management Service - Is not pronounced kems - nor kee-mes (keemes - the new AWS meme-as-a-service product is probably not a good idea either) 
  • L-E-X - this is actually the product name - Amazon Lex - even though the French might have enjoyed it if it was actually Le'X (but then again people don't like having their Ex in the spotlight) 
  • M-S-K - Managed Streaming for Kafka - Is not pronounced musk (Elon might not like it), em-sek (could be too fast for us to use). And of course AWS had to name a product after me.
  • P-H-D - Personal Health Dashboard - Is not pronounced pee-hud and phud - would get them in trouble with spreading Fear Uncertainty and Doubt
  • R-A-M - Resource Access Manager - Not (a battering) ram (nor the ancient Indian king Raam)
  • R-D-S - Relational Database Service - Is not pronounced ar-dis, nor ar-dees (and definitely not the new time machine service - tardis) 
  • S3 - Simple Storage Service - This is a 3 letter product - S-S-S (S3 is so much sexier) - Not sss (people might think there are snakes) - here I conceded - ess-ess-ess brings up really bad vibes 
  • S-E-S - Simple Email Service - Is not pronounced Sess nor sees (otherwise us customers might think this is a new tax in eu-west-1 or ap-south-1) 
  • S-N-S - Simple Notification Service - Is not pronounced S-ness, neither sneeze nor Sans (and not nessie either - she is still somewhere in the Loch) 
  • S-Q-S - Simple Queue Service - Is not pronounced see-ques - nor squeeze 
  • S-S-O - Single Sign On - Is not pronounced sa-so neither ses-o nor se-so (just because I say so) 
  • S-W-F - Simple Workflow Service - Is not pronounced see-wiff - nor Swiff 
  • V-P-C - Virtual Private Cloud - Is not pronounced vee-pic, neither ve-peec nor veep-see 
  • W-A-F - Web Application Firewall - I concede - this one is #1syllable - there I said it! BUT IT IS NOT #2syllables !!

With only three exceptions (S3, Lex and WAF) - all the three-letter products in AWS are pronounced with three syllables!!!!

Just like A-M-I - which has #3syllables 

I rest my case. 

2019-03-18

The #AWS EC2 Windows Secret Sauce

Now that I have got your attention with a catchy title - let me share some of my thoughts regarding how AWS shines and how much your experience as a customer matters.

Deploying instances in the cloud is something that is relatively fast - at least when it comes to the deployment of a Linux instance.

Windows operating systems - are a whole different story.

Have you ever wondered why it takes so long to deploy a Windows instance in the cloud? There are a number of reasons why this takes so much longer.

Let me count the ways:
  1. Running Windows in the cloud - is a dumb idea - so you deserve it!! (just kidding :) ) 
  2. Seriously though - Windows images are big - absolutely massive compared to a Linux image - we are talking 30 times larger (on the best of days) so copying these large images to the hypervisor nodes takes time.
  3. They are slow to start.. Windows is not a thin operating system - so it takes time. 
With all the above said - it seems that AWS has created a really interesting mechanism with which they can reduce the amount of time it takes for an instance to start. Yes, they say it can take up to 4 minutes before you are able to remotely connect to the instance - but if you think about it - that is really a very short amount of time.
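
If you want to measure this for yourself - here is a minimal sketch (Python and boto3; the AMI ID, instance type and key pair name are placeholders you would need to replace with values from your own account) that launches a Windows instance and polls until the encrypted Administrator password becomes available - which is more or less the point at which you can actually log in.

import time
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

# Placeholder values - replace with a Windows AMI, instance type and key pair from your account
response = ec2.run_instances(
    ImageId="ami-xxxxxxxxxxxxxxxxx",
    InstanceType="t3.medium",
    KeyName="my-key-pair",
    MinCount=1,
    MaxCount=1,
)
instance_id = response["Instances"][0]["InstanceId"]
start = time.time()

# The encrypted Administrator password only shows up once the instance has
# finished its first boot - keep polling until it does
while True:
    password_data = ec2.get_password_data(InstanceId=instance_id)["PasswordData"]
    if password_data.strip():
        break
    time.sleep(15)

print(f"Password available after {time.time() - start:.0f} seconds")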

I started to look into the start time of Windows (for a whole different reason) and found something really interesting.

This is not documented anywhere - and I doubt I will receive any confirmation from AWS on the points in this post - but I am pretty confident that this is the way it works.


It seems that there is a standby pool of Windows instances that are just waiting in the background to be allocated to a customer - based on customer demand.

Let that sink in for a second: this means there is a powered-off Windows instance - somewhere in the AZ - waiting for you.

When you request a new Windows EC2 instance, an instance is taken from the pool and allocated to you. This is some of the magic sauce that AWS does in the background.

This information is not documented anywhere - I have only found a single reference to this behavior on one of the AWS forums - Slow Launch on Windows Instances

forum_post_slow



I did some digging of my own and went through the logs of a deployed Windows instance and this provided me with a solid picture of how this actually works. This is what I have discovered about the process (with the logs to back it up).

The date that this was provisioned was the 17th of March.
  1. On the 17th I launched a Windows instance in my account at 13:46:41 through the EC2 console.

    ec2_launch
  2. You can see that AWS does not make the instance available for about 4 minutes - until then you cannot login

    (have you ever wondered why?? - hint, hint carry on reading.. )

    4_minutes
  3. After waiting for just under 4 minutes I logged into the instance, and in the Windows event log you can see that the first entry in the System log is from February 13th at 06:52 (more than a month before I even requested the instance). A small sketch showing how to pull these entries yourself appears after this list.

    This is the day that the AMI was released.

    1st_boot
  4. At 06:53 that same day the instance was generalized and shutdown

    sysprep

    shutdown
  5. The next entry in the log was at 04:55 on the 17th of March - which was just under 8 hours before I even started my EC2 instance!!

    start_in_pool

  6. The hostname was changed at 04:56

    rename_generalize
  7. And then restarted at 04:57

    reboot_generalize
  8. After the instance came back up - it was shutdown once more and returned to the pool at 04:59.

    shutdown-return-to-pool.

    shutdown-return-to-pool2
  9. The instance was powered on again (from the pool) at 11:47:11 (30 seconds after my request)

    power-on-from-pool

    More about what this whole process entails further on down the post.

  10. The secret-sauce service then changes the ownership on the instance - and does some magic to manipulate the metadata on the instance - to allow the user to decrypt the credentials with their unique key and allow them to log in.

    ssm_agent
  11. The user now has access to their instance.
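
By the way - if you want to do the same kind of digging on your own instance, here is a small sketch (Python, run on the Windows instance itself, shelling out to the built-in wevtutil tool - the event count of 10 is arbitrary) that prints the oldest entries in the System event log, which is exactly where the timestamps in the timeline above came from.

import subprocess

# Query the oldest 10 events from the Windows System event log in text form.
# /rd:false reads oldest-first, /c caps the number of events returned.
result = subprocess.run(
    ["wevtutil", "qe", "System", "/c:10", "/rd:false", "/f:text"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)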

I wanted to go a bit deeper into the entity that I named the "Instance Pool". Here I assume that there is a whole process in the background (and this is where the secret sauce really lies) that does the following.

This is how I assume the flow works:


There are two different entities at work here - one is the AWS backbone service (in orange) and the other is the User/Customer (in blue). Both of the sequences work in parallel and also independently of each other.

  • AWS pre-warms a number of Windows instances in what I named the "Instance pool". They preemptively spin up instances in the background based on their predictions and the usage patterns in each region. I assume that these instances are constantly spun up and down on a regular basis - many times a day.
  • A notification is received that a customer has requested an instance from a specific AMI (in a specific region, in a specific AZ and of a specific instance type - because all of these have to match the customer's request).
  • The request is matched to an instance that is in the pool (by AMI, region, AZ, instance type)
  • The instance is then powered on (with the correct modifications of the instance flavor - and disk configuration)
  • The backend then goes and makes the necessary modifications
    • ENI allocation (correct subnet + VPC)
    • Account association for the instance
    • Private key allocation
    • User-data script (if supplied) 
    • Password rotation
    • etc.. etc..
I know that this sounds simple and straightforward - but the amount of work that goes into this "Instance Pool" is probably something that we cannot fathom. The predictive analysis that is needed here to understand how many instances should be provisioned, in which region, in which AZ - is where AWS shines, and it has been doing so for a significant amount of time.
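
To make the assumed flow above a little more concrete - here is a purely illustrative Python sketch of what such a matching step might look like. To be clear: none of this is based on any real AWS code or API, every name below is invented - it is simply the logic from the bullets above written out.

from dataclasses import dataclass

# Everything below is hypothetical - an illustration of the assumed flow,
# not anything AWS has published.

@dataclass
class PooledInstance:
    ami_id: str
    region: str
    az: str
    instance_type: str
    powered_on: bool = False

def match_request(pool, request):
    """Find a pre-warmed instance that matches the customer's request."""
    for candidate in pool:
        if (candidate.ami_id == request["ami_id"]
                and candidate.region == request["region"]
                and candidate.az == request["az"]
                and candidate.instance_type == request["instance_type"]
                and not candidate.powered_on):
            return candidate
    return None  # no match - fall back to the regular (slower) provisioning path

def fulfil_request(pool, request):
    instance = match_request(pool, request)
    if instance is None:
        return None
    instance.powered_on = True
    # The backend would now attach the ENI, associate the account, set the
    # key pair, run user-data, rotate the password, and so on.
    return instance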

This also explains why, when you deploy a custom Windows AMI, this process no longer applies - and why the provisioning time is significantly longer.

And all of this is done why?

To allow you to shave a few minutes (or seconds) off the wait time to get access to your Windows instance. This is what it means to provide an exceptional service to you, the customer, and to make sure that the experience you have is the best one possible.

I started to think - could this possibly be the way that AWS provisions Linux instances as well?

Based on how I understand the cloud and how Linux works (and some digging in the instance logs) - this is not needed, because the image sizes are much smaller and boot times are a lot shorter as well. So it seems to me that this "Instance Pool" is only used for Windows operating systems, and only for AMI's that are owned by AWS.

Amazing what you can find from some digging - isn't it?

Please feel free to share this post and share your feedback on Twitter - @maishsk

2019-03-11

The Anatomy of an AWS Key Leak to a Public Code Repository

Many of us working with any cloud provider know that you should never ever commit access keys to a public github repo. Some really bad things can happen if you do.

AWS (and I assume all the cloud providers have their equivalent) publishes its own best practices about how you should manage access keys.

One of the items mentioned there - is never to commit your credentials into your source code!!

Let me show you a real case that happened last week. 
(of course all identifiable information has been redacted - except for the specific Access key that was used - and of course it has been disabled)

Someone committed an access key to a public github repository. 

Here is the commit message 

commit xxxxxxxx26ff48a83d1154xxxxxxxxxxxxa802
Author: SomePerson <someone@some_email.com>
Date:   Mon Mar 4 10:31:04 2019 +0200

--- (All events will be counted from this point) ---

55 seconds later - I received an email from AWS (T+55s)

From: "Amazon Web Services, Inc." <no-reply-aws@amazon.com>
To: john@doe.com
Subject: Action Required: Your AWS account xxxxxxxxxxxx is compromised
Date: Mon, 4 Mar 2019 08:31:59 +0000

1 second later (T+56s) AWS had already opened a support ticket about the incident




Just over 1 minute later (T+2:02m) someone tried to use the key - but since the IAM role attached to the user (and its exposed key) did not have the permissions required - the attempt failed!!

(This is why you should make sure you only give the minimum required permissions for a specific task and not the kitchen sink..)

Here is the access attempt that was logged in Cloudtrail




Here is where I went in and disabled the access key (T+5:58m)



Here was the notification message I received from GuardDuty which was enabled on the account (T+24:58m)

Date: Mon, 4 Mar 2019 08:56:02 +0000
From: AWS Notifications <no-reply@sns.amazonaws.com>
To: john@doe.com
Message-ID: <0100016947eac6b1-7b5de111-502d-4988-8077-ae4fe58a87c9-000000@email.amazonses.com>
Subject: AWS Notification Message



Points for Consideration

There are a few things I would like to point out regarding the incident above (which we categorized as one of low severity).

  1. As you can see above, the first thing that the attacker tried to do was to list the keys. That would usually be the first thing someone would try - to understand which users are available in the account (assuming that the user has the permission to perform that action)

    You can read more about how a potential hacker would exploit this in this series of posts.

  2. I assume that since the attacker saw that they did not have enough permissions - they decided this was not a worthy enough target to continue trying the exploit. Why waste the time if you are going to have to work really hard to get what you want? That is why we only saw a single attempt to use the key.

    If I was the hacker - I would just wait for the next compromised key and try again.

  3. The reason this attack was not successful - was because the role attached to the User (and its access keys) was built in such a way that they did not have permissions to do anything in IAM.

    This was by design. The concept of least privilege is so important - and 10 times more when you are working in the cloud - that you should implement it - in every part of your design and infrastructure.

  4. AWS responded extremely fast - that is due to them (I assume) scraping the API of all public github commits (for example). It could have been that I was just in time for a cycle - but based on my past experience - the response time is usually within a minute. It would be great if they could share how they do this and handle the huge amount of events that flow through these feeds.

    They still have to match up the exact compromised key to the account, and kick off the automatic process (email+ticket). All of this was done in less than 60 seconds.

    I am impressed (as should we all be).

  5. One thing I do not understand is why AWS would not immediately disable the key. The business implications of having a key out in a public repo are so severe - and the use case that would require a key to be out in the open is something that I cannot fathom as being a valid scenario. If AWS already finds a compromised key, knows which account it belongs to, and kicks off a process - then why not also disable the key as part of that process?? (A minimal sketch of how you could automate this yourself appears after this list.)

    The amount of time and work that AWS would have to invest (in support tickets and calls) working with a customer to clean up the account, forfeit the charges incurred because of the leak - are above and beyond anything they would incur by automatically disabling the key in the first place.

    AWS has started to take a stance on some security features - by disabling things by default (for example - public S3 buckets) to protect their customers from causing harm to themselves.

    I for one would welcome this change with open arms!



  6. It took me over 5 minutes to actually act on the exposed credential - in 5 minutes, a malicious actor can do some real and serious damage to your AWS account.

  7. GuardDuty - was slow, but it is obvious why this was the case. It takes about 15 minutes until the event is delivered to CloudTrail - and GuardDuty then has to analyze it against previous behavior. So this product should not be used for prevention - but rather for forensic analysis after the fact. There is no real way to identify this data on your own and analyze it against your baseline behavior - so this product is in my honest opinion still very valuable.

  8. How does one stop this from happening?

    There are a number of ways to tackle this question.

    In my honest opinion, it is mainly raising awareness - from the bottom all the way to the top. The same way people know that if you leave your credit card on the floor - there is a very good chance it will be abused. Drill this into people from day 1 and hopefully it will not happen again.

    There are tools out there that you can use as part of your workflow - such as git-secrets - that prevent such incidents from even happening, but you would have to ensure that every single person, and every single computer they ever work on, has it installed - which is a much bigger problem to solve. (A bare-bones example of this kind of check appears right after this list.)

    Install your own tools to monitor your repositories - or use a service such as GitGuardian that does this for you (not only for AWS - but other credentials as well). 
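
To give a bare-bones feel for what tools like git-secrets do (point 8 above) - here is a tiny Python sketch that scans whatever text you pipe into it (a diff, for example) for strings that look like AWS access key IDs. A real pre-commit hook would cover many more patterns and edge cases.

import re
import sys

# AWS access key IDs start with AKIA followed by 16 upper-case letters/digits
ACCESS_KEY_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")

def scan(text):
    """Return anything that looks like an AWS access key ID."""
    return ACCESS_KEY_PATTERN.findall(text)

if __name__ == "__main__":
    findings = scan(sys.stdin.read())
    if findings:
        print("Possible AWS access keys found - aborting commit:")
        for key in findings:
            print(f"  {key}")
        sys.exit(1)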
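
And on the point of disabling a leaked key (point 5 above) - until AWS does it for you, you can automate it yourself. Here is a minimal sketch using boto3; the user name and access key ID are placeholders, and in practice you would trigger this from the compromise notification or your own monitoring rather than hard-coding anything.

import boto3

iam = boto3.client("iam")

def disable_access_key(user_name, access_key_id):
    """Mark a leaked access key as Inactive so it can no longer be used."""
    iam.update_access_key(
        UserName=user_name,
        AccessKeyId=access_key_id,
        Status="Inactive",
    )

# Placeholder values - in a real setup these would come from the
# notification or from your own scanning, not be hard-coded
disable_access_key("exposed-user", "AKIAXXXXXXXXXXXXXXXX")
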
As always please feel free to share this post and leave your feedback on Twitter @maishsk

2019-03-06

My awesome-podcasts List

I have a decent commute every day back and forth to work and I have come to enjoy listening to a number of podcasts throughout the week.

I will try and keep the list up to date - here

As of today - this is my current list of podcasts


Grumpy Old Geeks

Two old farts (like me) that bitch about tech, and how ridiculous we have all become - Link

AWS Podcast

A weekly show about what is happening in the world of AWS - Link

The Cloud Pod

A podcast about what is going on in the cloud - Link

Screaming in the Cloud

Conversations about the cloud with people - Link

PodCTL

Podcast about Kubernetes - with a RedHat focus - Link

The Cloudcast

Podcast about all things Cloud - Link

Cloudtalk (Hebrew)

Hebrew Podcast about the world of cloud - Link

The Tony Robbins Podcast

Inspirational talk with Tony Robbins - Link

Datanauts (Packet Pushers)

Podcast about tech, cloud and all things nice - Link

Rural Emergency Medicine Podcast

A Podcast about emergency medicine - Link

Speaking in Tech

Podcast about things happening in the tech world - Link

The Secure Developer

Security focused Podcast - Link

The Full Stack Journey

Interviews with people that have made a change in their technical career - Link

To Be Continuous

DevOps focused podcast - Link

The Microsoft Cloud Show

A Microsoft focused cloud podcast - Link

Emergency Medicine Cases

A podcast about emergency medicine - Link

Techtalk

A podcast in Hebrew about the cloud and tech - Link

2019-03-04

AMI has 3 Syllables. A.M.I. #AWS

Just to make this clear

(before someone gets the wrong idea...)

This is 100% fun. Humor.

Not religion. Not a mission.

Just having some fun at the expense of AWS..


If you follow me on Twitter (and if you don't - your loss..) then you will know that I am one of many that are on a crusade.

A crusade to right a wrong.

A wrong perpetrated by some who work at a company called Amazon Web Services (a.k.a. AWS), who have tried to indoctrinate the world with a lie - something that is just plain wrong.

And the crusade of which I speak - is the religious debate about how you pronounce AMI
(Amazon Machine Image)

You will find many references to this over the past few years:

Twitter Thread
Last Year in AWS
Last week in AWS - Issue #35
Another Twitter thread
And another
And yet another
Abby Fuller's post
This recording


And of course the one and only Corey Quinn

I decided that I cannot idly stand by and let this injustice continue.

I took a step. I took a stand (and I started with a donation of 2 Euro for the domain name)


AMI has 3 syllables


http://ami-has-3-syllables.online

And in my ramblings back and forth with Corey - he enlightened me to the following fact
(which is so unbelievably true)

I managed to release the perfect AWS product (on a budget of $2 - really proud of myself)

Perfect launch


1. People have no idea how to use it
2. It has a stupid name that you cannot remember
3. The graphics suck... (sorry I have not done HTML/CSS in - I do not know how long)
4. No TLS

So in the spirit of this perfect release - I thought about how this would work with a real AWS product launch, and therefore I will iterate over time to improve the product.

Here is the plan (in the reverse order from above)..
  1. Implement TLS (I actually could do that today - but I am going to leave it like this for the launch, in the spirit of a new product)
    ** Edit ** - Implemented 05 March, 2019
  2. Fix up the graphics
    (Here I am going to crowdsource and look to you all - if anyone wants to step up and improve my crappy artwork, reach out - I would be happy to get some help.
    Feel free to reach out to me @maishsk)
  3. Plug the name to death - until people remember the name - in their sleep
    For this - say hello to @3_syllables (feel free to follow)
  4. Implement a bot that will interact with people who don't know how to pronounce A.M.I.
    (and maybe add some statistical functionality on the bot's activity to the site)



Feel free to share - and leave me your thoughts on Twitter

2019-02-25

Goodbye Docker and Thanks for all the Fish

Back in July 2018, I started to write a blog post about the upcoming death of Docker as a company (and also perhaps as a technology) but I never got round to completing and publishing the post. It is time to actually get that post out.





So here you go....


Of course Docker is still here, and of course everyone is still using Docker and will continue to do so for the near and foreseeable future (how far that foreseeable future extends - is yet to be determined). The reason I chose this title for the blog post is because, in my humble opinion, the days of Docker as a company are numbered - and maybe as a technology as well. If you would indulge me with a few minutes of your time - I will share with you the basis for my thoughts.

A number of years ago - Docker was the company that changed the world - and we can safely say it is still changing the world today. Containers and the technology behind containers have been around for many years, long before the word docker was even thought of, let alone turned into a verb (“Dockerize all the things”), but Docker was the company that enabled the masses to consume the technology of containers in an easy and simple fashion. Most technology companies (or at least companies that consider themselves to be modern tech companies) will be using Docker or containers as part of their product or their pipeline - because it makes so much sense and brings so much benefit to the whole process.

Over the past 12-24 months, people have come to the realization that docker has run its course and, as a technology, is not going to be able to provide additional value beyond what they have today - and they have started to look elsewhere for that extra edge.

Kubernetes has won the container orchestration war - I don’t think that anyone can deny that fact. Docker itself has adopted Kubernetes. There will always be niche players that have specific use cases for Docker Swarm, Mesos, Marathon or Nomad - but the de-facto standard is Kubernetes. All 3 big cloud providers now have a managed Kubernetes solution that they offer to their customers (and as a result will eventually sunset the home-made solutions that they built over the years - because there can be only one). Everyone is building more services and providing more solutions, to bring in more customers and increase their revenue.

Story is done. Nothing to see here. Next shiny thing please..

At the moment, Kubernetes uses docker as the underlying container engine. I think that the Kubernetes community understood that Docker as a container runtime (and I use this term specifically) was the ultimate solution to get a product out of the gate as soon as possible. They also (wisely) understood quite early on that they needed to have the option of switching out that container runtime - and allowing the consumers of Kubernetes to make a choice.

The Open Container Initiative - brought with it the Runtime Spec - which opened the door to allow us all to use something else besides docker as the runtime. And they are growing - steadily. Docker is no longer the only runtime that is being used. There is a growing community that is slowly sharing the knowledge of how to use something else besides Docker. Kelsey Hightower has updated his Kubernetes the hard way (amazing work - honestly) over the years from CRI-O to containerd to gvisor. All the cool kids on the block are no longer using docker as the underlying runtime. There are many other options out there today - clearcontainers, katacontainers - and the list is continuously growing.

Most people (including myself) do not have enough knowledge and expertise of how to swap out the runtime for whatever they would like, and usually just go with the default out of the box. When people understand that they can easily make the choice to swap out the container runtime, and the knowledge is out there and easily and readily available, I do not think there is any reason for us to use docker any more - and therefore Docker as a technology and as a company will slowly vanish. The other container runtimes that are coming out will be faster, more secure, smarter, more feature rich (some of them already are) compared to what Docker has to offer. If you have a better, smarter, more secure product - why would people continue to use technology that no longer suits their ever increasing needs?

For Docker - to avert this outcome - I would advise investing as much energy as possible into creating the best of breed runtime for any workload - so that docker remains the de-facto standard that everyone uses. The problem with this statement - is that there is no money in a container runtime. Docker never made money on their runtime, they looked for their revenue in the enterprise features above and on top of the container runtime. How they are going to solve this problem - is beyond me and the scope of this post.

The docker community has been steadily declining, the popularity of the events has been declining, the number of new features, announcements - is on the decline and has been on the decline for the past year or two.

Someone told me a while back - that speaking badly about things or giving bad news is usually very easy. We can easily say that this is wrong, this is not useful, this should change. But without providing a positive twist on something - you become the “doom and gloom”. The “grim reaper”. Don’t be that person.

I would like to heed that advice, and with that add something about what this means for you today. You should start investing in understanding how these other runtimes can help you, where they fit, and increase your knowledge and expertise - so that you can prepare for this and not be surprised when everyone else stops using docker and you find yourself having to rush into adapting all your infrastructure. I think it is inevitable.

That was the post I wanted to write 8 months ago...

What triggered me to finish this post today was a post from Scott McCarty - about the upcoming RHEL 8 beta - Enterprise Linux 8 Beta: A new set of container tools - and my tweet that followed

Lo and behold - no more docker package available in RHEL 8.
If you’re a container veteran, you may have developed a habit of tailoring your systems by installing the “docker” package. On your brand new RHEL 8 Beta system, the first thing you’ll likely do is go to your old friend yum. You’ll try to install the docker package, but to no avail. If you are crafty, next, you’ll search and find this package:
podman-docker.noarch : "package to Emulate Docker CLI using podman."
What is this Podman we speak of? The docker package is replaced by the Container Tools module, which consists of Podman, Buildah, Skopeo and several other tidbits. There are a lot of new names packed into that sentence so, let’s break them down.









(Source - Tutorial - Doug Tidwell: https://youtu.be/bJDI_QuXeCE)

I think a picture is worth more than a thousand words..

Please feel free to share this post and share your feedback with me on Twitter (@maishsk)

2019-02-22

Separate VPC's can do More Harm Than Good

I have come across this a number of times over the past couple of months. Environments that were born in the datacenter, have grown in the datacenter - in short, people who are used to certain (shall we say - ‘legacy’) deployments, and who are in the midst of an attempt to mirror the same architecture when moving to the cloud.

I remember in my old days of physical servers (while I write this - I realize it has been about 4 years since I actually touched a physical server, or plugged a cable/disk/device into one) that our server farm had a separate network segment (sometimes even more than one) for our Domain Controllers and application servers, and users had their own network segments that were dedicated only to laptops and desktops.

In the physical/on-prem world - this made sense - at the time - because what usually happened was the dedicated networking team that managed your infrastructure used access lists on the physical network switches to control which networks could go where.

Fast forward to the cloud.

There are people who equate VPC’s with Networks (even though it makes more sense to equate subnets to networks - but that is beside the point) - and think that segregating different kinds of workloads into multiple VPC’s will give you better security.

Let me give you a real scenario that I was presented with not too long ago (details of course have been changed to protect the innocent … )

A three tier application. Database, Application and a frontend. And the requirement that was laid down by the security team was that each of the layers must reside in its own VPC. Think about that for a minute. Three VPC’s that would be peered to ensure connectivity between them (because of course the layers needed to communicate with each other - database to application and application to frontend). When I asked what the reason was for separating the three different layers in that way, the answer was, "Security. If for example one of the layers was compromised - it would be much harder to make a lateral move to another VPC and compromise the rest."



So what is lateral movement? I know that there is no such thing as a 100% secure environment. There will always be hackers, there will always be ways around any countermeasures we try to put in place, and we can only protect against what we know and not against what we do not. The concept of lateral movement is one of compromising a credential on one system and using that credential to move to another system. For example - compromising a Domain Admin credential on an employee’s laptop - and with that credential moving into an elevated system (for example a domain controller) and compromising the system even further.

So how would this work out in the scenario above? If someone were to compromise the frontend - the only thing they would be able to connect to would be the application layer. The frontend does not have any direct interaction with the database layer at all, so your data would be safe. There would be a peering connection between the Frontend VPC and the Application VPC - with the appropriate routing in place to allow traffic flow between the relevant instances - and another peering connection between the Application VPC and the Database VPC, with the appropriate routing in place as well.

What they did not understand - is that if the application layer was compromised - then that layer does have direct connectivity with the data layer - and therefore could access all the data.

Segregating the layers into different VPC’s would not really help here.

And honestly - this is a risk that you take - which is why the attack surface exposed on your frontend should be as small as possible - and as secure as possible.

But I came back to the infosec team and asked them - what if I could provide the same security and segregation that you are trying to achieve, but without the need for separate VPC’s?

I would create a single VPC - with three subnets and three security groups: Frontend, Application and Database. Instances in the frontend security group would only be allowed to communicate with the instances in the application security group on a specific port (and vice-versa), and the instances in the application security group would only be allowed to communicate with the instances in the database security group (and vice-versa).


The traffic would be locked down to the specific flow of traffic and instances would not be able to communicate out of their security boundary.
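
To give a feel for what this looks like in practice - here is a minimal boto3 sketch (the VPC ID and the ports are placeholders) that creates the three security groups and only allows the frontend group to talk to the application group, and the application group to talk to the database group.

import boto3

ec2 = boto3.client("ec2")
VPC_ID = "vpc-xxxxxxxx"  # placeholder - your single VPC

def create_sg(name, description):
    response = ec2.create_security_group(
        GroupName=name, Description=description, VpcId=VPC_ID
    )
    return response["GroupId"]

frontend_sg = create_sg("frontend", "Frontend tier")
app_sg = create_sg("application", "Application tier")
db_sg = create_sg("database", "Database tier")

def allow_from(target_sg, source_sg, port):
    """Allow inbound traffic to target_sg on a port, only from source_sg."""
    ec2.authorize_security_group_ingress(
        GroupId=target_sg,
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "UserIdGroupPairs": [{"GroupId": source_sg}],
        }],
    )

# Placeholder ports - the application listens on 8080, the database on 3306
allow_from(app_sg, frontend_sg, 8080)
allow_from(db_sg, app_sg, 3306)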




As a side note - this could have also been accomplished by configuring very specific routes between the instances that needed to communicate between the VPC’s, but that does not scale to an environment larger than a handful of instances. Either you would need to manage the IP addresses manually, or keep on adding multiple routes to the route tables.

It goes without saying that if someone managed to compromise the frontend, and somehow managed through the application port to gain control of the application layer - they could gain access (in theory) to the data in the data layer.

Which is exactly what happened in the same scenario with 3 separate VPC’s. No less secure - no more.

But what changed??

The operational overhead of maintaining 3 VPC’s for no specific reason was removed.

This includes:

  • VPC Peering (which has a limit)
  • Route tables (which has a limit)
  • Cost reduction

I could even take this a bit further and say I do not even need different subnets for this purpose - I could actually put all the instances in a single subnet and use the same mechanism of security groups to lock down the communication. Which is true. And in an ideal world - I probably would have done so - but in this case - going to a single VPC was already a big enough step, and going to a single subnet as well would have been pushing the limit maybe just a bit too far. Sometimes you need to take small victories and rejoice, and not go in for the jugular.

I would opt for separate VPC’s in some cases, such as:

  • Different owners or accounts where you cannot ensure the security of one of the sides.
  • When they are completely different systems - such as a CI system and production instances
  • A number of other different scenarios

The bottom line of this post is - traditional datacenter architecture - does not have to be cloned into your cloud. There are cases where it does make sense - but there are cases where you can use cloud-native security measures - which will simplify your deployments immensely and allow you to concentrate as always on the most important thing. Bringing value to your customers - and not investing your time into the management and maintenance of the underlying infrastructure.


Please feel free to contact me on Twitter (@maishsk) if you have any thoughts or comments.

2019-01-30

Empires are not built in a day (but they also do not last forever)

I am currently on vacation in Rome (my first time) and during this trip I came to a number of realizations that I would like to share with you.

I went to the Colosseum today - and I have to say I was in awe. The structure is magnificent (even if the remains are only part of the original structure in all its glory). As I progressed throughout the day - I came to the following realizations.

(.. and of course how they tie into our tech world today)

Acquire vs. Train


Throughout ancient history - all the great empires (or what were once considered as such) were barbarians. They left legacies that remain to this day - but none of them were earned honestly. Most of the great wonders of the world - from the pyramids to the Colosseum to the Great Wall of China - were all built with slave labor. The Romans conquered the world, enslaved almost every country they touched - and used them to build an empire. I think it is safe to say this is how the world used to work. Today, this would not be acceptable. Slavery and taking advantage of others is not correct.

The knowledge was there, the brains were there, but they needed working hands to get the shit done.
That is why people outsource development resources to places where labor is cheap (India for example) but leave the brains at home and only let the 'workers' churn out the hard stuff. 

There are several problems with this - and we are seeing this today in many walks of life. Some companies understand that even though the labor is cheaper, the quality and speed of the work they wished to complete is not what they expect. In the olden days you would be able to terrorize your slaves into working to their deaths to provide what you wanted. This happened in ancient Egypt, in ancient Rome, pretty much everywhere. But that does not and cannot happen today. So we do one of two things. Instead of working people to their deaths we provide incentives to produce more, be it higher salaries, better conditions, bonuses - hoping that this will encourage (or should I rather say force) people to work harder. The other option is - we compromise on quality - or on delivery times - which either pisses off our customers because we are late, or pisses them off because the product is not as good as we promised.

It is obvious though - that the easiest way for us to produce - is not by training the talent from the ground up - but rather to let someone else invest that time and effort - and when we have the opportunity, swoop in (in the olden days, conquer) and reap the benefits of someone else's work.

In today's world we see this with most big companies acquiring smaller ones. Growth by Acquisition. Cisco has built its empire over the years in this way. You can't build an amazing wireless product - buy one. VMware the same. You can't build a great Kubernetes offering - buy one.

This is the way business works. Sometimes these mergers work and make the company better and sometimes they fail - dismally. Sometimes the talent gets incorporated but that is not always the case. 
It will all depend on how much you want to invest in the knowledge you acquired, and how much you become one with those people that bring that knowledge to the table.

True belief stays eternal


Religion is a funny thing. I think I can say there is really only one religion that has stayed with us from the beginning and that is Judaism. Christianity became a well-known religion somewhere around the 4th century. Islam - somewhere in the 7th century. All the ancient kingdoms, rulers, empires, no matter how great they were, how much of the world they conquered (or tried to) - they no longer exist. The only true thing that people will cling to is an idea, a belief. Something that is emotional.

The Persians built an empire - it is no more.
The Egyptians, the Greeks, the Romans, the Ottoman empire - the list goes on and on and on - all gone.

In our technological world today, it is hard to call anything eternal. Computers have only been around for less than 100 years. But even at this young age there are already religions forming around technology and its use..
  • vim vs emacs
  • Windows vs Mac
  • Windows vs Linux
  • Closed source vs open source
It is very hard to convert someone from one religion to another - sometimes it works with more severe, and sometimes less severe, persuasion - but there are cases where people will change their mind.

I am of the conviction that if what you believe in - is something that is connected to a deep emotion, something that is personal, it is something that will stay with you forever.

Technology is still in its infancy - we might not realize it - and the rate at which things change is growing faster and faster as we go along.

I think I got a bit lost in the journey and lost sight of the end goal here - so let me get to the point.

Emotion, making it personal, and connecting with what you do - is something that will always stay with you. The technology you invest in, your day-to-day job, the tools you use - they will evolve and change - they are not eternal.

You are not a Java guy. You are not a kubernetes girl. You are not an X.

You are a person that learns, a person that adapts. Connect to your goal with emotion and this will allow you to succeed.

That is who you should be!

(Also published on Linkedin)

2019-01-11

The Year 2018 in review

I don't always do these kinds of posts, but 2018 was a substantial year for me that warrants a short summary.

I released the AWS Powershell Container - gauging by the number of pulls - I guess it was not that useful.. :)

I completed my 5th AWS Certification. The post was also translated into Hebrew.

I presented a session at the DevOps Israel conference



I left Cisco (NDS) after 13 years and started a new position at CyberArk.

I became a lot more involved in the Israel Cloud community (for example Encounters in the Cloud - Interview).

I went to re:Invent again this year - and my posts Keeping Kosher at re:Invent 2018 and How I Get the Most Out of #AWS re:Invent 2018 (Hebrew version) were very useful not only to me - but from what I heard - to others as well.

I was a guest on the Datanauts podcast - Datanauts 143: Getting To Day 2 Cloud. I found out that this episode was the most popular episode of 2018 on the show. Respect!


I presented an Ignite (in Hebrew) at DevOpsDaysTLV



I also presented a session at the AWS Community Tel Aviv 2018



And last but not least - I released the AWS Visio Stencils

All in all - it was a good year.

One thing that I neglected (badly!!) was writing the rest of The Cloud Walkabout - which is something that I will make every effort to rectify this year.

Looking forward to 2019... Upward and onward!!


2019-01-04

I was not expecting this at re:Invent

There was a lot to absorb during the jam-packed week in Las Vegas, but there were a number of things that truly surprised me during the conference..

It was clear that AWS is going after the Enterprise market and is accommodating the on-prem / legacy / old-school way of thinking. This is the first re:Invent where you could really feel the change.

Here are a few of them:

AWS Outposts

AWS Well Architected
Lake Formation

Security Hub

Control Tower

FSx


Next was containers - or actually, the lack of containers. There were no significant container announcements. ECS and EKS were not mentioned once during the keynote. No new functionality, no new features. For the product that was probably the most demanded release at last year's re:Invent - this year it was crickets all the way down. I was thinking that AWS was saving some glory and glitter for the KubeCon conference the week after - but all that really came out of there was the Containers Roadmap (which is actually amazing - because AWS never discloses what their roadmap is, at least not publicly. I suppose it is expected of them as they keep up the image of open source contribution and championship).

And the last shocker was the fact that inbound traffic to S3 is now going to cost you money.. 

Wait, What? You are now charged for uploads to S3????
Well, that is not entirely true. Traditionally - you do not pay for incoming traffic into S3 - it says so in black and white.

s3 Pricing



So no, you are not charged for direct uploads to S3. But if you upload through another service that acts as a proxy to S3 - then that's a different story.

Storage Gateway was one such service.

Storage Gateway

Here you are allowed 100GB for free each month and capped at a maximum of $125 / month. For a company that transfers hundreds and thousands of TB a month - the $125 is chump change which essentially makes it pretty much free.

And then came AWS Transfer for SFTP and the change that no-one really noticed.

SFTP Pricing
Whoa!! Not only are you being charged 4x the amount of any other service, you are also not capped at a maximum monthly spend, and you get no free monthly uploads either.

You use it - you pay (and pay for it you will).

Next up was DataSync

Datasync Pricing







Again - the same new price of $0.04/GB for traffic transferred into S3.

Pricing example

Their pricing example shows the same.
If you were to do the exact same thing - but with a regular S3 upload - i.e. a one-time migration of 50 TB of 16 MB files into Amazon S3 in US East (Ohio) - it would cost you the following using the S3 CLI:
(50 TB copied into S3 * 1024 GB * $0.00 / GB) + (1 S3 LIST request * $0.005 / 1000) + (50 TB / 16 MB S3 PUT requests * $0.005 / 1000)
= $0 + $0 + $16.38
= $16.38
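
Just to put numbers on that difference - here is the same comparison worked out in a few lines of Python, using the $0.04/GB DataSync rate and the $0.005 per 1,000 PUT requests quoted above.

TB = 1024  # GB per TB
data_tb = 50
file_size_mb = 16

# DataSync: charged per GB transferred into S3
datasync_cost = data_tb * TB * 0.04

# Plain S3 upload: data transfer in is free, you only pay for the PUT requests
put_requests = data_tb * TB * 1024 / file_size_mb
s3_cli_cost = put_requests * 0.005 / 1000

print(f"DataSync: ${datasync_cost:,.2f}")  # $2,048.00
print(f"S3 CLI:   ${s3_cli_cost:,.2f}")    # $16.38
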
That is one heck of a difference. Now I have not tested the difference in speed, or throughput you can get from Datasync - I am sure there is a difference in the data transfer speeds.

But for me this is troubling. The whole bloody world uses S3 (granted most of the traffic is going from S3 out of AWS). Are AWS planning a change in their pricing model? Even if it is $0.04/GB - this would be a huge channel of additional revenue for them. Something to ponder on.

The pricing model that is now attached to S3 uploads seems strange to me - especially when you can get the exact same thing through another route for free. If it had been priced as network traffic through the service - I could have easily accepted it.

And last but not least, Werner Vogels finished his keynote on time this year. Well done, and thank you for assisting in the effort of improving our experience at re:Invent this year.

Thoughts? Comments? 
Feel free to reach out to me on Twitter (@maishsk)