Replacing the AWS ELB - The Network Deep Dive

This is Part 4 in the Replacing the AWS ELB series.

  • Replacing the AWS ELB - The Problem
  • Replacing the AWS ELB - The Challenges
  • Replacing the AWS ELB - The Design
  • Replacing the AWS ELB - The Network Deep Dive  (this post)
  • Replacing the AWS ELB - Automation
  • Replacing the AWS ELB - Final Thoughts

  • Why does this whole thing with the network actually work? Networking in AWS is not that complicated (sometimes it can be - but it is usually pretty simple), so why do you need to add an additional IP address into the loop - and one that is not even really part of the VPC?

    To answer that question, we need to understand the basic construct of the route table in an AWS VPC. Think of the route table as a road sign that tells you where you should go.

    Maybe not such a clear sign after all
    (Source: https://www.flickr.com/photos/nnecapa)

    Here is what a standard route table (RT) would look like


    The first line says that all traffic that is part of your VPC stays local - i.e. it is routed inside your VPC - and the second line says that all other traffic that does not belong in the VPC will be sent to another device (in this case a NAT Gateway).
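    In case the screenshot does not come through, an illustrative version of such a route table (the CIDR block and gateway ID here are made up) would be:

    ```
    Destination     Target
    10.0.0.0/16     local
    0.0.0.0/0       nat-0a1b2c3d4e5f6a7b8
    ```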

    You are the master of your RT - which means you can route traffic destined for any address you would like to any destination you would like. Of course, you cannot have duplicate entries in the RT, or you will receive an error.


    And you cannot route a smaller subset of that traffic to a different location if a larger route already covers it.


    But otherwise you can really do what you would like.
    So defining an additional interface on an instance is something that is straightforward.

    For example, on a CentOS/RHEL instance you create a new file in /etc/sysconfig/network-scripts/.

    This will create a second interface on your instance.
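    An ifcfg file for such an alias interface might look like the following (the device name and address are illustrative - the VIP here is deliberately outside the VPC CIDR):

    ```
    # /etc/sysconfig/network-scripts/ifcfg-eth0:1 - illustrative values
    DEVICE=eth0:1
    BOOTPROTO=static
    IPADDR=192.168.100.100
    NETMASK=255.255.255.255
    ONPARENT=yes
    ```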


    Now of course the only entity in the subnet that knows that this IP exists on the network is the instance itself.
    That is why you can assign the same IP address to more than a single instance.


    Transferring the VIP to another instance

    In the previous post, the last graphic showed that in the case of a failure, haproxyb would send an API request to transfer the route to the new instance.

    keepalived has the option to run a script that executes when its peer fails - it is called a notify script.

    vrrp_instance haproxy {
      notify /etc/keepalived/master.sh
    }

    That is a regular bash script - and that bash script can do whatever you would like. Luckily, that allows you to manipulate the routes through the AWS cli.

    aws ec2 replace-route --route-table-id <ROUTE_TABLE> --destination-cidr-block <CIDR_BLOCK> --instance-id <INSTANCE_ID>

    The parameters you would need to know are:

    • The ID of the route table you need to change
    • The destination CIDR block of the route you want to change
    • The ID of the instance the route should point to
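    Putting the pieces together, a notify script along these lines might look like the sketch below. This is illustrative only, not the actual script from the deployment - the route table ID, CIDR block and instance ID are placeholders. keepalived passes the new state to the script as the third argument:

    ```shell
    #!/usr/bin/env bash
    # /etc/keepalived/master.sh - sketch only, all IDs are placeholders.
    # keepalived calls this as: master.sh <TYPE> <NAME> <STATE>

    ROUTE_TABLE="rtb-xxxxxxxx"
    VIP_CIDR="192.168.100.100/32"
    INSTANCE_ID="i-xxxxxxxx"

    take_over_vip() {
      local state="$1"
      if [ "$state" = "MASTER" ]; then
        # We just became master - point the VIP route at this instance
        aws ec2 replace-route \
          --route-table-id "$ROUTE_TABLE" \
          --destination-cidr-block "$VIP_CIDR" \
          --instance-id "$INSTANCE_ID"
      fi
    }

    take_over_vip "$3"
    ```

    The only interesting decision here is to act solely on the MASTER transition - the node that loses the election does not need to touch the route at all.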

    Now of course there are a lot of moving parts that need to fall into place for all of this to work - and doing it manually would be a nightmare, especially at scale. That is why automation is crucial.

    In the next post - I will explain how you can achieve this with a set of Ansible playbooks.

    Replacing the AWS ELB - The Challenges

    This is Part 2 in the Replacing the AWS ELB series.
    1. Replacing the AWS ELB - The Problem
    2. Replacing the AWS ELB - The Challenges (this post)
    3. Replacing the AWS ELB - The Design
    4. Replacing the AWS ELB - The Network Deep Dive
    5. Replacing the AWS ELB - Automation
    6. Replacing the AWS ELB - Final Thoughts

      Now that you know the history from the previous post, I would like to dive into the challenges that I faced during the design process and how they were solved.

      High Availability

      One of the critical requirements was "Must not be a single point of failure" - which means whatever solution we went with must have some kind of high availability.

      Deploying a highly available haproxy cluster (well, it is a master/slave deployment - it cannot really scale) is not that hard of a task to accomplish.

      Here is a simple diagram to explain what is going on.


      Two instances, each one has the haproxy software installed - and they each have their own IP address.

      A virtual IP is configured for the cluster, and with keepalived we maintain the state between the two instances. Each of them is configured with a priority (to determine which one is the master and which the slave), and there is a heartbeat between them - VRRP is used to maintain a virtual router (or interface) between them. If the master goes down, the slave will take over. When the master comes back up, the slave will relinquish control back to the master.
      This works - flawlessly.

      Both haproxy instances have the same configuration - so if something falls over, the second instance can (almost) instantly start serving traffic.

      Problem #1 - VRRP

        VRRP uses multicast by default - https://serverfault.com/questions/842357/keepalived-sends-both-unicast-and-multicast-vrrp-advertisements - but that was relatively simple to overcome: you can configure keepalived to use unicast. So that was one problem solved.
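        A keepalived instance configured for unicast might look something like this (all addresses and the router ID are illustrative placeholders, not from the actual deployment):

        ```
        vrrp_instance haproxy {
          state MASTER
          interface eth0
          virtual_router_id 51
          priority 101               # the backup node gets a lower value, e.g. 100
          unicast_src_ip 10.0.1.10   # this node
          unicast_peer {
            10.0.2.10                # the peer haproxy node
          }
        }
        ```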

        Problem #2 - Additional IP address

        In order for this solution to work, we need an additional IP address - the VIP. How do you get an additional IP address in AWS? Well, that is well documented here - https://aws.amazon.com/premiumsupport/knowledge-center/secondary-private-ip-address/. Problem solved.

        Problem #3 - High Availability

        So we have the option of attaching an additional ENI to the cluster - which would allow us to achieve something similar to what we have above - but this introduced a bigger problem.

        All of this would only work in a single Availability Zone - which means that the AZ was a single point of failure, in violation of requirement #2.

        As it clearly states in the AWS documentation, a subnet cannot span multiple AZ's.


        Which means this will not work.


        Let me explain why not.

        A network cannot span multiple AZ's. That means if we want the solution deployed in multiple AZ's, then it needs to be deployed across multiple subnets - each in their own AZ. The idea of taking an additional ENI from one of the subnets and using it as the VIP will work only in a single AZ, because you cannot move the ENI from one subnet in AZ1 to another subnet in AZ2.

        This means that the solution of having a VIP in one of the subnets would not work.

        Another solution would have to be explored - because having both haproxy nodes in a single AZ was more or less the same as having a single node (not exactly the same, but still subject to a complete outage if the entire AZ went down).

        Problem #4 - Creating a VIP and allowing it to traverse AZ's

        One of the biggest problems that I had to tackle was: how do I get an IP address to traverse Availability Zones?

        The way this was done can be found in the next post.

        Replacing the AWS ELB - The Problem

        This topic has been long overdue.

        This will be a series of posts on how you can replace the AWS ELB's inside your VPC's with a self-managed load balancing solution. This will be too long for a single blog post, so I decided it was best to split it up into parts.

        1. Replacing the AWS ELB - The Problem (this post)
        2. Replacing the AWS ELB - The Challenges
        3. Replacing the AWS ELB - The Design
        4. Replacing the AWS ELB - The Network Deep Dive
        5. Replacing the AWS ELB - Automation
        6. Replacing the AWS ELB - Final Thoughts

          Let me start at the beginning.

          The product I was working with had a significant number of components that were communicating with each other. A long while back, the product team had decided to front all communication between components with a load balancer - for a good number of reasons, such as:

          • scaling
          • high availability
          • smart routing

          Here is a basic view of what the communication from component1 to component2 would look like (and just to clarify - there were just over 50 components in this solution - not just one or two):


          A simple load balancing example - something that you can see anywhere. The load balancers were HAproxy.

          There was some additional logic going on inside the traffic between the components, based on HTTP headers, which allowed us to route some of the requests to certain instances and versions of the application (you can think of it as blue/green or canary routing).

          A simple visualization of this would look something like this.


          The team had already deployed this product and now it was time to move over to AWS.

          The first part of the discovery was to identify how much of the solution we could accomplish using the services provided by AWS - instead of deploying our own. One of the items for discussion that came up was, of course, the load balancers we were using - and whether they could be replaced with AWS ELB's.

          Here is the list of requirements (the current solution in use met all of them):

          1. Must be able to serve as a load balancer
            • Define frontends
            • Define backends
          2. Must not be a single point of failure
          3. Provisioning will have no manual interaction
          4. Must be able to route traffic based on specific custom HTTP headers

          AWS met all the requirements except for #4.

          There are options to route traffic based on HTTP headers in the AWS Application Load Balancer, but they are limited (to say the least) - you can only use a Host header or a path in the URL.


          This was not an option - the engineering teams were not going to refactor the applications just to adapt to the AWS ELB. This sent me back to the drawing board to see how we could still use the known HAproxy solution inside of AWS - despite the well-known problems.

          More about those in the next post.


          Scratching an itch with aws-vault-url

          I think that aws-vault is a really nice tool. It prevents you from saving your AWS credentials in plain text on your machines (which is always a good thing).

          Since I started using it – I found a number of difficulties along the way.

          1. aws-vault does not support aarch64 #261

            To solve this - I created my own binary - aws-vault on a Chromebook

          2. aws-vault only supports storing credentials when using a full-blown GUI. Here is a really good walkthrough on how to get this working: https://www.tastycidr.net/using-aws-vault-with-linux/

          3. aws-vault login will give you a URL which you can paste into a browser to log in automatically to the AWS console. My pet peeve with this was that it always brings you to the default console page.


            So I was thinking - why would I not be able to open up the specific console that I would like to access - such as S3 or EC2? I mean, come on... these are just different URLs that need to be opened in the same way.

          Now if I were a Go developer, I would happily have contributed this back to the original project - but I am not. I am not really a developer at all. I can play with code - I can also create stuff - but I would not dare call myself someone who can write an application.

          So I wrote a small wrapper script to provide this functionality.

          Say hello to aws-vault-url - an easier way to open a direct console for a specific product.
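          Under the hood there is no magic - it is essentially a lookup from a service name to the console URL that the federated login should land on. A rough sketch of the idea (the URL mapping below is illustrative, not the tool's actual code):

          ```shell
          #!/usr/bin/env bash
          # Hypothetical sketch of the idea behind aws-vault-url:
          # translate a service name into the console URL to open.
          console_url() {
            case "$1" in
              s3)  echo "https://s3.console.aws.amazon.com/s3/" ;;
              ec2) echo "https://console.aws.amazon.com/ec2/" ;;
              *)   echo "https://console.aws.amazon.com/" ;;
            esac
          }
          ```

          The wrapper then only has to use this URL as the sign-in destination instead of the default console page.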

          (This is in no way a robust tool – and if you would like to contribute and improve it – please feel free to submit a PR)

          Update – 22/08/2018

          So I did some thinking about this - and came to the conclusion that it makes no sense to maintain a separate tool - so I decided to take the leap and push myself to go into the code itself. I sat for an hour or two last night and extended the current functionality of aws-vault to accommodate this.

          Here is the PR - https://github.com/99designs/aws-vault/pull/278.

          Once this is merged – I suggest that you move over to the complete tool.


          A Triangle is Not a Circle & Some Things Don’t Fit in the Cloud

          Baby Blocks

          We all started off as babies, and I am sure that not many of you remember that one of the first toys you played with (and if you do not remember, then I am sure those of you with kids have probably done the same with your children) was a plastic container with different shapes on the lid, and blocks made of different shapes.

          A triangle would only go into the triangle, a circle into the circle, a block into the block, and so on.

          This is a basic skill that teaches us that no matter how hard we try, there are some things that just do not work - things can only work in a certain way (and of course it teaches coordination, patience and a whole lot of other educational things).

          It is a skill that we acquire, it takes time, patience, everyone gets there in the end.

          And why am I blogging about this – you may ask?

          This analogy came up a few days ago in a discussion of a way to provide a highly available database in the cloud.

          And it got me thinking….

          There are certain things that are not meant to be deployed in a cloud environment, because they were never meant to be there in the first place. In this case, the application needed an Oracle database, and it was supposed to be deployed in a cloud environment.

          What is the default way to deploy Oracle in a highly available configuration? Oracle RAC. There are a number of basic requirements (simplified) that you need for Oracle RAC.

          1. Shared disk between the nodes.
            That will not work in a cloud environment.
            So we can try using dNFS – as the shared storage for the nodes – that might work..
            But then you have to make an NFS mount available to the nodes – in the cloud.
            So let’s deploy an NFS node as part of the solution.
            But then we have to make that NFS node highly available.
          2. Multicast between the nodes - that also does not work well in the cloud.
            So maybe create a networking environment in the cloud that will support multicast?
            Deploy a router appliance in the cloud.
            Now connect all the instances in the cloud into the router.
            But the router poses as a single point of failure.
            Make the router highly available.

          And if not Oracle RAC – then how about Data Guard – which does not require shared storage?

          But it has a steep licensing fee.
          And you have to find a way to manage the virtual IP address - which you will not necessarily have control over.
          But that can be overcome by deploying a VRRP solution with IP addresses that are manually managed.


          Trying to fit a triangle into a square? Yes - if you push hard enough, it will break the lid and fit.
          Or if you cry hard enough, Mom/Dad will come over and put it in for you.

          Or you come up with a half-baked solution like the one below…


          Some things will not fit. Trying to make them fit creates even more (and sometimes even bigger) problems.

          In this case the solution should have been to change the code to use a NoSQL database that can be deployed easily and reliably in a cloud environment.

          As always your thoughts and comments are welcome.


          Saving a Few Shekels on your AWS bill

          I have a jumpbox that I use to access resources in the cloud – and I use it at work, only during work hours and only on workdays.

          There are usually 720 hours in the month, or 744 in months that have 31 days. Assume that I want to run the instance for 10 hours a day, 5 days a week. In order to calculate how many hours exactly, we will need an example.

          The month of August, 2018


          The work week in Israel is Sunday-Thursday (yeah – I know – we are special…).

          August had 22 work days. Total number of hours in August: 31 * 24 = 744. That makes 220 working hours in the month (22 working days multiplied by 10 hours per day).

          The math is simple: 220/744 - I only need the instance for about 30% of the month - so why would I pay for all of it?

          744 hours * $0.0464 (for a t2.medium instance in us-east-2) = $34.5216, and if I were to pay only for the hours that I was actually using the instance, that would be 220 * $0.0464 = $10.208. Less than a third of the cost. Simple math.
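          The arithmetic can be double-checked with a quick one-liner (same t2.medium rate as above):

          ```shell
          # Recompute the two monthly cost figures from the text
          awk 'BEGIN {
            printf "full month: $%.4f\n", 744 * 0.0464;  # 31 days x 24 hours
            printf "work hours: $%.3f\n", 220 * 0.0464;  # 22 days x 10 hours
          }'
          ```

          which prints $34.5216 and $10.208 - matching the numbers above.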

          So there are multiple ways to do this - as a Lambda script, or with Cloud Custodian - each of these works very well and will work brilliantly at scale. For me it was a single machine, and honestly I could not be bothered to set up all the requirements to get everything working.

          Simple solution - use cron. I don't pay for resource usage by the hour on my corporate network (if someone does, then you have my sympathies...), so I set up a simple cron job to do this.

          To start up the instance:

          0 8 * * 0,1,2,3,4 start-jumpbox

          And to stop the instance at night

          0 18 * * 0,1,2,3,4 stop-jumpbox

          And what is the start/stop-jumpbox command, you might ask? A really simple aws cli command:

          aws ec2 start-instances --region <__REGION__> --instance-ids <__INSTANCE_ID__>

          aws ec2 stop-instances --region <__REGION__> --instance-ids <__INSTANCE_ID__>

          Of course, in the background the correct credentials and access keys are set up on my Linux box - I am not going to go into how to do that here; AWS has enough documentation on that.

          The last thing that I needed to solve: the jumpbox has a public IP (obviously), and if I really wanted to save the money, I did not want to pay for a static Elastic IP provisioned and sitting there idle for 70% of the month (because the instance is powered down).

          After doing the calculation, it was chump change for my use case (524 hrs * $0.005 = $2.62), so maybe I should not have worried about it - but the resulting script is still useful.

          I wanted to use the public IP address that AWS provides to the instance at no cost. The problem with this is that every time you stop the instance, the IP address is reclaimed by AWS, and when you power it on, you will get a new one.

          Me being the lazy bum that I am, I did not want to have to look up the IP each and every time, so I went down the path of updating a DNS record on each reboot.

          Take the Public IP allocated to the instance and update a known FQDN that I would use on a regular basis.

          The script can be found here (pretty self-explanatory).
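          In case the link ever rots, the gist of such a script can be sketched roughly as follows. This is my reconstruction of the approach, not the original script - the instance ID, hosted zone ID and FQDN are made-up placeholders:

          ```shell
          #!/usr/bin/env bash
          # Sketch: look up the instance's current public IP and UPSERT
          # an A record in Route 53 so a known FQDN always points at it.
          # All IDs and names below are placeholders.

          INSTANCE_ID="i-xxxxxxxx"
          ZONE_ID="ZXXXXXXXXXX"
          FQDN="jumpbox.example.com"

          lookup_public_ip() {
            aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
              --query 'Reservations[0].Instances[0].PublicIpAddress' --output text
          }

          build_change_batch() {
            # UPSERT so the record is created on first run and updated afterwards
            local ip="$1"
            printf '{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"%s","Type":"A","TTL":300,"ResourceRecords":[{"Value":"%s"}]}}]}' "$FQDN" "$ip"
          }

          update_dns() {
            local ip
            ip=$(lookup_public_ip)
            aws route53 change-resource-record-sets \
              --hosted-zone-id "$ZONE_ID" \
              --change-batch "$(build_change_batch "$ip")"
          }
          ```

          Run update_dns from a boot-time hook (rc.local, a systemd unit, or user data) and the FQDN follows the new public IP on every start.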

          Now of course this is only a single instance - but if you are interested in saving money, this is one of the considerations to think about. (Of course, this should be managed properly at scale - a single cron job will not suffice...)

          For example - if you have 1000 development machines that are not being used after working hours (and I know that not everything can be shut down after hours - there are background jobs that run 24/7), and they are not a measly t2.medium but rather an m4.large:

          1000 (instances) * 220 (work hours) * $0.1 (cost per hour) = $22,000

          1000 (instances) * 744 (hours in the month) * $0.1 (cost per hour) = $74,400

          See how you just saved $50k a month on your AWS bill?

          You are welcome :)

          (If you want to spend some of your well-saved cash on my new book - The Cloud Walkabout - feel free.)

          If you have any questions about the script / solution - or just want to leave a comment - please feel free to do so below.