2018-08-29

Replacing the AWS ELB - The Problem

This topic has been long overdue.

This will be a series of posts on how you can replace the AWS ELBs inside your VPCs with a self-managed load balancing solution. It is too long for a single blog post, so I decided it was best to split it up into parts.

  1. Replacing the AWS ELB - The Problem (this post)
  2. Replacing the AWS ELB - The Challenges
  3. Replacing the AWS ELB - The Design
  4. Replacing the AWS ELB - The Network Deep Dive
  5. Replacing the AWS ELB - Automation
  6. Replacing the AWS ELB - Final Thoughts

    Let me start at the beginning.

    The product I was working with had a significant number of components that communicated with each other. A long while back, the team had decided to front all communication between components with a load balancer, for a number of good reasons such as:

    • scaling
    • high availability
    • smart routing

    Here is a basic view of what the communication between component1 and component2 would look like (and just to clarify - there were just over 50 components in this solution, not just one or two):


    simple_diagram


    A simple load balancing example - something that you can see anywhere. The load balancers were HAProxy.

    There was some additional logic applied to the traffic between the components, based on HTTP headers, which allowed us to route some requests to certain instances and versions of the application (you can think of it as blue/green or canary routing).


    A simple visualization of this would look something like this.

    routing 
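To make the idea concrete, here is a minimal Python sketch of header-based routing. It is not the HAProxy configuration we actually ran. The header name `X-App-Version`, the pool names, and the addresses are all hypothetical, chosen only to illustrate how a custom header can steer a request to a specific set of instances.

```python
import itertools

# Hypothetical backend pools; names and addresses are illustrative only.
POOLS = {
    "stable": ["10.0.1.10:8080", "10.0.1.11:8080"],
    "canary": ["10.0.2.10:8080"],
}

# Assumed custom header; the real header names are not given in this post.
ROUTING_HEADER = "x-app-version"

# One round-robin iterator per pool.
_round_robin = {name: itertools.cycle(servers) for name, servers in POOLS.items()}


def choose_backend(headers):
    """Return (pool, server) for a request based on a custom HTTP header."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    pool = normalized.get(ROUTING_HEADER, "stable")
    if pool not in POOLS:
        pool = "stable"  # unknown values fall back to the default pool
    return pool, next(_round_robin[pool])
```

A request carrying `X-App-Version: canary` lands on the canary pool, while everything else (including unknown values) is spread across the stable pool.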


    The team had already deployed this product and now it was time to move over to AWS.

    The first part of the discovery was to identify how much of the solution we could build using the services provided by AWS, instead of deploying our own. One of the items that came up for discussion was, of course, the load balancers we were using - and whether they could be replaced with AWS ELBs.

    Here is the list of requirements (the existing solution met all of them):

    1. Must be able to serve as a load balancer
      • Define frontends
      • Define backends
    2. Must not be a single point of failure
    3. Provisioning will have no manual interaction
    4. Must be able to route traffic based on specific custom HTTP headers

    AWS met all the requirements except for #4.

    There are options to route traffic based on HTTP headers in the AWS Application Load Balancer, but they are limited (to say the least): you can only use the Host header or a path in the URL.


    hostname_route     path_route
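The limitation can be sketched as a small rule model in Python. This is only an illustration of the behavior described above, not the AWS API: the `Rule` class, the target names, and the hostnames are hypothetical. The point is that a rule can only express a host pattern or a path pattern, so any other header on the request has no way to influence the decision.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class Rule:
    """Hypothetical rule that can match on host and/or path only."""
    target: str
    host_pattern: Optional[str] = None
    path_pattern: Optional[str] = None

    def matches(self, host, path):
        # Host names are case-insensitive; paths are compared as-is.
        if self.host_pattern is not None and not fnmatchcase(host.lower(), self.host_pattern):
            return False
        if self.path_pattern is not None and not fnmatchcase(path, self.path_pattern):
            return False
        return True


# Illustrative rules; the names and hostnames are assumptions.
RULES = [
    Rule(target="component2-v2", host_pattern="v2.component2.example.internal"),
    Rule(target="component2-admin", path_pattern="/admin/*"),
]
DEFAULT_TARGET = "component2"


def route(host, path, headers=None):
    """Pick a target; 'headers' is accepted but never consulted."""
    for rule in RULES:
        if rule.matches(host, path):
            return rule.target
    return DEFAULT_TARGET
```

With this model, sending a custom header such as `X-App-Version: canary` changes nothing: the request still goes to the default target unless its hostname or URL path is different.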


    This was not an option: the engineering teams were not going to refactor the applications just to adapt to the AWS ELB. This sent me back to the drawing board to see how we could still use the known HAProxy solution inside of AWS - despite the well-known problems.

    More about those in the next post.