2018-09-02

Replacing the AWS ELB - The Design

This is Part 3 in the Replacing the AWS ELB series.
  1. Replacing the AWS ELB - The Problem
  2. Replacing the AWS ELB - The Challenges
  3. Replacing the AWS ELB - The Design (this post)
  4. Replacing the AWS ELB - The Network Deep Dive
  5. Replacing the AWS ELB - Automation
  6. Replacing the AWS ELB - Final Thoughts

    So how do you go about using an IP address in a VPC and allow it to jump between availability zones?

    The solution to this problem was mentioned briefly on a slide in a re:Invent session, which for the life of me I could not find again (when I do, I will post the link).

    The idea is to create an "overlay" network within the VPC - which allows you to manage IP addresses even though they don't really exist in the VPC.

    A simple diagram of such a solution would look something like this:

    [Diagram: standard_haproxy]

    Each instance would be configured with an additional virtual interface - with an IP address that was not part of the CIDR block of the VPC - that way it would not be a problem to move it from one subnet to another.
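The only hard requirement on the virtual IP described above is that it falls outside every CIDR block of the VPC. A minimal sketch of that check, assuming a hypothetical VPC CIDR of 10.0.0.0/16 and a hypothetical virtual IP of 172.16.100.10 (neither value comes from the actual setup):

```python
import ipaddress
from typing import Iterable


def is_outside_vpc(vip: str, vpc_cidrs: Iterable[str]) -> bool:
    """Return True if vip does not fall inside any of the given VPC CIDR blocks."""
    address = ipaddress.ip_address(vip)
    return all(address not in ipaddress.ip_network(cidr) for cidr in vpc_cidrs)


# Hypothetical values, for illustration only.
VPC_CIDRS = ["10.0.0.0/16"]

print(is_outside_vpc("172.16.100.10", VPC_CIDRS))  # True: usable as a virtual IP
print(is_outside_vpc("10.0.3.25", VPC_CIDRS))      # False: belongs to the VPC
```

An address that passes this check is not tied to any subnet, which is what lets it move between availability zones.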

    If the IP address does not actually exist inside the VPC - how do you get traffic to go to it?

    That one is actually simple to solve - you create a specific route in the route table of each subnet that sends traffic for the virtual IP to a specific ENI (yes, it is possible).

    [Diagram: add_route]
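As a rough sketch of that step, the route can be added through the EC2 API. The example below assumes boto3 and its `create_route` call. The helper name `add_vip_routes`, the route table IDs, the ENI ID and the virtual IP are all hypothetical. The client is passed in as a parameter so the function can be tried with a stand-in object before it is pointed at a real account.

```python
from typing import Any, Dict, Iterable, List


def add_vip_routes(
    ec2: Any, route_table_ids: Iterable[str], vip: str, eni_id: str
) -> List[Dict[str, str]]:
    """Add a /32 route for vip, targeting eni_id, to every given route table.

    ec2 is expected to behave like boto3.client("ec2").
    Returns the parameters that were sent, one dict per route table.
    """
    sent = []
    for route_table_id in route_table_ids:
        params = {
            "RouteTableId": route_table_id,
            "DestinationCidrBlock": f"{vip}/32",  # host route for the virtual IP only
            "NetworkInterfaceId": eni_id,         # ENI of the active haproxy instance
        }
        ec2.create_route(**params)
        sent.append(params)
    return sent


# Usage against a real account (all IDs below are made up):
#   import boto3
#   ec2 = boto3.client("ec2")
#   add_vip_routes(ec2, ["rtb-aaaa1111", "rtb-bbbb2222"], "172.16.100.10", "eni-cccc3333")
```

Because the virtual IP is not the instance's own address, the target instance will most likely also need source/destination checking disabled before it accepts this traffic.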

    The process would be something like this:

    [Diagram: start]

    An instance will try to access the virtual IP - the traffic will go to the route table on the subnet and, because of the specific entry, it will be routed to a specific instance.
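The specific entry wins because a VPC route table uses the most specific route that matches the destination (longest prefix match). The toy model below only illustrates that selection rule and is not how AWS implements it. The route table contents and target names are hypothetical.

```python
import ipaddress
from typing import Dict, Optional


def lookup_target(route_table: Dict[str, str], destination: str) -> Optional[str]:
    """Return the target of the most specific route matching destination, if any."""
    address = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if address in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The longest prefix is the most specific route.
    return max(matches, key=lambda match: match[0])[1]


# Hypothetical subnet route table: local VPC route, default route, and the VIP route.
ROUTES = {
    "10.0.0.0/16": "local",
    "0.0.0.0/0": "igw-example",
    "172.16.100.10/32": "eni-haproxya",
}

print(lookup_target(ROUTES, "172.16.100.10"))  # eni-haproxya
print(lookup_target(ROUTES, "10.0.3.25"))      # local
print(lookup_target(ROUTES, "8.8.8.8"))        # igw-example
```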

    The last piece of the puzzle is how to get the route to jump from one haproxy instance to the other. This would be the initial state:

    [Diagram: initial]

    Then haproxya fails, or the AZ goes down.

    [Diagram: haproxya_fail]

    haproxyb recognizes this failure.

    [Diagram: recognize_failure]

    It then makes a call to the AWS API to move the route to a different ENI, located on haproxyb.

    [Diagram: move_to_haproxyb]
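The failover sequence above could be sketched roughly as follows, as it might run on haproxyb. It assumes boto3 and its `replace_route` call. How the failure is actually detected is not covered here, so `peer_is_healthy` is a hypothetical callback standing in for whatever health check is used. The helper name and all IDs are made up as well.

```python
from typing import Any, Callable, Iterable


def fail_over_if_needed(
    ec2: Any,
    route_table_ids: Iterable[str],
    vip: str,
    standby_eni_id: str,
    peer_is_healthy: Callable[[], bool],
) -> bool:
    """Point the VIP route at standby_eni_id when the peer is unhealthy.

    ec2 is expected to behave like boto3.client("ec2").
    Returns True if the routes were moved, False if nothing was done.
    """
    if peer_is_healthy():
        return False
    for route_table_id in route_table_ids:
        # replace_route changes the target of an existing route in place.
        ec2.replace_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock=f"{vip}/32",
            NetworkInterfaceId=standby_eni_id,
        )
    return True


# Usage on haproxyb (IDs and the health check are made up):
#   import boto3
#   ec2 = boto3.client("ec2")
#   fail_over_if_needed(ec2, ["rtb-aaaa1111", "rtb-bbbb2222"], "172.16.100.10",
#                       "eni-dddd4444", peer_is_healthy=check_haproxya)
```

The route has to be replaced in every subnet's route table, otherwise some subnets would keep sending traffic to the failed instance.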

    In the next post - we will go into a bit more detail on how the network is actually built and how the failover works.