One of our clients once reached out to us for help with deep inspection of traffic that was reaching its web servers, mostly from the compliance perspective.

The client had, by then, already run multiple kinds of workloads that were mostly web or API interfaces exposed over the internet.

Now the internet is an exciting place to host one’s services, as well as to reach out and meet a wide range of client needs. It is also a place where intelligent minds get to deliberately exploit vulnerabilities of applications and services. Preventing these harmful attacks, therefore, is a necessity today. So every request that reaches your applications over the internet must be inspected for its ‘intentions’ – is it a legitimate one, or is it attempting to exploit  vulnerabilities that are yet to be fixed.

While legitimate requests are allowed to go through, the ones that are exploitative in nature must be blocked. This entails real-time inspection, but with little or insignificant delays, to ensure the user experience remains unaffected.

The idea proposed to correct this sort of a situation was that of a Reverse Proxy using a highly available load balancer. Given that the client had his deployment on AWS Cloud, the natural solution was to use the AWS Elastic Load Balancer (ELB) and host the inspection service behind it. Doing so  would then relay the request and responses to and from the application services.

Now, I won’t get into the details of setting up the inspection service -- that we will get to some other time. However, I will touch upon one of the aspects of the integration with AWS ELB required to check of a key line item in the compliance checklist – that of identifying the source IP address from where the request is originating.

Let’s quickly summarize the setup so far:

  1. Requests originate from devices on the internet (laptops/ machines/  third party services)
  2. These requests reach AWS ELB
  3. ELB relays the request to inspection services
  4. Inspection service relays the requests to application services
  5. The response from application makes its way back through inspection service, ELB and eventually, to the device that made the request

The connections are stateful using TCP/ IP, and the routing of established connections is managed at each hop. This means that every time a packet of data makes a hop, the source and destination details of each connection are stored at the hop, while the device’s source address is written onto the packet to enable the return traffic.

In English you say?

The data packet(s) originating from the device making the request will have two key pieces of information – the IP address of the device making the request a.k.a source IP, and the IP address of the ELB a.k.a destination IP address. The destination IP address field of the packet is used by network devices making up the internet to send the packet(s) along to the ELB associated with the IP address. At this stage, the filed source IP address will contain the IP address of the requesting device.

After the packet is processed by the ELB’s logic on load balancing, the packet will then be addressed to one of the hosts running the inspection service. The source IP address filed will now contain the ELB’s IP address – this is to enable response packet requests to go back to the same device that sent the request to the inspection service.

Therein lies the issue, where the IP address of the original device making the request is held at the load balancer for the duration of the session, and then is lost to logging/ auditing services. Thankfully, this is by design and is sometimes useful in enhancing security where the requesting device’s identity has to be abstracted. That said, this is not one of those cases. The same scenario recurs when the inspection service relays the request to the application service.               

Here, we will look at sending the source IP address through the ELB to the inspection service. The same can be achieved on the inspection service, but we are not going to get into that. I am sure you will be able to figure that out on your own once you see how it’s done using the ELB.

Since the abstraction of the source IP is by design, the original requester’s IP address must be relayed as additional information through the ELB, i.e. in addition to the ELB’s source IP address. AWS ELB provides for only sending this additional piece of information in the request header originating from the ELB (see ELB x-forward-for/ proxy protocol). It’s the job of the service at the request receiving end (inspection service in this case) to make sense of the additional data that is coming its way and extract the required information.

Once the service uses the right modules (like myfixip with apache in this case), apache can read the format in which the ELB is forwarding the original requester’s IP address over proxy headers. This can then be used by the apache to log the IPs. Alternatively, the application service can pull that information to run its login against it.