F5 – Bleeding Active Connections

Scenario:

A Virtual Server is load balancing connections to a pool with 2 pool members. During a maintenance window, one of the two pool members is disabled so that maintenance can be completed on it, and the process is then repeated for the other pool member.

However, because the users make continuous API calls every 5 seconds, the existing TCP connections never bleed out. Even after waiting for 24 hours, connections still exist on the disabled pool member.
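
To confirm this on the BIG-IP itself, you can query the connection table from tmsh. This is only a sketch; the member address 10.10.10.10:443 and pool name api_pool are placeholders for whatever your environment actually uses.

    # List connections whose server-side destination is the disabled pool member
    tmsh show sys connection ss-server-addr 10.10.10.10

    # Confirm the member is disabled (session user-disabled) rather than down
    tmsh list ltm pool api_pool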

Solution:

By default, the F5 makes a load balancing decision when the 1st HTTP request within a TCP connection is received. Subsequent HTTP requests within the TCP connection are sent to the same pool member as the very 1st HTTP request.

By enabling a OneConnect profile with a /32 source mask (255.255.255.255), we were able to force the F5 to make a load balancing decision for every HTTP request instead of following its default behavior.

The OneConnect profile, used along with the disabled or forced-offline setting, will move connections from the disabled or failed pool member to the active pool member.
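
For reference, here is roughly how that looks in tmsh. This is only a sketch; the object names (oneconnect_32, api_vs, api_pool) and the member 10.10.10.10:443 are hypothetical, and the virtual server is assumed to already have an HTTP profile assigned.

    # OneConnect profile with a /32 source mask so the F5 re-evaluates
    # load balancing for every HTTP request
    tmsh create ltm profile one-connect oneconnect_32 source-mask 255.255.255.255

    # Attach the profile to the virtual server
    tmsh modify ltm virtual api_vs profiles add { oneconnect_32 }

    # Disable the member for maintenance; requests on existing client
    # connections are now re-balanced to the remaining active member
    tmsh modify ltm pool api_pool members modify { 10.10.10.10:443 { session user-disabled } }

    # Or force it offline to stop persistent/active traffic as well
    tmsh modify ltm pool api_pool members modify { 10.10.10.10:443 { state user-down session user-disabled } }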

Reference Link.

F5 OneConnect

F5 has a trademark feature called OneConnect that leverages HTTP 1.1 Keepalive. In this article, I will try to explain the functionality of OneConnect, the underlying technology, and its usage requirements.

There are 2 main reasons that I have utilized the OneConnect profile:

  1. Efficient reuse of server side connections (F5-Servers)
  2. L7 Content Switching, i.e., when the F5 is configured to make load balancing or persistence decisions based on information available in L5 to L7.

Efficient TCP Connection Reuse:

Every time a new connection is set up on the server, the server has to allocate resources.

What Resources:

Resources such as state/buffer memory at the kernel level and per-thread memory, since each connection consumes web server threads. There is also an impact on the server's CPU, as the server needs to keep track of the web threads and connections.

The consumption of resources outlined above won’t noticeably affect a small to medium scale load balanced web-server infrastructure. However, as the traffic flow increases, the servers can take quite a big hit. At that point, the option would be to add more servers, which increases your CapEx, and of course OpEx will rise for administrative reasons.

OneConnect, if used properly, will provide you with a method to efficiently utilize the existing infrastructure without the requirement of adding more physical servers. This is done by reusing existing server side connections using HTTP 1.1 Keepalive.

Let’s say client IP 1.1.1.1 is initiating a connection to the VIP 50.50.50.50, which then gets load balanced to the server 10.10.10.10. Within the TCP connection, the client will utilize multiple HTTP Requests to obtain the right content from the server (HTTP 1.1 Keepalive). After the transaction has been completed, the client closes the client side connection (Client – F5). However, the F5 retains the server side connection (F5-Server). If a new client (1.1.1.2) initiates a connection within a certain timeout interval, the F5 will reuse the server side connection that was retained from the 1.1.1.1 connection. As you can see, the server side connection that was created when 1.1.1.1 made the initial request was reused when the new client 1.1.1.2 made its request.

In this particular simple example, 2 client side connections were served with only 1 server side connection. Assuming you can achieve the same ratio at scale, 100K client side connections would require only 50K server side connections. As the number of client side connections increases, the OneConnect profile delivers better results.
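
If you want to watch the reuse happen on a live unit, tmsh exposes per-profile statistics. A sketch, assuming a OneConnect profile named oneconnect_32 and the addresses from the example above:

    # OneConnect statistics (idle connections held for reuse, reuse counters)
    tmsh show ltm profile one-connect oneconnect_32

    # Compare client-side entries (destined to the VIP) with
    # server-side entries (destined to the pool member)
    tmsh show sys connection cs-server-addr 50.50.50.50
    tmsh show sys connection ss-server-addr 10.10.10.10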

One thing to note is that, from the server’s perspective, HTTP Requests initiated by 1.1.1.2 still appear to arrive over the connection initiated by 1.1.1.1, i.e., the client IP seen at the server level no longer provides the right information about the true client IP. In order to overcome this, “X-Forwarded-For” header insertion would have to be utilized to insert the right “True Client IP” at the server level. It is essential that the server logs and the application look for the client IP in the “X-Forwarded-For” header and not the Client IP field.
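
On the BIG-IP, the header insertion itself is a single option on the HTTP profile. A sketch with hypothetical object names (http_xff, api_vs):

    # HTTP profile that inserts the original client IP into X-Forwarded-For
    tmsh create ltm profile http http_xff { defaults-from http insert-xforwarded-for enabled }

    # Attach it to the virtual server
    tmsh modify ltm virtual api_vs profiles add { http_xff }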

How does the F5 know which server-side connection to reuse:

In order to understand the connection-reuse algorithm, it is essential to understand the OneConnect Profile Settings.

OneConnect Profile Settings:

  • Source Mask – Network mask that is applied to the incoming client IP in order to identify the server side connection that can be reused. The effect of Source Mask in OneConnect is well explained here.
  • Maximum Size – Max number of server-side connections that are retained in the connection pool for reuse
  • Maximum Age – Max age up to which a server side connection is retained in the connection pool for reuse
  • Maximum Reuse – Max number of HTTP Requests that can be sent over a server side TCP connection
  • Idle Timeout Override – Max time up to which an idle server side connection is kept open for reuse.

A server side connection is retained in the “connection pool” for reuse as long as it satisfies the Max Age, Reuse & Idle Timeout conditions. The size of this “connection pool” is defined within Max Size.
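
These settings map one-to-one to attributes on the one-connect profile in tmsh (source-mask, max-size, max-age, max-reuse, idle-timeout-override). A sketch, using the hypothetical profile name oneconnect_32:

    # Inspect all settings, including the defaults
    tmsh list ltm profile one-connect oneconnect_32 all-properties

    # Change only the source mask and leave everything else at its default
    tmsh modify ltm profile one-connect oneconnect_32 source-mask 255.255.255.255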

Based on experience, I would recommend not altering any of the default settings of the OneConnect profile other than the Source Mask.

L7 Content Switching:

By default, F5 performs load balancing only once for each TCP connection. This is performed when the very first HTTP Request within a TCP Connection is received by the F5 LTM. Subsequent HTTP Requests in the same TCP connection will be sent to the same server that handled the 1st HTTP Request.

Consider a scenario in which there are multiple clients originating connections from behind a proxy that multiplexes the individual client TCP connections into a single TCP connection and sends requests to the F5’s VIP over this single TCP Connection. In this case, after the load balancing decision is made for the very first HTTP Request from the 1st client, subsequent HTTP requests from other clients will be sent to the same server, regardless of the L7 information. If you utilize the HTTP header information for load balancing or persistence, this will lead to undesirable behavior.

An example:

Let’s say that there are 5 clients initiating 5 TCP connections and these get multiplexed into 1 TCP connection between the Proxy and the F5.

Let’s say that the load balancing/persistence decision is made based on cookie persistence (L7).

When the 1st client sends its HTTP Request, it will be load balanced to a specific pool member and a cookie will be inserted in the HTTP Response.

If there is a subsequent HTTP Request from any of the other 4 clients utilizing the same TCP connection between the proxy and the F5, it will be sent to the same server that handled the 1st HTTP Request, even if the cookie information provided by the subsequent HTTP Request is different – let’s say the cookie was set manually and it is pointing to a different server.

This happens because the F5 stops parsing the HTTP requests in the same TCP connection after the load balancing decision has been made for the very 1st HTTP request – the default behavior. When we enable OneConnect, we are telling the F5 to continue checking the HTTP Requests and not to stop after the 1st HTTP Request.
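
Tying the example back to configuration: enabling OneConnect alongside cookie persistence is just a matter of attaching both to the virtual server. A sketch with hypothetical object names (cookie_persist, oneconnect_32, web_vs); an HTTP profile is assumed to already be on the virtual server.

    # Cookie persistence profile (HTTP cookie insert by default)
    tmsh create ltm persistence cookie cookie_persist defaults-from cookie

    # OneConnect so every HTTP request in the multiplexed proxy connection
    # is parsed and persisted/load balanced on its own
    tmsh modify ltm virtual web_vs profiles add { oneconnect_32 }
    tmsh modify ltm virtual web_vs persist replace-all-with { cookie_persist { default yes } }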

This article explains the default behavior of the F5 device and how it affects L7 persistence like Cookie & Universal (UIE).

This article explains the reason for using OneConnect when load has to be balanced to multiple pools based on L7 information.

In short, whenever you require load balancing or persistence based on L7 information, utilize OneConnect.

A good understanding of OneConnect requires a good grasp of HTTP 1.1 Keepalive, Pipelining, OneConnect Profile Settings & the default F5 load balancing behavior.

I will try to provide a graphical explanation to OneConnect in the coming days.

More information about OneConnect is provided here.