Delayed ACK and Nagle’s Algorithm

In this article, I am taking a shot at trying to explain the interaction between Delayed ACK and Nagle’s algorithm and how this could add latency during TCP session that requires transmission of small packets.

MSS:

Maximum Segment Size or MSS denotes the data that is being sent in the “Segment” utilized in the TCP layer of the OSI model. Default MSS is 536 Bytes. Default MTU is 576 Bytes.

MTU = MSS + 20B (TCP Header) + 20B (IP Header)

TCP Transmission:

RFC 1122 talks about the conditions under which data can be transmitted in TCP implementations. Data is transmitted when any of the following 3 conditions is met.

1. Immediately, if a full MSS or more can be sent.

2. {[No unacknowledged data] && 
    [(PSH flag is set) || 
     (Buffered data > 1/2*(SND Window))]}

3. {(PSH flag set) && (Override timeout expired)}

1/2*(SND Window) is implementation dependent and can differ across Operating System and within different versions of Operating System. Override timeout is roughly 200ms. This value could change between OS too. ACK Number represents bytes and not packets.

Delayed ACK:

Delayed ACK helps in avoiding “Silly-Window-Syndrome” (SWS) at the Receiver. The Receiver will delay sending an ACK in response to data received when all the 3 conditions match.

1. When there are no 2 packets / 2*MSS received.

2. When the client has no data to send.

3. When the Delayed ACK timer has not yet expired.

In the UNIX World, 2*MSS has to be received by Receiver in order for it to send an ACK and in the Windows World, 2 packets of any size has to be received by Received in order for it to send an ACK.

Nagle’s Algorithm (RFC896):

The goal of Nagle’s algorithm is to lower the number of small packets exchanged during a TCP session. This helps in avoiding “Silly-Window-Syndrome” (SWS) at the Transmitter.

Nagles algorithm can be summarized as follows:

A. If there are unacknowledged data 
   (i.e., data in flight > 0 Bytes), 
   new data is buffered.

B. If data to be sent is less than MSS, it is 
   buffered till the data to be sent is 
   greater than or equal to MSS.

Problem:

Under the right conditions, the 1-3 points outlined under Delayed ACK and A-B points outlined in Nagle’s Algorithm will freeze the interaction between the sender and the receiver for the duration of timeout which is roughly 200ms. This is often seen in applications that rely on smaller packet sizes.

A Simple Scenario:

Sender is a client machine that updates the Receiver with information. Receiver could be some kind of a data warehouse which stores information on financial transactions. In this case, the Sender has data to send to the Receiver and the Receiver acknowledges the data received and does not transmit any data to the client in response other than a simple ACK.

During the course of a TCP session, Sender has just sent 500B of data to the Receiver and this matches condition A outlined in Nagle’s algorithm. The application at the Sender side moves 400B of data to the TCP stack. At this point, Sender has not yet received an ACK from the Receiver and because the next 400B of data meets the B condition outlined in Nagle’s algorithm.

Thus, the 400B of data will be buffered till either one of the Nagle’s condition is met:

A. ACK is received from Receiver for the 
   previously sent 500B of data.

B. Application sends the TCP stack more data 
   that will push the existing buffered data 
   (400B) more than MSS i.e., application needs 
   to send 136B or more to the TCP stack in 
   order to push the buffered data to or beyond
   the MSS limit.

On the Receiver side, the receiver will refrain from sending an ACK after receiving the first 500B of data because the 1-3 conditions outlined under Delayed ACK hasn’t been met.

1. Only 1 packet of 500B (less than MSS) has been 
   received.

2. Receiver does not have any data to transmit, 
   other than ACK.

3. Delayed ACK timer has not yet expired.

Sender keeps the 400B of data buffered. Receiver will not ACK the previously sent 500B of data. Effectively, there will be a communication freeze between the Sender and the Receiver till the timeout expires. Usually this timeout is 200ms in different OS implementations.

For further understanding, I would highly recommend this youtube video on this subject by Hansang Bae.

Ansible – The Why ?

What ?

Ansible is a simple IT automation tool.

Ansible exists as CLI & GUI. GUI is called the Ansible Tower and Ansible, Inc., which is owned by RedHat, officially supports this.

Controlling Nodes:

The Network infrastructure is managed from Controlling Nodes. In an Enterprise environment, Controlling Nodes are typically Linux bastion servers.

Managed Nodes:

Managed Nodes are the Network Devices that is being managed by the Controlling Nodes. Managed Nodes are typically of Cisco, Juniper, and Arista make and can be classified as Switches, Routers, Firewalls and Load Balancers based on their function from a Network Engineer’s perspective.

Why ?

There are many automation tools like Chef, Puppet, CFEngine but in my opinion, Ansible is suited for Network Automation for the following reasons:

  1. Ansible does not require an agent to be installed in the Managed Node (Network Device).
  2. Ansible requires Python on the Managed Node and most Network Devices support Python.
  3. Ansible relies on YAML as the descriptive language and Jinja2 for templates.

Among the points mentioned above, most Network Vendors do not support the installation of agents and even if they did support the installation, it would be tough to get the relevant permissions within an organization to install the agents in an Enterprise environment that has different Network Teams managing different aspects of the infrastructure.

Fortunately, most network vendors provide native support for Python and Ansible rely on this to execute automation tasks on the “Managed Nodes”.

As a Network Engineer working in an environment with significant scale (1,000s of Network Devices across multiple datacenters), Ansible has been quite useful in obtaining data and deploying configuration. Ansible seems to have widespread support among the Network Engineers seeking automation to manage at scale and there are resources online that can be leveraged to implement Network Automation Solutions.

F5 iRule – URI Masking

Requirement:

Client sends request to http://xyz.com/

Server needs to process http://xyz.com/append but client should only see http://xyz.com/ i.e., the URI  /append should not be visible to the client.

when HTTP_REQUEST { 
if { ([HTTP::host] equals "www.xyz.com") and ([HTTP::uri] eq "/") } { 
HTTP::uri "/append" 
} 
} 
when HTTP_RESPONSE {
if { [HTTP::header values Location] contains "/append" } {
HTTP::header replace Location [string map {/append /} [HTTP::header value Location]]
}
}

The F5 will complete the following steps using the iRule provided above:

F5 will add URI “/append” to the incoming request.

F5 will replace “/append” with “/” in the response from the server to the client.

Reference:

Mask URI – Devcentral Thread

Github

SRX Performance Testing

For a project, I had to gather data in order to measure the SRX performance under different traffic loads. This was done for a SRX chassis system in cluster mode. The following commands were utilized in order to capture metrics when the system was subjected to traffic load:

show system uptime 
show system statistics
 
show chassis routing-engine 
show chassis fpc 
show chassis cluster status 
show chassis cluster information 
show chassis cluster interfaces 
 
show security monitoring 
show security monitoring performance spu 
show security flow statistics

show interfaces reth0 statistics
show security monitoring fpc node all <fpc-slot>

Github with scripts to execute the above commands and save it in a file.

Ansible – Cisco Config Implementation

The goal of this article is to explain configuration implementation on Cisco IOS after the config has been generated as shown in Ansible Config Generator III:

config-implementation.yml

---
- hosts: switch
  gather_facts: true
  connection: local
 
  tasks:
  - name: OBTAIN LOGIN CREDENTIALS
    include_vars: secrets.yml
 
  - name: DEFINE PROVIDER
    set_fact:
      provider:
        host: "{{ inventory_hostname }}"
        username: "{{ creds['username'] }}"
        password: "{{ creds['password'] }}"
        auth_pass: "{{ creds['auth_pass'] }}"
 
  - name: CONFIGURE INTERFACE
    ios_config:
      provider: "{{ provider }}"
      authorize: yes
      lines:
        - "{{ lookup('file', './config-output/{{ inventory_hostname }}.conf') }}"

The “hosts” file contains the switch names:

[switch]
switch-1
switch-2
switch-3
switch-4

secrets.yml

---
creds:
  username: cisco
  password: ciscopassword
  auth_pass: ciscoauthpassword

“./config-output/” directory contains the following files:

switch-1.conf
switch-2.conf
switch-3.conf
switch-4.conf

The inventory file (hosts) contains the switches in which the configuration from corresponding file name will be added. “switch-1.conf” file contents will be utilized to configure “switch-1” so on and so forth.

~/ansible_play/config-output$ cat switch-1.conf 
            
int Gi1/1                   
switchport access vlan 195
            
int Gi1/10                   
switchport access vlan 188
 
~/ansible_play/config-output$ cat switch-2.conf 
            
int Gi1/2                   
switchport access vlan 295
            
int Gi1/20                   
switchport access vlan 288
 
~/ansible_play/config-output$ cat switch-3.conf 
            
int Gi1/3                   
switchport access vlan 395
            
int Gi1/30                   
switchport access vlan 388
 
~/ansible_play/config-output$ cat switch-4.conf 
            
int Gi1/4                   
switchport access vlan 495
            
int Gi1/40                   
switchport access vlan 488

Ansible – Config Generator III

For Part I & Part II of this series.

The goal of this playbook is to be able to generate unique configuration for each switch. In this case, we are configuring a port to work as an access-port for a specific vlan. The port and vlan variable is different for each switch.

config-gen.yml

---
- hosts: 127.0.0.1
  connection: local
  gather_facts: no

  tasks:
  - name: GET DATA
    include_vars: ./host_vars/file.yml

  - name: GENERATE CONFIG
    template:
      src: ./templates/accessvlan.j2
      dest: ./config-output/{{ item.switch }}.conf
    with_items: "{{ file_vlan }}"

file.yml

---
file_vlan:
- { switch: switch-1, port: Gi1/8,  vlan: 395 }
- { switch: switch-1, port: Gi1/23, vlan: 388 }
- { switch: switch-2, port: Gi1/8,  vlan: 395 }
- { switch: switch-2, port: Gi1/23, vlan: 388 }
- { switch: switch-3, port: Gi1/9,  vlan: 395 }
- { switch: switch-3, port: Gi1/24, vlan: 388 }
- { switch: switch-4, port: Gi1/9,  vlan: 395 }
- { switch: switch-4, port: Gi1/24, vlan: 388 }

accessvlan.j2

{% for grouper, host in file_vlan|groupby('switch') %}
{% if item.switch == grouper %}
{% for item in host %}            
int {{ item.port }}                   
switchport access vlan {{ item.vlan }}
{% endfor %}
{% endif %}
{% endfor %}

When the “config-gen.yml” playbook is executed:

ansible-playbook config-gen.yml

we get the following output files:

~/ansible_play/config-output$ cat switch-1.conf 
            
int Gi1/1                   
switchport access vlan 195
            
int Gi1/10                   
switchport access vlan 188
 
~/ansible_play/config-output$ cat switch-2.conf 
            
int Gi1/2                   
switchport access vlan 295
            
int Gi1/20                   
switchport access vlan 288
 
~/ansible_play/config-output$ cat switch-3.conf 
            
int Gi1/3                   
switchport access vlan 395
            
int Gi1/30                   
switchport access vlan 388
 
~/ansible_play/config-output$ cat switch-4.conf 
            
int Gi1/4                   
switchport access vlan 495
            
int Gi1/40                   
switchport access vlan 488

Ansible – Config Generator – II

For the first part of this series, check this – 1st part of this series.

---
- hosts: local
  connection: local
  gather_facts: no

  tasks:
  - name: GET DATA
    include_vars: ./host_vars/file.yml

  - name: GENERATE CONFIG
    template:
      src: ./SVI.j2
      dest: ./{{ item.vlan }}.conf
    with_items: "{{ file_vlan }}"

 

This is the file.yml that is being referenced in the “include_vars”

---
file_vlan:
- { vrf: NET1, vlan: 502, vlanname: VLAN-502-NAME, net: 10.80.120.128/29 }
- { vrf: NET1, vlan: 503, vlanname: VLAN-503-NAME, net: 10.80.120.136/29 }

Compared to the 1st part of this series, we are moving the contents of the “with_items” to a separate YAML file and calling it based on the variable name “file_vlan” that is part of the file content.

MSS & MTU

Always, MTU > MSS

Without any TCP or IP Options in the packet:

MSS + 40 Bytes = MTU

The 40B is used to represent 20B of IP header and 20B of TCP header. The addition in number of bytes can be more than 40B depending on options in the header.

Standard Ethernet Frame: 1500B. Ethernet Header and CRC: 18B. MTU refers to Ethernet Payload without the header and CRC.

Ethernet Fram Size = Header + Standard Ethernet Fram (1500B) + CRC

Reference Link.

F5 – SSL Cert Expiration

K14318 – Identifying expired certs and certs about to expire in 30 days.

K15288 – Email reminder for cert expiration.

A few one-liners from bash to identify the cert expiration date:

Identifying the expiration date from the certificate name:

~ # tmsh list sys file ssl-cert domain.crt | grep expiration
    expiration-date 1505951999
    expiration-string "Sep 20 23:59:59 2017 GMT"

 

Identifying the Client SSL profile for a certificate:

~ # tmsh list ltm profile client-ssl one-line | grep domain.crt | awk '{print $3,$4}'
    client-ssl CLIENTSSL-domain.com

 

Identifying the Virtual Server from Client SSL profile:

~ # tmsh list ltm virtual one-line | grep CLIENTSSL-domain.com | awk '{print $2,$3}'
    virtual VS-10.10.10.10-Public

 

Identifying the expiration date for cert associated with VS:

~ # echo | openssl s_client -connect 10.10.10.10:443 2> /dev/null | openssl x509 -noout -dates
notBefore=Nov 21 00:00:00 2016 GMT
notAfter=Nov 22 23:59:59 2017 GMT