Adding a Blade to a Viprion

Normally, when you add a new blade, the current master blade syncs its configuration onto the new blade. Make sure that the existing blade is the master. Back up all relevant configuration, both on the device and off the device, before adding the new blade.

Make sure that the blades are the same model, and see K16992.

See K13965 to identify the master blade.

Considerations when moving a blade between chassis: K10541271

F5 Code Upgrade Steps

This is a rough template of F5 Code Upgrade steps that could be of help for your maintenance work.

  1. Before performing any F5 code upgrade, make sure that the “Service Check Date” on the device is AFTER the License Check Date for the new code version, as listed in SOL7727.
  2. Upload the new code to the partition that you prefer on the F5.
  3. Run cpcfg to copy the configuration to the new code version location – Example: cpcfg HD1.2

    Although “cpcfg HD1.x” has worked most of the time, I would recommend backing up the .UCS file to a remote location and also saving a copy in “/shared/tmp/<UCS File>”. After saving the UCS file in the “/shared/tmp/” location, you can use “load /sys ucs <path/to/UCS> no-license” to load the configuration, as noted in SOL12880.

  4. Reboot. This will take about 5-10 minutes for hotfix updates and about 15-20 minutes when migrating to a new major code version.
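
The date check in step 1 can be sketched as a small shell helper. This is only an illustration: the function name service_date_ok is hypothetical, both dates are assumed to be supplied by hand in YYYYMMDD form, and treating equal dates as acceptable is an assumption rather than something stated in SOL7727.

```shell
# Hypothetical helper: compare two dates given as YYYYMMDD integers.
# Prints "ok" when the service check date is on or after the license
# check date (accepting equal dates is an assumption of this sketch).
service_date_ok() {
    service_check_date="$1"   # read from the device license
    license_check_date="$2"   # read from SOL7727 for the target version
    if [ "$service_check_date" -ge "$license_check_date" ]; then
        echo "ok"
    else
        echo "reactivate license before upgrading"
    fi
}

# Example values only; substitute the real dates.
service_date_ok 20161007 20160915
```

If the helper does not print “ok”, the license would need to be reactivated before moving on to step 2.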

The recommended maintenance window is about 1 hour. This could change depending on any application-level testing that you would like to include within your maintenance window.

Reference:

F5 Code Upgrade – 10.x to 11.x

GCloud Init Error

$ gcloud init
WARNING: Could not setup log file in /home/johndoe/.config/gcloud/logs, (IOError: [Errno 13] Permission denied: ‘/home/johndoe/.config/gcloud/logs/2016.10.07/13.02.53.410759.log’)

Welcome! This command will take you through the configuration of gcloud.

ERROR: (gcloud.init) Failed to create the default configuration. Ensure your have the correct permissions on: [/home/johndoe/.config/gcloud/configurations]. Could not create directory [/home/johndoe/.config/gcloud/configurations]: Permission denied.

Please verify that you have permissions to write to the parent directory.

The following change helped resolve the error:

sudo chown -R johndoe /home/johndoe/.config/gcloud

Replace “johndoe” with the username under which you run gcloud.
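
Before changing ownership, it may help to see which entries are actually owned by someone else. The sketch below is an assumption-laden illustration: the function name not_owned_by is hypothetical, and it relies only on the standard find and id utilities.

```shell
# List anything under a directory that is not owned by the given user ID.
# Empty output means the ownership already looks right.
not_owned_by() {
    dir="$1"
    uid="$2"
    find "$dir" ! -user "$uid" 2>/dev/null
}

# Example: check the current user's gcloud config directory, if it exists.
cfg="$HOME/.config/gcloud"
if [ -d "$cfg" ]; then
    not_owned_by "$cfg" "$(id -u)"
fi
```

Any path printed here (for example one owned by root after an earlier sudo run) is a candidate for the chown above.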

Paypal Recruiting Tool

Stumbled onto an interesting recruiting header from Paypal 🙂

$ curl -I https://www.paypal.com/uk/webapps/mpp/home
HTTP/1.1 200 OK
Server: Apache
X-Recruiting: If you are reading this, maybe you should be working at PayPal instead! Check out https://www.paypal.com/us/webapps/mpp/paypal-jobs
Paypal-Debug-Id: 453e6f27f1269
Cache-Control: no-cache
X-XSS-Protection: 1; mode=block
X-FRAME-OPTIONS: SAMEORIGIN

Viprion Chassis – Adding New Blade

Normally, when you add a new blade, the current master blade syncs its configuration onto the new blade. Make sure that the existing blade is the master. Back up all relevant configuration, both on the device and off the device, before adding the new blade.

Make sure that the blades are the same model, and see SOL16992.

See SOL13965 to identify the master blade.

Considerations when moving a blade between chassis: SOL10541271

F5 – Automating CLI Execution

Purpose:
This is a really simple way to automate CLI command execution on multiple F5 devices using Bash & Expect (TCL) scripting. The scripts have been tested on a Linux machine and a Mac.

How to use it:
There are three files: a bash script (F5_Bash_v1) that collects the username/password for F5 access, a text file (F5_Host.txt) that stores the management IP addresses of the F5 devices, and a TCL script (F5_Out_v1.exp) that executes CLI commands on the F5 devices.

The bash script is the master script: it obtains the username/password and runs the TCL script against each F5 device.

Setup:
On a Linux machine that is used to connect to the F5 devices:

#Create a directory
mkdir F5_Check

Within the “F5_Check” directory, create the following 3 files:
F5_Host.txt
F5_Bash_v1
F5_Out_v1.exp

File Content: F5_Host.txt contains the management IP of the F5 devices.
Example:

$ cat F5_Host.txt
10.12.12.200
10.12.12.201
10.12.12.202
10.12.12.203

File Content: F5_Bash_v1

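#!/bin/bash
# Collect the username and password for F5 access
# (-s hides the typed input, -r keeps backslashes intact)
echo -n "Enter the username "
read -r -s -e user
echo -ne '\n'
echo -n "Enter the password "
read -r -s -e password
echo -ne '\n'

# Feed the expect script a device list & the collected username & password.
# The arguments are quoted so that a password with spaces or wildcard
# characters is passed through unchanged.
for device in $(cat ~/F5_Check/F5_Host.txt); do
    ~/F5_Check/F5_Out_v1.exp "$device" "$password" "$user"
done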
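
Since the loop above trusts every line of F5_Host.txt, a typo in that file turns into a failed SSH attempt. A minimal sketch of a pre-check follows; the function names is_ipv4 and valid_hosts are hypothetical, and the check assumes the file holds one dotted-quad IPv4 address per line, as in the example.

```shell
# Return success when the argument looks like a dotted-quad IPv4 address.
is_ipv4() {
    # Reject anything other than digits and dots, and stray/double dots
    case "$1" in
        *[!0-9.]*|*..*|.*|*.) return 1 ;;
    esac
    old_ifs=$IFS; IFS=.
    set -- $1          # split the address into its fields
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Print only the valid entries from a host file (one address per line).
valid_hosts() {
    while IFS= read -r line || [ -n "$line" ]; do
        if is_ipv4 "$line"; then
            echo "$line"
        fi
    done < "$1"
    return 0
}
```

One way to use it would be to loop over the output of valid_hosts ~/F5_Check/F5_Host.txt instead of reading the file directly.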

File Content: F5_Out_v1.exp

#!/usr/bin/expect -f

# Set variables
set hostname [lindex $argv 0]
set password [lindex $argv 1]
set username [lindex $argv 2]

# Log results
log_file -a ~/F5_Check/F5LOG.log

# Announce which device we are working on and the time
send_user "\n"
send_user ">>>>>  Working on $hostname @ [exec date] <<<<<\n"
send_user "\n"

# SSH access to device
spawn ssh $username@$hostname

expect {
"no)? " {
send "yes\n"
expect "*assword: "
sleep 1
send "$password\r"
}
"*assword: " {
sleep 1
send "$password\r"
}
}

expect "(tmos)#"
send "sys\n"
expect "(tmos.sys)#"

send "show software\n"
expect "#"
send "exit\n"
expect "#"
send "quit\n"

expect ":~\$"
exit

F5 TMM Crash

We have a pair of F5 Viprions that are connected to Cisco Nexus 7K (Aggr A & B) switches as shown here:

[Network diagram: the F5 Viprion pair connected to Cisco Nexus 7K Aggr A & B]

TMM Crash:

The TMM crashed on one of the F5 Viprions because the following conditions were met:

  1. Your BIG-IP system is processing a large amount of active connections.
  2. You attempt to display the connection table using the tmsh show sys connection command.
  3. You then attempt to cancel the tmsh show sys connection command by using the Ctrl+C key sequence while the command is still in the process of displaying the connection table.

SOL15246

When the Viprion is handling hundreds of thousands of connections and “show sys connection” is executed and subsequently cancelled with “Ctrl+C” before the connections are displayed, the TMM crashes. This affects both multi-blade systems like the Viprion and single units.

BugID: 595773

For the Viprions, apart from the TMM crash, the “Ctrl+C” is not propagated to all the blades in the multi-blade chassis. This has been identified as BugID: 595773. It has been fixed in 11.5.6 code and may be retroactively fixed in 11.5.4 + HF2 (not sure).

BugID: 579284

Under certain conditions, memory within mcpd can be corrupted. This memory corruption within mcpd has been identified as BugID: 579284. The previously stated BugID: 595773 will trigger BugID: 579284 resulting in memory corruption within mcpd.

The memory corruption was serious enough to cause loss of inter-blade connectivity. Each blade then acted as a stand-alone system, and this caused packets to loop within the network.

This bug will probably be fixed in 12.x code.

Logs from the Viprion:

 May  3 16:20:15 slot1/LB1-domain.com err tmsh[29166]: 01420006:3: operation canceled
 May  3 16:20:31 slot3/LB1-domain.com crit tmm6[17982]: 01010020:2: MCP Connection aborted, exiting
 May  3 16:20:31 slot4/LB1-domain.com info bcm56xxd[9563]: 012c0012:6: Reprogram vDAG cmp state to 0xb for vtrunk default (previous state 0xf)
 May  3 16:20:31 slot3/LB1-domain.com info bcm56xxd[9919]: 012c0012:6: Reprogram vDAG cmp state to 0xb for vtrunk default (previous state 0xf)
 May  3 16:20:31 slot1/LB1-domain.com info bcm56xxd[8234]: 012c0012:6: Reprogram vDAG cmp state to 0xb for vtrunk default (previous state 0xf)
 ...
 May  3 16:20:31 slot4/LB1-domain.com info bcm56xxd[9563]: 012c0012:6: Reprogram vDAG cmp state to 0x2 for vtrunk default (previous state 0xa)
 May  3 16:20:31 slot1/LB1-domain.com info bcm56xxd[8234]: 012c0012:6: Reprogram vDAG cmp state to 0x2 for vtrunk default (previous state 0xa)
 May  3 16:20:31 slot4/LB1-domain.com info bcm56xxd[9563]: 012c0012:6: FFP HDAG installed for default (cmp state 0x2)
 May  3 16:20:31 slot1/LB1-domain.com info bcm56xxd[8234]: 012c0012:6: FFP HDAG installed for default (cmp state 0x2)
 
 ... and the blade logs a restart.
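
To spot the same signature in a larger log, the excerpt above can be filtered for the “MCP Connection aborted” message. This is a rough sketch: the function name tmm_abort_slots is hypothetical, and it assumes the slot/hostname is the fourth whitespace-separated field, as it is in the lines shown here.

```shell
# Print the slot of every blade whose TMM logged the MCP abort message.
# Reads syslog-style lines on stdin.
tmm_abort_slots() {
    awk '/MCP Connection aborted/ { split($4, host, "/"); print host[1] }' | sort -u
}

# Example using one line from the excerpt above.
printf '%s\n' ' May  3 16:20:31 slot3/LB1-domain.com crit tmm6[17982]: 01010020:2: MCP Connection aborted, exiting' | tmm_abort_slots
```

On the device the function would be fed from the log file instead of printf; the log path is left out here because it is not given in the excerpt.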

The following logs were identified in the Cisco Nexus 7K that was connected to the Viprion:

2016 May  3 16:20:26 switch-1 %FWM-2-STM_LOOP_DETECT: Loops detected in the network for mac 4111.3111.abc1 among ports Po66 and Po11 on vlan 100 - Disabling dynamic learning notifications for a period between 120 and 240 seconds on vlan 100
2016 May  3 16:20:33 switch-1 %FWM-2-STM_LOOP_DETECT: Loops detected in the network for mac 4111.3111.a6c1 among ports Po11 and Po66 on vlan 200 - Disabling dynamic learning notifications for a period between 120 and 240 seconds on vlan 200

Summary of the 2 conditions that we hit:

  1. TMM crash caused by the “Ctrl+C” used to break the “show sys conn” command.
  2. Ctrl+C does not propagate to all the blades, which corrupts memory and results in loss of inter-blade connectivity, so that the multi-blade Viprion creates a closed loop.

Concurrency Vs Parallelism

While trying to understand the difference between Concurrency & Parallelism, I came across this 30 minute talk by Rob Pike that clearly explains the differences.

My previous crude understanding of it was like this:

Usain Bolt’s personal best times for 400m & 100m are 45.28s & 9.58s respectively.

If we had 1 Usain Bolt running 400m at his personal best, we can cover 400m in 45.28s.

If we had 4 Usain Bolt clones running 4*100m in a relay style, i.e., the 1st 100m is covered by Usain Bolt-1, the next 100m by Usain Bolt-2, and so on, it will take 4*9.58s = 38.32s to cover 400m. We save 6.96s (45.28-38.32): Concurrency!

If we had 4 Usain Bolt clones running 4*100m, but this time simultaneously instead of in the usual relay fashion, we should be able to cover 400m in 9.58s. We save 35.7s (45.28-9.58): Parallelism!
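
The arithmetic can be checked with a few lines of shell. This is just a sketch of the numbers in the analogy, using 45.28s for 400m and 9.58s for 100m; times are held in hundredths of a second so that plain integer arithmetic is enough, and the helper name fmt is made up for the example.

```shell
# Times in hundredths of a second to stay within integer arithmetic
t400=4528   # 400m, 45.28s
t100=958    # 100m, 9.58s

# Format hundredths of a second as seconds with two decimals
fmt() { printf '%d.%02d\n' $(( $1 / 100 )) $(( $1 % 100 )); }

relay=$(( 4 * t100 ))
fmt "$relay"                # relay total: 38.32
fmt $(( t400 - relay ))     # saved by the relay: 6.96
fmt $(( t400 - t100 ))      # saved by four simultaneous runners: 35.70
```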

iRule – Redirects

#Different Redirects

########################
# 302 Redirects
########################
when HTTP_REQUEST {
    HTTP::redirect "https://www.domain.com/"
}

########################
# 301 Redirects
########################
# HTTP::respond takes the status code first, then header name/value pairs
when HTTP_REQUEST {
    HTTP::respond 301 Location "https://www.domain.com/"
}

############################
# IF-Conditional Redirect:
############################
# Matching a condition
when HTTP_REQUEST {
    if { [HTTP::host] eq "domain.com" } {
        HTTP::respond 301 Location "https://www.domain.com/"
    }
}

# NOT matching a condition
when HTTP_REQUEST {
    if { not ([HTTP::host] eq "domain.com") } {
        HTTP::respond 301 Location "https://www.domain.com/"
    }
}

# Multiple conditions
when HTTP_REQUEST {
    if { ([HTTP::host] eq "domain.com") and ([HTTP::uri] eq "/login") } {
        HTTP::respond 301 Location "https://www.domain.com/login/"
    }
}

#if & elseif
when CLIENT_ACCEPTED {
    set default_pool [LB::server pool]
}
when HTTP_REQUEST {
    if { [HTTP::host] eq "domain1.com" } {
        HTTP::respond 301 Location "https://www.domain1.com/login/"
    } elseif { [HTTP::host] eq "domain2.com" } {
        HTTP::respond 301 Location "https://www.domain2.com/login/"
    } else {
        pool $default_pool
    }
}

##################################
# Switch-Conditional Redirect:
##################################

#Check multiple unique domains
when CLIENT_ACCEPTED {
   set default_pool [LB::server pool]
}
when HTTP_REQUEST {
   # Match on the host header, since the cases are domain names
   switch -glob [string tolower [HTTP::host]] {
      "domain1.com" {
         HTTP::respond 301 Location "https://www.domain1.com/"
      }
      "domain2.com" {
         HTTP::respond 301 Location "https://www.domain2.com/"
      }
      default {
         pool $default_pool
      }
   }
}

#Redirect to same URL
when CLIENT_ACCEPTED {
   set default_pool [LB::server pool]
}
when HTTP_REQUEST {
   # Match on the host header; "-" falls through to the next case body
   switch -glob [string tolower [HTTP::host]] {
      "domain1.com" -
      "domain2.com" {
         HTTP::respond 301 Location "https://www.domain.com/"
      }
      default {
         pool $default_pool
      }
   }
}

############################
# Data Group
############################

# Data group definition (part of the BIG-IP configuration, not the iRule)
class CLASS_HSF {
   {
      "/str1" { "domain1.com" }
      "/str2" { "domain2.com" }
   }
}

when CLIENT_ACCEPTED {
    set DEFAULT [LB::server pool]
}

when HTTP_REQUEST {
    set HOST [string tolower [HTTP::host]]
    set URI [string tolower [HTTP::uri]]

    if { $HOST equals "www.domainhs.com" } {
        HTTP::respond 301 Location "http://www.domain.com[HTTP::uri]"
    } elseif { [class match $URI starts_with CLASS_HSF] } {
        # Use the same operator for the lookup as for the match above
        set DOMAIN [class match -value $URI starts_with CLASS_HSF]
        HTTP::respond 301 Location "http://$DOMAIN"
    } else {
        pool $DEFAULT
    }
}
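
To confirm from a client that a redirect rule returns the expected status code and target, the response headers can be summarised with a small shell function. This is a sketch only: the function name redirect_summary is hypothetical, and it assumes it is fed raw response headers such as those printed by curl -sI.

```shell
# Read raw HTTP response headers on stdin and print "<status> <location>".
redirect_summary() {
    tr -d '\r' | awk '
        NR == 1 { status = $2 }                        # e.g. HTTP/1.1 301 ...
        tolower($1) == "location:" { location = $2 }   # header names are case-insensitive
        END { print status, location }
    '
}

# Example with canned headers instead of a live request.
printf 'HTTP/1.1 301 Moved Permanently\r\nLocation: https://www.domain.com/\r\n\r\n' | redirect_summary
```

Against a live virtual server, the headers would come from something like curl -sI http://domain.com/ piped into redirect_summary, where domain.com is the placeholder host used in the rules above.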

Reference:

Github Link