Viprion Chassis – Adding New Blade

Normally, when you add a new blade, the current master blade will sync its configuration onto the new blade. Make sure that the existing blade is the master, and back up all relevant configuration both on and off the device before adding the new blade.

Make sure that the blades are the same model; see SOL16992.

Look at SOL13965 to identify the master blade.
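
As a quick check (a sketch based on v11.x tmsh; SOL13965 has the version-specific procedure), the cluster status shows which slot is currently primary:

# Show the cluster members; the primary slot is flagged in the output
tmsh show sys cluster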

Considerations when moving a blade between chassis: SOL10541271

F5 – Automating CLI Execution

Purpose:
This is a simple way to automate CLI command execution on multiple F5 devices using Bash and Expect (TCL) scripting. The scripts have been tested on Linux and Mac machines.

How to use it:
There is a bash script (F5_Bash_v1) that collects the username/password for F5 access, a text file (F5_Host.txt) that stores the management IP addresses of multiple F5 devices, and a TCL script (F5_Out_v1.exp) that executes CLI commands on the F5 devices.

The bash script is the master script that obtains the username/password and executes the TCL script for multiple F5 devices.

Setup:
On the Linux machine that is used to connect to the F5 devices:

# Create a directory (the scripts below assume it lives in your home directory)
mkdir ~/F5_Check

Within the “F5_Check” directory, create the following 3 files:
F5_Host.txt
F5_Bash_v1
F5_Out_v1.exp

File Content: F5_Host.txt contains the management IP addresses of the F5 devices.
Example:

$ cat F5_Host.txt
10.12.12.200
10.12.12.201
10.12.12.202
10.12.12.203

File Content: F5_Bash_v1

#!/bin/bash
# Collect the username and password for F5 access
echo -n "Enter the username: "
read -e user
echo -n "Enter the password: "
read -s -e password
echo -ne '\n'

# Feed the expect script the device list & the collected username & password
for device in $(cat ~/F5_Check/F5_Host.txt); do
    ./F5_Out_v1.exp "$device" "$password" "$user"
done
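
Make both scripts executable before the first run (a usage sketch, assuming the file names above):

cd ~/F5_Check
chmod +x F5_Bash_v1 F5_Out_v1.exp
./F5_Bash_v1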

File Content: F5_Out_v1.exp

#!/usr/bin/expect -f

# Set variables
set hostname [lindex $argv 0]
set password [lindex $argv 1]
set username [lindex $argv 2]

# Log results
log_file -a ~/F5_Check/F5LOG.log

# Announce which device we are working on and the time
send_user "\n"
send_user ">>>>>  Working on $hostname @ [exec date] <<<<<\n"
send_user "\n"

# SSH access to device
spawn ssh $username@$hostname

expect {
    "no)? " {
        send "yes\r"
        expect "*assword: "
        sleep 1
        send "$password\r"
    }
    "*assword: " {
        sleep 1
        send "$password\r"
    }
}

expect "(tmos)#"
send "sys\r"
expect "(tmos.sys)#"

send "show software\r"
expect "#"
send "exit\r"
expect "#"
send "quit\r"

expect ":~\$"
exit
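
To collect different data, replace the "show software" line with any other tmsh command, e.g. send "show sys version\r" (show sys version is a standard tmsh command; adjust the expect prompt patterns if needed).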

F5 TMM Crash

We have a pair of F5 Viprions that are connected to Cisco Nexus 7K (Aggr A & B) switches as shown here:

[Network diagram: F5 Viprion pair connected to the Nexus 7K Aggr A & B switches]

TMM Crash:

The TMM crashed in one of the F5 Viprions when the following conditions were met:

  1. Your BIG-IP system is processing a large number of active connections.
  2. You attempt to display the connection table using the tmsh show sys connection command.
  3. You then attempt to cancel the tmsh show sys connection command by using the Ctrl+C key sequence while the command is still in the process of displaying the connection table.

SOL15246

When the Viprion is handling hundreds of thousands of connections and “show sys connection” is executed and subsequently cancelled with “Ctrl+C” before all the connections are displayed, the TMM will crash. This affects both multi-blade systems like the Viprion and single units.
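
A safer way to inspect a busy connection table is to filter it instead of dumping everything (a sketch; 10.10.10.10 is a placeholder client address):

# Show only the connections for a single client IP
tmsh show sys connection cs-client-addr 10.10.10.10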

BugID: 595773

For the Viprions, apart from the TMM crash, the “Ctrl+C” is not propagated to all the blades in a multi-blade Viprion chassis. This has been identified as BugID: 595773. It has been fixed in 11.5.6 code and may be retroactively fixed in 11.5.4 + HF2 (unconfirmed).

BugID: 579284

Under certain conditions, memory within mcpd can be corrupted. This memory corruption within mcpd has been identified as BugID: 579284. The previously stated BugID: 595773 will trigger BugID: 579284, resulting in memory corruption within mcpd.

The memory corruption was serious enough to cause loss of inter-blade connectivity; each blade then acted as a stand-alone system, which caused packets to loop within the network.

This bug will probably be fixed in 12.x code.

Logs from the Viprion:

 May  3 16:20:15 slot1/LB1-domain.com err tmsh[29166]: 01420006:3: operation canceled
 May  3 16:20:31 slot3/LB1-domain.com crit tmm6[17982]: 01010020:2: MCP Connection aborted, exiting
 May  3 16:20:31 slot4/LB1-domain.com info bcm56xxd[9563]: 012c0012:6: Reprogram vDAG cmp state to 0xb for vtrunk default (previous state 0xf)
 May  3 16:20:31 slot3/LB1-domain.com info bcm56xxd[9919]: 012c0012:6: Reprogram vDAG cmp state to 0xb for vtrunk default (previous state 0xf)
 May  3 16:20:31 slot1/LB1-domain.com info bcm56xxd[8234]: 012c0012:6: Reprogram vDAG cmp state to 0xb for vtrunk default (previous state 0xf)
 ...
 May  3 16:20:31 slot4/LB1-domain.com info bcm56xxd[9563]: 012c0012:6: Reprogram vDAG cmp state to 0x2 for vtrunk default (previous state 0xa)
 May  3 16:20:31 slot1/LB1-domain.com info bcm56xxd[8234]: 012c0012:6: Reprogram vDAG cmp state to 0x2 for vtrunk default (previous state 0xa)
 May  3 16:20:31 slot4/LB1-domain.com info bcm56xxd[9563]: 012c0012:6: FFP HDAG installed for default (cmp state 0x2)
 May  3 16:20:31 slot1/LB1-domain.com info bcm56xxd[8234]: 012c0012:6: FFP HDAG installed for default (cmp state 0x2)
 
 ... and the blade logs a restart.

The following logs were identified in the Cisco Nexus 7K that was connected to the Viprion:

2016 May  3 16:20:26 switch-1 %FWM-2-STM_LOOP_DETECT: Loops detected in the network for mac 4111.3111.abc1 among ports Po66 and Po11 on vlan 100 - Disabling dynamic learning notifications for a period between 120 and 240 seconds on vlan 100
2016 May  3 16:20:33 switch-1 %FWM-2-STM_LOOP_DETECT: Loops detected in the network for mac 4111.3111.a6c1 among ports Po11 and Po66 on vlan 200 - Disabling dynamic learning notifications for a period between 120 and 240 seconds on vlan 200

Summary of the 2 conditions that we hit:

  1. TMM crash caused by using “Ctrl+C” to break the “show sys connection” command.
  2. “Ctrl+C” not propagating to all the blades, causing memory corruption that resulted in loss of inter-blade connectivity and made the multi-blade Viprion create a forwarding loop.

iRule – Redirects

#Different Redirects

########################
 302 Redirects
########################
when HTTP_REQUEST {
    HTTP::redirect https://www.domain.com/
}
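
# NOTE: HTTP::redirect always issues a 302; for a permanent (301)
# redirect, use HTTP::respond with an explicit status code, as below.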

########################
 301 Redirects
########################
when HTTP_REQUEST {
    HTTP::respond 301 Location "https://www.domain.com/"
}

############################
IF-Conditional Redirect:
############################
# Matching a condition
when HTTP_REQUEST {
    if {[HTTP::host] eq "domain.com"} {
        HTTP::respond 301 Location "https://www.domain.com/"
    }
}

# NOT matching a condition
when HTTP_REQUEST {
    if { not ([HTTP::host] eq "domain.com") } {
        HTTP::respond 301 Location "https://www.domain.com/"
    }
}

# Multiple conditions
when HTTP_REQUEST {
    if { ([HTTP::host] eq "domain.com") and ([HTTP::uri] eq "/login")} {
        HTTP::respond 301 Location "https://www.domain.com/login/"
    }
}

#if & elseif
when CLIENT_ACCEPTED {
   set default_pool [LB::server pool]
}
when HTTP_REQUEST {
    if { ([HTTP::host] eq "domain1.com") } {
        HTTP::respond 301 Location "https://www.domain1.com/login/"
    } elseif { ([HTTP::host] eq "domain2.com") } {
        HTTP::respond 301 Location "https://www.domain2.com/login/"
    } else {
        pool $default_pool
    }
}

##################################
Switch-Conditional Redirect:
##################################

#Check multiple unique domains
when CLIENT_ACCEPTED {
   set default_pool [LB::server pool]
}
when HTTP_REQUEST {
   switch -glob [HTTP::host] {
      "domain1.com" {
         HTTP::respond 301 Location "https://www.domain1.com/"
      }
      "domain2.com" {
         HTTP::respond 301 Location "https://www.domain2.com/"
      }
      }
      default {
         pool $default_pool
      }
   }
}

#Redirect to same URL
when CLIENT_ACCEPTED {
   set default_pool [LB::server pool]
}
when HTTP_REQUEST {
   switch -glob [HTTP::host] {
      "domain1.com" -
      "domain2.com" {
         HTTP::respond 301 Location "https://www.domain.com/"
      }
      default {
         pool $default_pool
      }
   }
}

############################
Data Group
############################

class CLASS_HSF { 
   { 
      "/str1" { "domain1.com" } 
      "/str2" { "domain2.com" } 
   } 
}
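
The class above is in the bigip.conf data-group format; on v11.x an equivalent internal data group can be created from tmsh (a sketch using the same names and records):

# String data group mapping URI prefixes to target domains
tmsh create ltm data-group internal CLASS_HSF type string records add { "/str1" { data "domain1.com" } "/str2" { data "domain2.com" } }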


when CLIENT_ACCEPTED {
    set DEFAULT [LB::server pool]
}

when HTTP_REQUEST {
    set HOST [string tolower [HTTP::host]]
    set URI [string tolower [HTTP::uri]]

    if { $HOST equals "www.domainhs.com" } {
        HTTP::respond 301 Location "http://www.domain.com[HTTP::uri]"
    } elseif { [class match $URI starts_with CLASS_HSF] } {
        set DOMAIN [class match -value $URI starts_with CLASS_HSF]
        HTTP::respond 301 Location "http://$DOMAIN"
    } else {
        pool $DEFAULT
    }
}
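
Once saved, the iRule still has to be attached to a virtual server, e.g. via tmsh (VS_EXAMPLE and RULE_REDIRECT are placeholder names):

# Attach the redirect iRule to a virtual server
tmsh modify ltm virtual VS_EXAMPLE rules { RULE_REDIRECT }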

Reference: GitHub link


F5 iRule – URI, Path & Query

F5 iRules have the following commands that can be a bit confusing. This is a short post to remember the differences between them.

[HTTP::uri] – everything from the “/” after the domain name to the end.

[HTTP::path] – everything from the “/” after the domain name to the character before the “?”.

[HTTP::query] – everything after the “?”.

[HTTP::host] – the domain name.

In short:
[HTTP::uri] == [HTTP::path] + ? + [HTTP::query] (when a query string is present)

Example:
http://www.example.com/main/index.jsp?user=test&login=check

 [HTTP::uri]   - URI:   /main/index.jsp?user=test&login=check
 [HTTP::path]  - PATH:  /main/index.jsp
 [HTTP::query] - Query: user=test&login=check
 [HTTP::host]  - Host:  www.example.com

F5 v11.x Device Trust Group

A week ago, I was upgrading an HA F5 pair from 11.5.1 to 11.5.3 and noticed the existence of a default “device_trust_group” in sync-only mode in the GUI. I did not create it; it just showed up, and there wasn’t a way to delete it. Apparently, this group has always existed in the background but was only exposed via the GUI in the later 11.x versions. Based on my experience, it wasn’t exposed via the GUI in 11.5.1 but was exposed via the GUI from 11.5.6.

[Screenshot: the default device_trust_group shown in the GUI]

Reference: DevCentral

F5 Pool & Nodes

A Node is an IP address. Example: 10.10.10.10

A Pool Member is an IP Address + Port. Example: 10.10.10.10:8080

A Pool is a collection of Pool Members.
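
As a tmsh sketch of the relationship (POOL_EXAMPLE and the addresses are placeholders), creating a pool with new member addresses implicitly creates the corresponding node objects:

# Create a pool with two members; nodes 10.10.10.10 and 10.10.10.11 are created implicitly
tmsh create ltm pool POOL_EXAMPLE members add { 10.10.10.10:8080 10.10.10.11:8080 }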

If you are managing an enterprise-grade F5 infrastructure, there may come a time when you have to replace a specific IP address with another, or replace multiple IP addresses across one or more F5 devices.

This is a quick one-liner that will help you to identify all the pools that contain an IP address:

# Set node_hostname / node_ip to the values you are searching for
tmsh -q list ltm pool one-line | grep -E "($node_hostname|$node_ip)" | awk '{ print $3 }'

The above command should be run from “bash”.

Accessing F5’s bash:

root@LB1(/S1-green-P:Active)(tmos)# run util bash

[root@LB1:/S1-green-P:Active] ~ #

NOTE: “list ltm pool one-line” is available in 11.x code but not in 10.x code. The command lists each pool on a single line.

An Example:

[root@LB1:/S1-green-P:Active] ~ # tmsh -q list ltm pool one-line | grep -E '10.10.10.19' | awk '{ print $3 }'
 POOL_ta_lt_http_private
 POOL_ta_lt_private
 POOL_ta_lt_public
 POOL_ta_lt-maintainance
 POOL_ta_lt-private
 POOL_ta_lt-public

Reference: Devcentral – Pool

What if you want the pool member alongside the pool?

tmsh -q list ltm pool one-line | grep -E "$check:[0-9]+" | while read line; do myipport=$(echo $line | grep -oE "$check:[0-9]+"); echo $line | awk '{printf "%s ",$3}'; echo "$myipport "; done

In the above line, replace “$check” with the IP address that you are checking.

[root@LB1:/S1-green-P:Active] ~ # tmsh -q list ltm pool one-line | grep -E "10.10.10.19:[0-9]+" | while read line; do myipport=$(echo $line | grep -oE "10.10.10.19:[0-9]+"); echo $line | awk '{printf "%s ",$3}'; echo "$myipport "; done

POOL_ta_lt_http_private 10.10.10.19:10542 
POOL_ta_lt_public 10.10.10.19:10253 
POOL_ta_lt_maintainance 10.10.10.19:10251 
POOL_ta_lt_private 10.10.10.19:10092 
POOL_ta_lt_public 10.10.10.19:10093
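
The same logic reads more easily as a small script (a sketch; pass the IP address as the first argument):

#!/bin/bash
# find_member.sh - print "pool member" pairs for a given IP address
check="$1"
tmsh -q list ltm pool one-line | grep -E "$check:[0-9]+" | while read -r line; do
    pool=$(echo "$line" | awk '{ print $3 }')
    # a pool can contain the same IP on several ports; print each pair
    echo "$line" | grep -oE "$check:[0-9]+" | while read -r member; do
        echo "$pool $member"
    done
done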

F5 CLI Display Length

While running CLI commands on the F5, you may run into display-length prompts:

(tmos.ltm)# show pool members | grep "10.10.10.10:"
Display all 158 items? (y/n) y

If you are executing a script on the F5 to obtain data, the "Display all xxx items? (y/n)" prompt can be a problem. We can alter the display threshold using the following command:

(/Common)(tmos)# modify cli preference pager disabled display-threshold ?

Specifies the maximum number of objects that tmsh will display without requiring a user response to the question, "Display all <number> items? (y/n)". You can specify from 0 (zero) through 4,294,967,295 objects. 0 (zero) specifies that tmsh will display any number of objects without the warning.

(/Common)(tmos)#  modify cli preference pager disabled display-threshold 0
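
To restore the interactive behavior later, simply re-enable the pager (the default display-threshold varies by version, so only the pager is re-enabled here):

(/Common)(tmos)# modify cli preference pager enabled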