F5 Upgrade from v10 to v11 – Lessons Learned

Pre-Maintenance Checks:

Make sure that the F5 is running in “Volume Partition” mode. “lvscan” within bash should provide output like this:

config # lvscan
ACTIVE '/dev/vg-db-sda/dat.share.1' [30.00 GB] normal
ACTIVE '/dev/vg-db-sda/dat.log.1' [7.00 GB] normal
ACTIVE '/dev/vg-db-sda/dat.swapvol.1' [1.00 GB] normal
ACTIVE '/dev/vg-db-sda/set.1.root' [392.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.1._usr' [2.48 GB] normal
ACTIVE '/dev/vg-db-sda/set.1._config' [3.00 GB] normal
ACTIVE '/dev/vg-db-sda/set.1._var' [3.00 GB] normal
ACTIVE '/dev/vg-db-sda/set.2.root' [256.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.2._usr' [1.34 GB] normal
ACTIVE '/dev/vg-db-sda/set.2._config' [512.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.2._var' [3.00 GB] normal
ACTIVE '/dev/vg-db-sda/set.3.root' [256.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.3._usr' [1.34 GB] normal
ACTIVE '/dev/vg-db-sda/set.3._config' [512.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.3._var' [3.00 GB] normal
ACTIVE '/dev/vg-db-sda/dat.maint.1' [300.00 MB] normal
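If you want to script this check, here is a minimal Python sketch that parses lvscan output in the format shown above and lists the boot sets it finds. The function name boot_sets is hypothetical, and treating the per-slot set.N.* volumes as the sign of a volume-based layout is my assumption drawn from the sample output, not an F5-documented test.

```python
import re

# Matches logical volumes such as /dev/vg-db-sda/set.1.root in lvscan output.
# Assumption: volume-based systems show per-slot volumes named set.<N>.<part>.
SET_VOLUME = re.compile(r"/dev/[^/\s]+/set\.(\d+)\.(\w+)")


def boot_sets(lvscan_output):
    """Return a dict mapping boot set number -> sorted list of its partitions."""
    sets = {}
    for line in lvscan_output.splitlines():
        match = SET_VOLUME.search(line)
        if match:
            sets.setdefault(int(match.group(1)), []).append(match.group(2))
    return {number: sorted(parts) for number, parts in sets.items()}


# Shortened sample taken from the listing above.
sample = """\
ACTIVE '/dev/vg-db-sda/dat.share.1' [30.00 GB] normal
ACTIVE '/dev/vg-db-sda/set.1.root' [392.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.1._usr' [2.48 GB] normal
ACTIVE '/dev/vg-db-sda/set.2.root' [256.00 MB] normal
"""

found = boot_sets(sample)
print(found)
```

An empty result would suggest that the output does not contain any set.N.* volumes and the device needs a closer look before the upgrade.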

Make sure there are no HTTP Classes in your configuration other than the default “httpclass”. You can check this under Local Traffic ›› Profiles : Protocol : HTTP Class in the F5 GUI.

Make sure that there are no spaces in profile names (see SOL15144).

As a precaution, go through your configuration and remove any unused elements, such as a “Test Virtual Server” or old configuration that is no longer in use.

Upload the new code version to the F5 LTM via the GUI, SCP or any other preferred method.

Maintenance:

Before performing any F5 code upgrade, make sure that the “Service Check Date” on the device is on or after the License Check Date for the new code version, as listed in SOL7727.

If it is earlier, the maintenance has to include a license re-activation step before proceeding with the code upgrade. This step takes about 10-20 minutes.
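The date comparison is simple enough to sketch in Python. The function name and the dates below are hypothetical; take the real values from the device license and from SOL7727 for the target version. The sketch treats equal dates as acceptable, which is my reading of the SOL.

```python
from datetime import date


def needs_reactivation(service_check_date, license_check_date):
    """True when the license must be re-activated before the upgrade."""
    return service_check_date < license_check_date


# Hypothetical dates for illustration only.
older_license = needs_reactivation(date(2013, 1, 15), date(2013, 6, 1))
newer_license = needs_reactivation(date(2014, 2, 1), date(2013, 6, 1))

print(older_license)  # True: re-activate first
print(newer_license)  # False: proceed with the upgrade
```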

Copy the running configuration to the new code version location with cpcfg. Example: cpcfg HD1.2

Although “cpcfg HD1.x” has worked most of the time, I would recommend backing up the .UCS file to a remote location and also saving a copy in “/shared/tmp/<UCS File>”. After saving the UCS file in “/shared/tmp/”, you can use “load /sys ucs <path/to/UCS> no-license” to load the configuration, as noted in SOL12880.

Reboot. It takes about 20 minutes for the device to load the new configuration and come back up. If you are running an HA pair, upgrade the Standby F5 first. It will take a few minutes for the Standby F5 to become “Active”, so be patient.

A conservative estimate for the maintenance window is about 1 hour. I would recommend giving yourself 90 minutes if you are not familiar with F5 code upgrades. Downtime can be minimized if you have the BIG-IP devices in a High Availability Active/Standby pair.

TCP Offloading, Windows 2012 Servers

If you are running Windows 2012 servers with TCP Offloading enabled and seeing network issues ranging from low throughput to loss of connectivity, the following articles provide recommendations on disabling TOE on the servers:

http://www.rackspace.com/knowledge_center/article/disabling-tcp-offloading-in-windows-server-2012

http://www.rackspace.com/knowledge_center/article/installing-xenserver-tools-on-next-generation-windows-cloud-servers

Brocade ADX Crash

Brocade ADX can crash due to failure of Management Processor (MP) or Barrel Processor (BP). In order to narrow down the issue, check the output for the following commands:

#show version

This will show you the uptime of the MP

#sh ver | i uptime

The system uptime is 5 days 12 hours 44 minutes 17 seconds

Log into “rcon virtual” and execute the following command:

#asm show version

This will show you the uptime of the BP

#asm sh ver

Copyright (c) 1996-2009 Brocade Communications Systems, Inc.
Boot  SW: Version 12.04.00 Nov 21 2011 15:09:57 PST label: dob12400
Monitor  SW: Version 12.04.00 Nov 21 2011 15:09:57 PST label: dob12400
System  SW: Version 12.04.00 Jul  9 2013 16:28:27 PDT label: ASB12400h
The system has been up for 1 hours 8 minutes 35 seconds

As seen in the outputs above, taken from an ADX that crashed, the uptime for the MP is longer than the uptime for the BP. This indicates that the BP reloaded while the MP did not.
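If you collect these outputs from many devices, the comparison can be scripted. Below is a minimal Python sketch that converts the two uptime lines shown above into seconds and compares them. The function name uptime_seconds is hypothetical, and the parser only assumes the "N days N hours N minutes N seconds" wording seen in these two samples.

```python
import re

UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def uptime_seconds(line):
    """Convert '5 days 12 hours 44 minutes 17 seconds' style text to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s+(day|hour|minute|second)s?", line):
        total += int(value) * UNIT_SECONDS[unit]
    return total


mp = uptime_seconds("The system uptime is 5 days 12 hours 44 minutes 17 seconds")
bp = uptime_seconds("The system has been up for 1 hours 8 minutes 35 seconds")

if bp < mp:
    print("BP uptime is shorter than MP uptime: the BP reloaded, the MP did not")
elif mp < bp:
    print("MP uptime is shorter than BP uptime: the MP reloaded, the BP did not")
```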

Once this has been narrowed down, capture the dump file by running the following command:

#dm save mp (For MP)

#dm save bp 1 1 (For BP 1)

Brocade ADX – JSession Persistence

This is an example for JSession ID based persistence for Brocade ADX:


csw-rule "JSESSION" header "cookie" pattern "JSESSIONID=" case-insensitive
csw-rule "URI" url pattern "JSESSIONID=" case-insensitive                 
                                                                          
csw-policy "CSW_JSESSION" case-insensitive                            
 match "JSESSION" persist offset 0 length 32 passive-persist              
 match "URI" persist offset 0 length 32 passive-persist                   
 default forward 1        

In the above policy, the persistence decision is made based on the first 32 characters (offset 0, length 32) of the JSession ID. If this number has to be different, the CSW policy has to be altered to reflect the right number of characters. The csw-policy “CSW_JSESSION” then has to be applied to the Virtual Server.
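To make the offset and length semantics concrete, here is a small Python model of which part of the cookie ends up as the persistence key. This is only an illustration, not how the ADX implements it: the function name persist_key and the sample cookie are made up, and I am assuming the offset is counted from the end of the matched pattern.

```python
def persist_key(header_value, pattern="JSESSIONID=", offset=0, length=32):
    """Return the substring used for persistence, or None if pattern is absent."""
    start = header_value.lower().find(pattern.lower())  # case-insensitive match
    if start == -1:
        return None
    begin = start + len(pattern) + offset
    return header_value[begin:begin + length]


# Hypothetical cookie header with a 32-character session ID plus a route suffix.
cookie = "JSESSIONID=0123456789ABCDEF0123456789ABCDEF.node1; other=1"
key = persist_key(cookie)
print(key, len(key))
```

With length 32 the ".node1" suffix is not part of the key, so changing the length changes how much of the ID the persistence decision depends on.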

Programming Language Popularity

If you are new to programming and looking to figure out the “hot language” to master, this link could be of assistance.

The link provides a list of the programming languages that are being actively searched for online. It isn’t a perfect tool, but it is a good start. Although I am not a programming expert, it is certainly good to master “Object Oriented Programming”. This paradigm is used by many high-level programming languages.

iRULE – String Usage


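when HTTP_REQUEST {
    # Match on a lowercased copy so that /M/ and /m/ are both caught
    set URI [string tolower [HTTP::uri]]

    if { $URI starts_with "/m/" } {
        # Drop the first two characters ("/m") and keep the rest, including the "/"
        set NEW_URI [string range [HTTP::uri] 2 end]
        HTTP::respond 301 Location "http://www.domain.com$NEW_URI"
    }
}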


$ curl -Ik http://10.10.10.10/m/URI
HTTP/1.0 301 Moved Permanently
Location: http://www.domain.com/URI
Server: BigIP
Connection: Keep-Alive
Content-Length: 0

Replacing

set NEW_URI [string range [HTTP::uri] 2 end]

with

set NEW_URI [string trimleft [HTTP::uri] /m/]

will provide the following output:


$ curl -Ik http://10.10.10.10/m/URI
HTTP/1.0 301 Moved Permanently
Location: http://www.domain.comURI
Server: BigIP
Connection: Keep-Alive
Content-Length: 0

Note the lack of “/” in the “Location” header, just before the start of URI.

In order to retain the “/”, I did a string trimleft with “/m” instead of “/m/” i.e.,

set NEW_URI [string trimleft [HTTP::uri] /m]

instead of

set NEW_URI [string trimleft [HTTP::uri] /m/]

However, the result was the same.

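Based on this devcentral article, it looks like “string range” is a better option, as “string trimleft” treats its last argument as a set of individual characters to strip rather than as a “string of characters”. That is why “/m” and “/m/” give the same result: both describe the set made up of “/” and “m”.

Another issue to remember is that if we use “string trimleft”, we get the following output: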

$ curl -Ik http://10.10.10.10/m/mURI
HTTP/1.0 301 Moved Permanently
Location: http://www.domain.comURI
Server: BigIP
Connection: Keep-Alive
Content-Length: 0

and “string range” provides the following:

$ curl -Ik http://10.10.10.10/m/mURI
HTTP/1.0 301 Moved Permanently
Location: http://www.domain.com/mURI
Server: BigIP
Connection: Keep-Alive
Content-Length: 0

So, if you are just looking to remove specific “string of characters”, “string range” is a better option based on testing on 11.x code version.
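The same difference is easy to reproduce outside of the F5. As far as I know, Python's str.lstrip has the same character-set behaviour as “string trimleft”, and slicing behaves like “string range”, so this short sketch shows the effect on the test URI used above:

```python
uri = "/m/mURI"

# str.lstrip, like Tcl's string trimleft, treats its argument as a set of
# characters and strips every leading character that belongs to the set.
trimmed = uri.lstrip("/m")

# Slicing, like Tcl's string range, drops a fixed number of characters.
ranged = uri[2:]

print(trimmed)  # URI
print(ranged)   # /mURI
```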

RackConnect v2 Terminology

If you are a Rackspace customer and use RackConnect, you may have encountered Rackspace-specific terminology while interacting with Rackspace via ticket or phone. This post explains some of these terms:

“EDGE Device” is the outermost network device (Firewall or Load Balancer) in your infrastructure.

“CONNECTED Device” is the network device (Firewall or Load Balancer) that is connected to the Public Cloud infrastructure.

“RC Environment” refers to the combination of the “EDGE” and “CONNECTED” devices and the cloud account in a particular Data Center. A customer can have multiple “RC Environments” within the same DC or in multiple DCs.

Example:
Customer A can have “Primary(LON)” and “Secondary(LON)”: 2 RC Environments (Primary & Secondary) in the same DC, LON

Customer B can have “Primary(DFW)” and “Secondary(LON)”: 2 RC Environments (Primary & Secondary) in 2 different DCs, DFW & LON

Traffic Flow between Cloud & Dedicated Servers in RCv2 & RCv3 is seen here:
http://www.rackspace.com/knowledge_center/article/comparing-rackconnect-v30-and-rackconnect-v20#i

Vyatta – VPN bug

After the site-to-site VPN has been fully configured, the following error shows up when running the command shown:

vyatta# run show vpn ipsec status
IPSec Process Running PID: 3300

1 Active IPsec Tunnels

IPsec Interfaces :
eth0    (no IP on interface statically configured as local-ip for any VPN peer)

After searching online, I came across a community post that wasn’t of much help. Based on input from Brocade TAC engineers, the above bug is cosmetic and doesn’t affect the functionality of the site-to-site IPsec tunnel configured on the Vyatta device. For now, this can be ignored as a benign bug.

Isakmp Keepalive – Cisco ASA & Checkpoint

Cisco ASA has ISAKMP keepalive enabled by default. You can see this by running “show run all” and looking under the tunnel-group configuration for the specific IPsec tunnel.

 

Default Setting for a tunnel-group:

tunnel-group 10.10.10.10 ipsec-attributes
ikev1 pre-shared-key *****
peer-id-validate req
no chain
no ikev1 trust-point
isakmp keepalive threshold 10 retry 2
no ikev2 remote-authentication
no ikev2 local-authentication

 

Configuration change required to disable ISAKMP keepalive:

tunnel-group 10.10.10.10 ipsec-attributes
isakmp keepalive disable

 

After Change:

tunnel-group 10.10.10.10 ipsec-attributes
ikev1 pre-shared-key *****
peer-id-validate req
no chain
no ikev1 trust-point
isakmp keepalive disable
no ikev2 remote-authentication
no ikev2 local-authentication

 

Error Message Seen in the Cisco ASA Logs:

Jan 26 05:10:03 [IKEv1]IP = 10.10.10.10, Keep-alives configured on but peer does not support keep-alives (type = None)
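If you have many tunnels, it can help to pull the affected peers out of the logs. Here is a minimal Python sketch that matches the message format shown above; the function name is hypothetical and the pattern is based only on this one sample line, so other ASA versions may word it differently.

```python
import re

# Pattern built from the single log line shown above.
KEEPALIVE_MISMATCH = re.compile(
    r"IP = (?P<peer>\d+\.\d+\.\d+\.\d+), "
    r"Keep-alives configured on but peer does not support keep-alives"
)


def peers_without_keepalive(log_lines):
    """Return the sorted list of peer IPs that logged the keepalive mismatch."""
    peers = set()
    for line in log_lines:
        match = KEEPALIVE_MISMATCH.search(line)
        if match:
            peers.add(match.group("peer"))
    return sorted(peers)


log = [
    "Jan 26 05:10:03 [IKEv1]IP = 10.10.10.10, Keep-alives configured on "
    "but peer does not support keep-alives (type = None)",
]
affected = peers_without_keepalive(log)
print(affected)
```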

 

The following is taken from the Cisco documentation linked below:

“If you configure ISAKMP keepalives, it helps prevent sporadically dropped LAN-to-LAN or Remote Access VPN, which includes VPN clients, tunnels and the tunnels that are dropped after a period of inactivity. This feature lets the tunnel endpoint monitor the continued presence of a remote peer and report its own presence to that peer. If the peer becomes unresponsive, the endpoint removes the connection. In order for ISAKMP keepalives to work, both VPN endpoints must support them.

In some situations, it is necessary to disable this feature in order to solve the problem, for example, if the VPN Client is behind a Firewall that prevents DPD packets.”

http://www.cisco.com/c/en/us/support/docs/security/asa-5500-x-series-next-generation-firewalls/81824-common-ipsec-trouble.html#solution07

In my experience, “Isakmp Keepalive” compatibility between vendors, specifically Cisco and Checkpoint, doesn’t exist, and it is better to disable it rather than leave it enabled on the Cisco ASA. If enabled between incompatible devices, it can lead to the tunnel dropping sporadically for no apparent reason.