Multipath

1. Multipath Topology (read from multipath -l output)

The four colon-separated numbers that identify a path are the host (i.e. HBA) number, the channel (always 0 in our shop since we only use single-channel HBAs), the SCSI target (which represents the switch in our case), and the LUN.

[root@dcdrpcora9 ~]# multipath -l
asm_vol2 (36005076801870036a000000000000d57) dm-3 IBM,2145
size=250G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| |- 3:0:0:0 sdc        8:32  active undef running
| `- 2:0:1:0 sde        8:64  active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 2:0:0:0 sda        8:0   active undef running
  `- 3:0:1:0 sdg        8:96  active undef running
asm_vol1 (36005076801870036a000000000000d58) dm-2 IBM,2145
size=250G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| |- 2:0:0:1 sdb        8:16  active undef running
| `- 3:0:1:1 sdh        8:112 active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 3:0:0:1 sdd        8:48  active undef running
  `- 2:0:1:1 sdf        8:80  active undef running
The following diagram represents the topology given by the multipath -l output shown above. The top row is the HBA cards (as in /sys/class/fc_host/host? or /sys/class/scsi_host/host?). Since we only use fibre channel HBAs to make device mapper multipaths, only HBAs 2 and 3 are shown; HBAs 0 and 1 are not fibre channel cards. Channel numbers are ignored; they are all 0. The two switches are in the second row, each providing 4 paths to the two storage LUNs at the bottom, two paths coming from HBA2 and two from HBA3.
               HBA2 HBA3       <-- FC host or HBA
               /  \ /  \
              /    V    \
             /    / \    \
           [ SW0 ]   [ SW1 ]   <-- SCSI target or our switch
           /\   /\   /\   /\
          0  1 0  1 0  1 0  1  <-- LUN
        sda  b c  d e  f g  h  <-- path
         /    V    V    V    \
        /    / \  / \  / \    \
       /    /   \/   \/   \    \
      /    /    /\   /\    \    \
     /    /    /  \ /  \    \    \
     ------------- V -------------
     |   LUN0    |/ \|   LUN1    |   <-- LUN
     |  asm_vol2 |   |  asm_vol1 |
     -------------   -------------
    sda,sdc,sde,sdg  sdb,sdd,sdf,sdh <-- path
Take the first path in the multipath -l output as an example: 3:0:0:0 sdc. It originates from HBA3, goes through channel 0 (not shown in the diagram) and switch 0, and ends at LUN0, which is asm_vol2. Now look at the first path for asm_vol1 in the output, 2:0:0:1 sdb. It starts at HBA2, goes through channel 0 (not shown) and switch 0, and ends at LUN1, i.e. asm_vol1.
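
The same reading of the 4-number notation can be done mechanically. Below is a minimal sketch, assuming the output format shown above; decode_paths is a made-up helper name, not part of multipath-tools, and the "HBA/channel/target/LUN" labels are just my wording.

```shell
#!/bin/bash
# decode_paths: hypothetical helper. Reads "multipath -l" style text on
# stdin and labels each path with its host (HBA), channel, SCSI target
# and LUN, prefixed by the map name and the sd device.
decode_paths() {
  awk '
    / dm-[0-9]+ / { map = $1; next }          # header line: remember the map name
    {
      # look for a field shaped like H:C:T:L; the next field is the sd device
      for (i = 1; i < NF; i++)
        if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
          split($i, a, ":")
          printf "%s %s HBA%s channel%s target%s LUN%s\n", map, $(i+1), a[1], a[2], a[3], a[4]
        }
    }'
}

# Sample lines copied from the output above.
# On a live host this would be: multipath -l | decode_paths
decode_paths <<'EOF'
asm_vol2 (36005076801870036a000000000000d57) dm-3 IBM,2145
size=250G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| |- 3:0:0:0 sdc        8:32  active undef running
| `- 2:0:1:0 sde        8:64  active undef running
EOF
# prints:
# asm_vol2 sdc HBA3 channel0 target0 LUN0
# asm_vol2 sde HBA2 channel0 target1 LUN0
```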

2. Script to check multipath failures

#!/usr/bin/perl
#ck_multipaths.pl: Check active multipaths, alert if fewer than 4 paths (Yong 2013,2014)
#assume mapper device named like ^asm; if not, adjust regexp pattern as needed
use strict;
use warnings;

my $RECIPIENT='you@example.com,yourbuddy@example.com';
my $LOGFILE='/root/ck_multipaths.log';
my $LOGFILEHIST='/root/ck_multipaths.hist'; #accumulated history

my $HOSTNAME=qx(/bin/hostname -s); chomp $HOSTNAME; #strip newline so the mail subject stays on one line

my @mps = split /\n/, qx(/sbin/multipath -l);
my ($mp, $cnt); #header line of the mp (multipath) being counted, and its active path count

sub process_mp
{ print "$mp has $cnt active paths.\n"; #called at the next header (or end of output), so it reports the mp just finished
  if ($mp=~/^asm/ and $cnt<4)
  { my $TM=qx(/bin/date "+%Y%m%d %H:%M"); chomp $TM;
    open my $log, '>>', $LOGFILE or die "Can't open $LOGFILE for write: $!";
    print $log "$TM: $mp has $cnt active paths!\n";
    close $log;
  }
}

#move the previous run's log into the history file, then start with an empty log
if (-e $LOGFILE)
{ system "/bin/cat $LOGFILE >> $LOGFILEHIST";
  truncate "$LOGFILE", 0;
}
foreach (@mps)
{ if (/dm-/) #mp (multipath) header line
  { process_mp() if (defined $mp and defined $cnt);
    $cnt = 0;
    $mp = $_; #remember this header while its path lines are read
  }
  else
  { $cnt++ if /\d+:\d+ +\[?active/; #line pattern: "...major:minor active..." or "... [active"
  }
}

#the "finally" block: report the last mp, which has no following header
process_mp() if (defined $mp and defined $cnt);

system "/bin/mail -s \"Alert from $HOSTNAME\" $RECIPIENT < $LOGFILE" if -s $LOGFILE;
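
For a quick look without the logging and mailing, the counting loop of the script can be sketched in shell and awk. This is only an illustration of the same idea, assuming the output format shown in section 1; count_active_paths is a made-up name.

```shell
#!/bin/bash
# count_active_paths: hypothetical helper mirroring the counting loop of
# ck_multipaths.pl. Reads "multipath -l" style text on stdin and prints
# "<map name> <number of active paths>" for each map.
count_active_paths() {
  awk '
    / dm-[0-9]+ / {                        # header line starts a new map
      if (map != "") print map, cnt        # report the map just finished
      map = $1; cnt = 0; next
    }
    /[0-9]+:[0-9]+ +\[?active/ { cnt++ }   # path line: "major:minor active"
    END { if (map != "") print map, cnt }  # report the last map
  '
}

# Sample copied from the output in section 1.
count_active_paths <<'EOF'
asm_vol2 (36005076801870036a000000000000d57) dm-3 IBM,2145
size=250G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| |- 3:0:0:0 sdc        8:32  active undef running
| `- 2:0:1:0 sde        8:64  active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 2:0:0:0 sda        8:0   active undef running
  `- 3:0:1:0 sdg        8:96  active undef running
EOF
# prints: asm_vol2 4
```

On a live host, something like multipath -l | count_active_paths | awk '$1 ~ /^asm/ && $2 < 4' would list only the maps the Perl script alerts on.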

Yong Huang 2013,2014



My comments on multipath.conf settings
path_grouping_policy: When it's set to multibus for active/active devices, all paths are in one group, much like a hard disk with only a C: partition; this is easier to manage.
getuid_callout: Manually run the script to make sure it fetches the wwid correctly.
features: Make very sure not to set it to "1 queue_if_no_path" for Oracle RAC; either set it to "0" or don't set features at all.
path_checker: Setting it to tur is for active/passive only.
failback: Must be immediate for fast failover.
rr_min_io: A smaller value (than the default 1000) may be better for OLTP? Note that the number of requests that must be done before switching paths is not rr_min_io itself but rr_min_io multiplied by the path's priority value (when rr_weight is set to priorities).
no_path_retry: Must be set to fail for Oracle RAC, according to numerous Oracle and Red Hat articles. Make sure it's not overridden in a more specific section further down, such as devices{}.
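
To catch an override in a more specific section, it helps to list every uncommented features and no_path_retry line with its line number. Here is a rough sketch under my own assumptions: check_rac_settings is a made-up name, it only sees settings that sit on their own line, and the WARN rule simply encodes the two comments above (any features line mentioning queue_if_no_path, any no_path_retry other than fail).

```shell
#!/bin/bash
# check_rac_settings: hypothetical helper. Reads multipath.conf style text
# on stdin and prints each uncommented features / no_path_retry line as
# "<line number>: WARN|ok <setting>".
check_rac_settings() {
  awk '
    /^[ \t]*#/ { next }                                    # skip commented-out lines
    $1 != "features" && $1 != "no_path_retry" { next }     # only these two settings
    {
      $1 = $1                                              # collapse whitespace for tidy output
      bad = ($1 == "features" && /queue_if_no_path/) || ($1 == "no_path_retry" && $2 !~ /^"?fail"?$/)
      print NR ": " (bad ? "WARN" : "ok") " " $0
    }'
}

# Usage (the path is the usual location, adjust as needed):
#   check_rac_settings < /etc/multipath.conf
```

Because every occurrence is printed, a no_path_retry inside devices{} shows up next to the one in defaults{} and the two can be compared by eye.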


Our case
Sep 03 2014 at 04:51 PM -04:00
Our test shows that with no_path_retry set to fail, features commented out (no need to set it to "0 queue_if_no_path"), and a few other parameters that are probably not very relevant (polling_interval=10, path_selector="round-robin 0", path_checker=readsector0, rr_min_io=100), we no longer get the "multipathd blocked for xxx seconds" message and the server stays up.


After you make changes to multipath settings, reload the maps (multipath -r) and the multipathd service (service multipathd reload), and check the running configuration:

# multipathd -k
multipathd> show config
defaults {
        verbosity 2
        polling_interval 10
        udev_dir "/dev"
        multipath_dir "/lib64/multipath"
        path_selector "round-robin 0"
        path_grouping_policy multibus
        getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
        prio alua
        features "0"
...
You can also use this one-line command to do it: echo "show config" | multipathd -k
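
When only one setting matters, the show config output can be filtered instead of read by eye. This is a small sketch, assuming "defaults {" and the closing "}" each sit on their own line as in the listing above; defaults_value is a made-up name.

```shell
#!/bin/bash
# defaults_value KEY: hypothetical helper. Reads "show config" output on
# stdin and prints the value of KEY found inside the defaults { } section,
# or nothing if KEY is not there.
defaults_value() {
  awk -v key="$1" '
    $1 == "defaults" && $2 == "{" { in_defaults = 1; next }   # section starts
    in_defaults && $1 == "}"      { exit }                    # section ends
    in_defaults && $1 == key      {
      sub(/^[ \t]*[^ \t]+[ \t]+/, "")                         # drop the key, keep the value
      print
      exit
    }
  '
}

# Usage on a live host, combined with the one-line command above:
#   echo "show config" | multipathd -k | defaults_value polling_interval
```

The value is printed as it appears in the config, so quoted values such as path_selector keep their double quotes.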

Some very preliminary notes:

login as: oracle
myhost ~ $ cd /sys/class/fc_remote_ports
myhost fc_remote_ports $ sudo multipath -l > /tmp/multipath.out
[sudo] password for oracle:
myhost fc_remote_ports $ head /tmp/multipath.out #see what the output looks like
ASM_DATA37_CPB (36005076801870036a000000000000e69) dm-242 IBM,2145
size=250G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=0 status=active
| |- 0:0:7:16 sdoo 129:320 active undef running
| `- 1:0:7:16 sduc 66:576  active undef running
`-+- policy='round-robin 0' prio=0 status=enabled
  |- 0:0:5:16 sdmg 69:384  active undef running
  `- 1:0:6:16 sdmb 69:304  active undef running
ASM_DATA22_CPB (36005076801870036a000000000000e5a) dm-155 IBM,2145
size=250G features='0' hwhandler='0' wp=rw
myhost fc_remote_ports $ grep -- '- [0-9]:0' /tmp/multipath.out | cut -c6-10 | sort | uniq -c #assume single digit host
     77 0:0:1 <-- this host-target combination is used 77 times to form LUNs
     77 0:0:2
      1 0:0:4 <-- this combination is only used once
     74 0:0:5
      1 0:0:6
     74 0:0:7
     77 1:0:0
     77 1:0:1
      1 1:0:4
      1 1:0:5
     74 1:0:6
     74 1:0:7
myhost fc_remote_ports $ ls
rport-0:0-0  rport-0:0-10  rport-0:0-2  rport-0:0-4  rport-0:0-9  rport-1:0-1   rport-1:0-11  rport-1:0-3  rport-1:0-8
rport-0:0-1  rport-0:0-11  rport-0:0-3  rport-0:0-8  rport-1:0-0  rport-1:0-10  rport-1:0-2   rport-1:0-4  rport-1:0-9
myhost fc_remote_ports $ ls rport-0:0-0
device        fast_io_fail_tmo  node_name  port_name   power  scsi_target_id  supported_classes
dev_loss_tmo  maxframe_size     port_id    port_state  roles  subsystem       uevent
myhost fc_remote_ports $ for i in */scsi_target_id; do echo -n "$i: "; cat $i; done
rport-0:0-0/scsi_target_id: -1 <-- not a real fibre channel target
rport-0:0-10/scsi_target_id: 6
rport-0:0-11/scsi_target_id: 7
rport-0:0-1/scsi_target_id: 0
rport-0:0-2/scsi_target_id: 1
rport-0:0-3/scsi_target_id: 2
rport-0:0-4/scsi_target_id: 3
rport-0:0-8/scsi_target_id: 4
rport-0:0-9/scsi_target_id: 5
rport-1:0-0/scsi_target_id: -1 <-- same here
rport-1:0-10/scsi_target_id: 6
rport-1:0-11/scsi_target_id: 7
rport-1:0-1/scsi_target_id: 0
rport-1:0-2/scsi_target_id: 1
rport-1:0-3/scsi_target_id: 2
rport-1:0-4/scsi_target_id: 3
rport-1:0-8/scsi_target_id: 4
rport-1:0-9/scsi_target_id: 5
myhost fc_remote_ports $ for i in */roles; do echo -n "$i: "; cat $i; done
rport-0:0-0/roles: Directory Server
rport-0:0-10/roles: FCP Target, FCP Initiator
rport-0:0-11/roles: FCP Target, FCP Initiator
rport-0:0-1/roles: FCP Target, FCP Initiator
rport-0:0-2/roles: FCP Target, FCP Initiator
rport-0:0-3/roles: FCP Target, FCP Initiator
rport-0:0-4/roles: FCP Target, FCP Initiator
rport-0:0-8/roles: FCP Target, FCP Initiator
rport-0:0-9/roles: FCP Target, FCP Initiator
rport-1:0-0/roles: Directory Server
rport-1:0-10/roles: FCP Target, FCP Initiator
rport-1:0-11/roles: FCP Target, FCP Initiator
rport-1:0-1/roles: FCP Target, FCP Initiator
rport-1:0-2/roles: FCP Target, FCP Initiator
rport-1:0-3/roles: FCP Target, FCP Initiator
rport-1:0-4/roles: FCP Target, FCP Initiator
rport-1:0-8/roles: FCP Target, FCP Initiator
rport-1:0-9/roles: FCP Target, FCP Initiator
myhost fc_remote_ports $ for i in */supported_classes; do echo -n "$i: "; cat $i; done
rport-0:0-0/supported_classes: unspecified
rport-0:0-10/supported_classes: Class 3
rport-0:0-11/supported_classes: Class 3
rport-0:0-1/supported_classes: Class 3
rport-0:0-2/supported_classes: Class 3
rport-0:0-3/supported_classes: Class 3
rport-0:0-4/supported_classes: Class 3
rport-0:0-8/supported_classes: Class 3
rport-0:0-9/supported_classes: Class 3
rport-1:0-0/supported_classes: unspecified
rport-1:0-10/supported_classes: Class 3
rport-1:0-11/supported_classes: Class 3
rport-1:0-1/supported_classes: Class 3
rport-1:0-2/supported_classes: Class 3
rport-1:0-3/supported_classes: Class 3
rport-1:0-4/supported_classes: Class 3
rport-1:0-8/supported_classes: Class 3
rport-1:0-9/supported_classes: Class 3
myhost fc_remote_ports $ grep tmo /etc/multipath.conf
    #fast_io_fail_tmo     5
myhost fc_remote_ports $ for i in */dev_loss_tmo; do echo -n "$i: "; cat $i; done #default 30 seconds
rport-0:0-0/dev_loss_tmo: 30
rport-0:0-10/dev_loss_tmo: 30
rport-0:0-11/dev_loss_tmo: 30
rport-0:0-1/dev_loss_tmo: 30
rport-0:0-2/dev_loss_tmo: 30
rport-0:0-3/dev_loss_tmo: 30
rport-0:0-4/dev_loss_tmo: 30
rport-0:0-8/dev_loss_tmo: 30
rport-0:0-9/dev_loss_tmo: 30
rport-1:0-0/dev_loss_tmo: 30
rport-1:0-10/dev_loss_tmo: 30
rport-1:0-11/dev_loss_tmo: 30
rport-1:0-1/dev_loss_tmo: 30
rport-1:0-2/dev_loss_tmo: 30
rport-1:0-3/dev_loss_tmo: 30
rport-1:0-4/dev_loss_tmo: 30
rport-1:0-8/dev_loss_tmo: 30
rport-1:0-9/dev_loss_tmo: 30
myhost fc_remote_ports $ for i in */fast_io_fail_tmo; do echo -n "$i: "; cat $i; done #default?
rport-0:0-0/fast_io_fail_tmo: off
rport-0:0-10/fast_io_fail_tmo: 5
rport-0:0-11/fast_io_fail_tmo: 5
rport-0:0-1/fast_io_fail_tmo: off
rport-0:0-2/fast_io_fail_tmo: 5
rport-0:0-3/fast_io_fail_tmo: 5
rport-0:0-4/fast_io_fail_tmo: off
rport-0:0-8/fast_io_fail_tmo: 5
rport-0:0-9/fast_io_fail_tmo: 5
rport-1:0-0/fast_io_fail_tmo: off
rport-1:0-10/fast_io_fail_tmo: 5
rport-1:0-11/fast_io_fail_tmo: 5
rport-1:0-1/fast_io_fail_tmo: 5
rport-1:0-2/fast_io_fail_tmo: 5
rport-1:0-3/fast_io_fail_tmo: off
rport-1:0-4/fast_io_fail_tmo: off
rport-1:0-8/fast_io_fail_tmo: 5
rport-1:0-9/fast_io_fail_tmo: 5
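
The cut -c6-10 pipeline in the session above assumes a single-digit host number, and the sysfs loops print one attribute at a time. Below is a minimal sketch of two helpers that avoid both limits. count_hct and rport_table are made-up names, and the "|" separated output is my own choice of format, not something multipath or the kernel provides.

```shell
#!/bin/bash
# count_hct: hypothetical helper. Reads "multipath -l" style text on stdin
# and prints each host:channel:target combination followed by the number of
# paths that use it. Works for multi-digit host and target numbers.
count_hct() {
  awk '
    {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {   # an H:C:T:L field
          split($i, a, ":")
          n[a[1] ":" a[2] ":" a[3]]++
        }
    }
    END { for (k in n) print k, n[k] }
  ' | sort -t: -k1,1n -k2,2n -k3,3n
}

# rport_table [DIR]: hypothetical helper. Prints one line per remote port:
#   name|scsi_target_id|roles|dev_loss_tmo|fast_io_fail_tmo
# DIR defaults to /sys/class/fc_remote_ports; a missing attribute shows as "?".
rport_table() {
  local dir=${1:-/sys/class/fc_remote_ports} rport attr line
  for rport in "$dir"/rport-*; do
    [ -d "$rport" ] || continue
    line=${rport##*/}
    for attr in scsi_target_id roles dev_loss_tmo fast_io_fail_tmo; do
      line="$line|$(cat "$rport/$attr" 2>/dev/null || echo '?')"
    done
    printf '%s\n' "$line"
  done
}

# Usage on a live host:
#   sudo multipath -l | count_hct
#   rport_table
```

With all attributes on one line per rport, it is easier to spot which targets have fast_io_fail_tmo off and whether those are the host-target combinations that are rarely used.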

