
How to configure I/O fencing on Veritas cluster?

I/O Fencing

I/O fencing is required to protect against data corruption in a shared-storage cluster environment. In normal VCS operation, the cluster nodes exchange heartbeat packets over the private interconnects (LLT links), so each system knows that the other machines are alive and what the service group status is (Picture 1.0). I/O fencing behaves the same way on a system failure and on a cluster interconnect failure: it does not guess which of the two has occurred. Instead, the I/O fencing driver races for control of the coordinator disks to form a valid membership. Nodes that have departed from the cluster membership are not allowed access to the data disks until they rejoin the normal GAB and fencing membership, after which the service groups are started on those nodes.

In a multi-node cluster, the I/O fencing algorithm is designed to give priority to the larger subcluster in any arbitration scenario. For example, if a single node is separated from an 8-node cluster due to an interconnect fault, the remaining 7-node subcluster should continue to run. The fencing driver uses the concept of a majority cluster: the fencing algorithm determines whether the number of nodes remaining in the cluster is greater than or equal to the number of departed nodes. If so, the larger subcluster is considered the majority cluster, and it begins racing immediately for control of the coordinator disks on any membership change.

[Picture 1.0: VCS heartbeat communication in normal operation]
 
Once a system failure is detected, the heartbeat between the systems stops and VCS initiates failover of the service group to the surviving machine (Picture 1.1).
[Picture 1.1: Service group failover after a system failure]
I/O Fencing Service/Driver:
I/O fencing uses GAB port b for communication. Fencing is started with the vxfenconfig -c command. The fencing driver vxfen is started during system startup using /etc/rc2.d/S97vxfen.
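To verify that the fencing driver has registered with GAB, check the port memberships. A sample from a two-node cluster is below; the generation numbers will differ on your systems. Port a is the GAB membership, port b is the fencing membership and port h is the VCS engine (had).

# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   a36e0003 membership 01
Port b gen   a36e0006 membership 01
Port h gen   a36e0009 membership 01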

Configuration Files for I/O fencing:
1. /etc/vxfentab  - lists the coordinator disk paths (regenerated when the fencing driver starts)
2. /etc/vxfendg   - holds the coordinator diskgroup name
3. /etc/vxfenmode - defines the fencing mode and disk policy

I/O Fencing Implementation on Existing Clusters

1. Configure the coordinator diskgroup using SCSI-3 Persistent Reservation (PR) capable LUNs. (Most SAN arrays support SCSI-3 PR.) The coordinator diskgroup must contain an odd number of disks, typically three.
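A minimal sketch of the diskgroup creation, assuming three SCSI-3 PR capable devices named disk_1, disk_2 and disk_3 (the device names are placeholders; substitute your own):

# vxdisksetup -i disk_1
# vxdisksetup -i disk_2
# vxdisksetup -i disk_3
# vxdg init fendg disk_1 disk_2 disk_3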

2. Set the coordinator flag on the diskgroup:

# vxdg -g fendg set coordinator=on
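To confirm that the flag has been applied, list the diskgroup details and look for the coordinator flag (sample output fragment; the remaining flags depend on your configuration):

# vxdg list fendg | grep flags
flags:       cds coordinator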

3. Fencing supports DMP for both coordinator and data disks. Set the disk policy in /etc/vxfenmode.
Perform the steps below on each cluster node.

# mv  /etc/vxfenmode  /etc/vxfenmode.previous
# cp  /etc/vxfen.d/vxfenmode_scsi3_dmp /etc/vxfenmode
# cat /etc/vxfenmode
vxfen_mode=scsi3
scsi3_disk_policy=dmp

4. Deport the diskgroup, then import it with the "-t" option (which turns off automatic import at system startup) and deport it again:

# vxdg deport fendg
# vxdg -t import fendg
# vxdg deport fendg
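After the final deport, fendg should no longer appear in the list of imported diskgroups. Sample output, where datadg is a placeholder for any other diskgroup you may have imported:

# vxdg list
NAME         STATE           ID
datadg       enabled         1287307142.14.arena1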

5. Create /etc/vxfendg on all cluster nodes:

# echo "fendg" > /etc/vxfendg

6. Test the diskgroup for I/O fencing compatibility.
The "-r" option runs the test in non-destructive (read-only) mode.

Arena-Node2# /opt/VRTSvcs/vxfen/bin/vxfentsthdw -r -g fendg

VERITAS vxfentsthdw version 5.1 Solaris


The utility vxfentsthdw works on the two nodes of the cluster.
The utility verifies that the shared storage one intends to use is
configured to support I/O fencing.  It issues a series of vxfenadm
commands to setup SCSI-3 registrations on the disk, verifies the
registrations on the disk, and removes the registrations from the disk.

The logfile generated for vxfentsthdw is /var/VRTSvcs/log/vxfen/vxfentsthdw.log.11074

Enter the first node of the cluster:
node1
Enter the second node of the cluster:
node2

********************************************

Testing node1 /dev/vx/rdmp/disk_4s2 node2 /dev/vx/rdmp/disk_7s2

Evaluate the disk before testing  ........................ No Pre-existing keys
RegisterIgnoreKeys on disk /dev/vx/rdmp/disk_4s2 from node node1 ....... Passed
Verify registrations for disk /dev/vx/rdmp/disk_4s2 on node node1 ...... Passed
RegisterIgnoreKeys on disk /dev/vx/rdmp/disk_7s2 from node node2 ....... Passed
Verify registrations for disk /dev/vx/rdmp/disk_7s2 on node node2 ...... Passed
Unregister keys on disk /dev/vx/rdmp/disk_4s2 from node node1 .......... Passed
Verify registrations for disk /dev/vx/rdmp/disk_7s2 on node node2 ...... Failed

Unregistration test for disk  failed on node node2.
         Unregistration from one node is causing unregistration of keys from the other node.
        Disk  is not SCSI-3 compliant on node node2.
        Execute the utility vxfentsthdw again and if failure persists contact
        the vendor for support in enabling SCSI-3 persistent reservations


Removing test keys and temporary files, if any...
The above test should pass in order to use the diskgroup as the coordinator DG.
Note: As the above results show, my test failed due to a SCSI-3 PR issue.
7. Start the fencing driver using the startup script on all cluster nodes:
# /sbin/vxfen-startup

Note: /etc/vxfentab will be regenerated whenever the fencing driver re-initializes.
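A sample of the generated file, assuming the placeholder coordinator disk names used earlier (your device paths will differ):

# cat /etc/vxfentab
/dev/vx/rdmp/disk_4s2
/dev/vx/rdmp/disk_5s2
/dev/vx/rdmp/disk_6s2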

8. Save and close the cluster configuration. The command dumps the running configuration to disk and makes it read-only cluster-wide, so it only needs to be run from one node:

# haconf -dump -makero
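You can confirm that the configuration is now read-only by querying the cluster attribute; the expected value is 1:

# haclus -value ReadOnly
1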

9. Perform the offline configuration (i.e. modify main.cf using the vi editor)
to add the "UseFence = SCSI3" line inside the cluster definition:

cluster UNIXARENA (
        UserNames = { admin = "ABCDGFRFSLK." }
        Administrators = { admin }
        ClusterAddress = "192.168.2.7"
        UseFence = SCSI3
        )
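Before restarting VCS, it is good practice to verify the syntax of the edited configuration. The command prints nothing when the configuration is valid:

# hacf -verify /etc/VRTSvcs/conf/config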

10. Stop VCS on all cluster nodes. The command is cluster-wide and can be run from any one node:

# hastop -all

11. Restart VCS on all the cluster nodes (run hastart on each node):

# hastart
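Once had is up on every node, confirm that both systems reach the RUNNING state (sample output, trimmed to the system-state section):

# hastatus -sum
-- SYSTEM STATE
-- System               State                Frozen

A  arena1               RUNNING              0
A  arena2               RUNNING              0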

12. Display the fencing membership status:

# vxfenadm -d
I/O Fencing Cluster Information:
================================
Fencing Protocol Version: 201
Fencing Mode: SCSI3
Fencing SCSI3 Disk Policy: dmp
Cluster Members:
* 0 (arena1)
  1 (arena2)
RFSM State Information:
node 0 in state 8 (running)
node 1 in state 8 (running)
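To see the actual SCSI-3 keys registered on the coordinator disks, you can read them back through the fencing driver; the command below assumes VCS 5.1, where vxfenadm supports the -s option. Each coordinator disk should report one key per cluster node.

# vxfenadm -s all -f /etc/vxfentab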


For your information, we can also configure I/O fencing using the VCS installer CLI option:

# ./installvcs -fencing

Thank you for reading.
