How do you start a VCS cluster if it does not start automatically after a server reboot? Have you ever faced this issue? If not, let's see how to fix this kind of problem on Veritas Cluster. I have asked this question in Solaris interviews, but most candidates fail to impress me and answer with things unrelated to VCS. If you know the basics of Veritas Cluster, it is easy to troubleshoot in real time and easy to explain in interviews too.
Two nodes are clustered with Veritas Cluster and you have rebooted one of the servers. The rebooted node has come up, but VCS (the HAD daemon) has not started. You try to start the cluster with the “hastart” command, but it does not work. How do you troubleshoot?
Here we go.
1. Check the cluster status after the server reboot using the “hastatus” command.
# hastatus -sum |head
Cannot connect to VCS engine
2. Try to start the cluster using hastart. No luck? Still getting the same message as above? Proceed to step 3.
3. Check the LLT and GAB services. If either is in the disabled state, enable it with “svcadm enable”.
[root@UA~]# svcs -a |egrep "llt|gab"
online         Jun_27   svc:/system/llt:default
online         Jun_27   svc:/system/gab:default
[root@UA~]#
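The check in step 3 can be sketched as a small filter. This is only an illustration: the function name `llt_gab_not_online` is made up here, and it assumes the three-column `svcs -a` layout (state, start time, FMRI) shown above.

```shell
# Print the FMRI of every LLT/GAB service whose state is not "online".
# Reads `svcs -a` style lines (STATE STIME FMRI) on stdin.
# Hypothetical helper, not part of VCS or Solaris.
llt_gab_not_online() {
    awk '$3 ~ /^svc:\/system\/(llt|gab):/ && $1 != "online" { print $3 }'
}

# Possible use on a live node (not executed here):
#   svcs -a | llt_gab_not_online | while read -r fmri; do
#       svcadm enable "$fmri"
#   done
```

A service in the maintenance state usually needs its underlying problem fixed and an `svcadm clear` rather than an enable, so treat the loop above as a starting point only.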
4. Check the LLT (heartbeat) status. Here the LLT links look good.
[root@UA ~]# lltstat -nvv |head
LLT node information:
    Node                 State    Link  Status  Address
     0 UA2               OPEN
                                  HB1   UP      00:91:28:99:74:89
                                  HB2   UP      00:91:28:99:74:BF
   * 1 UA                OPEN
                                  HB1   UP      00:71:28:9C:2E:0F
                                  HB2   UP      00:71:28:9C:2F:9F
[root@UA ~]#
5. If LLT is down, try the “lltconfig -c” command to configure the private links. If the LLT links still have issues, work with the network team to fix the heartbeat links.
6. Check the GAB status using the “gabconfig -a” command.
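To spot a failed heartbeat link quickly, the `lltstat -nvv` output can be filtered for links reported as DOWN. The sketch below is an assumption-laden example: `llt_links_down` is a hypothetical name, and it assumes each node row is followed by its link rows, with the link name in the first column and the status in the second.

```shell
# Print "<node> <link>" for every LLT link whose status is DOWN.
# Reads `lltstat -nvv` output on stdin. Hypothetical helper.
llt_links_down() {
    awk '
        $1 == "*"        { node = $3; next }   # local node row:  "* 1 UA OPEN"
        $1 ~ /^[0-9]+$/  { node = $2; next }   # other node row:  "0 UA2 OPEN"
        $2 == "DOWN"     { print node, $1 }    # link row:        "HB2 DOWN ..."
    '
}

# Possible use on a live node (not executed here):
#   lltstat -nvv | llt_links_down
```

No output means no link was reported DOWN in the rows that were read.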
[root@UA ~]# gabconfig -a
GAB Port Memberships
===============================================================
[root@UA ~]#
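7. As the output above shows, the membership is not seeded. We have to seed it manually using the gabconfig command. Seed manually only when you are sure the other node is not already running as a separate cluster; otherwise you risk a split brain.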
[root@UA ~]# gabconfig -cx
[root@UA ~]#
8. Check the GAB status now.
[root@UA ~]# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 6d0607 membership 01
[root@UA ~]#
The output above indicates that GAB (Port a) is online on both nodes (0 and 1). To find out which node is “0” and which is “1”, refer to the /etc/llthosts file.
9. Try to start the cluster using the hastart command. It should work now.
10. Check the membership status using gabconfig.
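The lookup from membership digits to node names can be sketched in shell. Both function names below are hypothetical, and the mapping assumes every node ID is a single digit and that the llthosts file holds one "ID name" pair per line.

```shell
# Print the membership string for one GAB port ("a", "h", ...).
# Reads `gabconfig -a` output on stdin; prints nothing if the port is absent.
gab_membership() {
    awk -v port="$1" \
        '$1 == "Port" && $2 == port && $5 == "membership" { print $6 }'
}

# Translate a membership string such as "01" into node names.
# $1 = membership string, $2 = llthosts-style file ("<id> <name>" per line).
# Assumption: node IDs are single digits (0-9).
membership_names() {
    awk -v members="$1" \
        'length($1) == 1 && index(members, $1) > 0 { print $2 }' "$2"
}

# Possible use on a live node (not executed here):
#   membership_names "$(gabconfig -a | gab_membership a)" /etc/llthosts
```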
[root@UA ~]# gabconfig -a
GAB Port Memberships
===============================================================
Port a gen 6d0607 membership 01
Port h gen 6d060b membership 01
[root@UA ~]#
The output above indicates that HAD (Port h) is online on both nodes (0 and 1).
11. Check the cluster status using the hastatus command. The system should be back in business.
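The three `gabconfig -a` outputs seen so far (empty, Port a only, Port a plus Port h) map to three stages of the startup. A rough classifier, with the hypothetical name `gab_state` and made-up labels, could look like this:

```shell
# Classify `gabconfig -a` output read from stdin:
#   unseeded    - no "Port a" membership line (GAB not seeded)
#   gab_only    - Port a present, Port h absent (HAD not running)
#   had_running - both Port a and Port h present
# Hypothetical helper; the labels are illustrative only.
gab_state() {
    awk '
        $1 == "Port" && $5 == "membership" { seen[$2] = 1 }
        END {
            if (!("a" in seen))      print "unseeded"
            else if (!("h" in seen)) print "gab_only"
            else                     print "had_running"
        }
    '
}

# Possible use on a live node (not executed here):
#   gabconfig -a | gab_state
```

"unseeded" points at step 7, "gab_only" at step 9, and "had_running" means hastatus should now respond.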
[root@UA ~]# hastatus -sum |head
-- SYSTEM STATE
-- System               State                Frozen
A  UA2                  RUNNING              0
A  UA                   RUNNING              0
-- GROUP STATE
-- Group           System               Probed     AutoDisabled    State
B  ClusterService  UA                   Y          N               ONLINE
B  ClusterService  UA2                  Y          N               OFFLINE
[root@UA ~]#
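For a quick pass/fail on the final check, the system rows of `hastatus -sum` can be filtered for anything that is not RUNNING. The helper name `systems_not_running` is hypothetical, and the sketch assumes system rows begin with "A" as in the output above.

```shell
# Print "<system> <state>" for every cluster system that is not RUNNING.
# Reads `hastatus -sum` output on stdin; system rows start with "A".
# Hypothetical helper, not part of VCS.
systems_not_running() {
    awk '$1 == "A" && $3 != "RUNNING" { print $2, $3 }'
}

# Possible use on a live node (not executed here):
#   hastatus -sum | systems_not_running
```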
This is a very small thing, but many VCS beginners fail to fix these start-up issues. In interviews, too, they are not able to say: “If HAD does not start with the hastart command, I will check the LLT and GAB services and fix any issues with them. Then I will start the cluster using hastart.” Any interviewer will expect this answer.
Hope this article is informative to you.