How to debug cluster communications in RHEL5?
RHEL5 RHCS cluster nodes will try to fence each other if they lose multicast communication with one another.
In these cases the culprit is almost always a failure or misconfiguration of multicast at the network switch.
Environment
Red Hat Cluster Suite
Diagnosing the problem
1. On all cluster nodes, allow replies to ICMP broadcast and multicast echo requests:
echo 0 >/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
2. On all cluster nodes run "cman_tool status" and note:
A. Number of cluster nodes.
B. The cluster multicast address:
[root@node5 ~]# cman_tool status
Version: 6.1.0
Config Version: 8
Cluster Name: rhel_cluster
Cluster Id: 31412
Cluster Member: Yes
Cluster Generation: 180
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Quorum: 1
Active subsystems: 9
Flags: 2node Dirty
Ports Bound: 0 11 177
Node name: node5
Node ID: 5
Multicast addresses: 239.192.122.47
Node addresses: 9.47.65.165
3. Send a two-packet multicast ping to that address from the cluster interface (bond0 in this example):
ping -c 2 -I bond0 -b -L 239.192.122.47
4. If there are no problems then you should see multicast responses from all other cluster nodes:
PING 239.192.122.47 (239.192.122.47) from 9.47.65.165 bond0: 56(84) bytes of data.
64 bytes from 9.47.65.166: icmp_seq=1 ttl=64 time=1.23 ms
64 bytes from 9.47.65.166: icmp_seq=2 ttl=64 time=2.28 ms
--- 239.192.122.47 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 1.230/1.755/2.281/0.527 ms
5. If you don't get the proper responses, then:
A. Make sure that you previously ran the following on ALL the cluster nodes:
echo 0 >/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
B. If you have confirmed that Step 5A was done, then either a node or the network
switch is misconfigured, or there is a hardware problem such as a bad
network cable.
6. When testing is finished, set all cluster nodes back to ignoring ICMP broadcasts:
echo 1 >/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
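Steps 2 through 4 can be partly scripted. The following is a minimal sketch, not part of RHCS: the helper names mcast_addr, node_count and responders are hypothetical, and it assumes that "cman_tool status" reports a single multicast address in the format shown above. The helpers only parse text on stdin, so they do not change any system settings.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not RHCS commands.

# Print the multicast address from `cman_tool status` output on stdin.
# Assumes a single address on the "Multicast addresses:" line.
mcast_addr() {
    awk -F': *' '/^Multicast addresses:/ { print $2; exit }'
}

# Print the node count from `cman_tool status` output on stdin.
node_count() {
    awk -F': *' '/^Nodes:/ { print $2; exit }'
}

# Count the distinct source addresses that replied in ping output on stdin.
responders() {
    awk '/bytes from/ { sub(/:$/, "", $4); seen[$4] = 1 }
         END { n = 0; for (a in seen) n++; print n }'
}

# Usage (as root, on a cluster node, after Step 1):
#   addr=$(cman_tool status | mcast_addr)
#   ping -c 2 -I bond0 -b -L "$addr" | responders
```

Per Step 4, every other cluster node should answer, so a healthy result is a responder count equal to the node count minus one.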
Alternative Method
'omping' can be started on both nodes, taking the IP address of the remote node and the node's own IP address as parameters.
Reference: How do I test multicast?
To check multicast between the nodes properly, run the same command on both nodes at the same time, as described in the reference above.
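As an illustration of "the same command on both nodes", here is a small sketch. The wrapper run_omping and the DRY_RUN switch are hypothetical conveniences, and the two addresses are simply the ones that appear in the sample output earlier in this article; substitute your own node addresses.

```shell
#!/bin/sh
# Sketch only: run_omping and DRY_RUN are hypothetical, not part of omping.
# The address list must be identical on every node.
NODES="9.47.65.165 9.47.65.166"

run_omping() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print the command instead of running it, to compare across nodes.
        echo omping $NODES
    else
        # $NODES is left unquoted on purpose so each address is one argument.
        omping $NODES
    fi
}

# Usage, on each node at roughly the same time:
#   run_omping
```

Keeping the node list in one variable makes it easy to confirm that every node is started with identical arguments.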