How to debug cluster communications in RHEL5?

12:05 PM

RHEL5 Red Hat Cluster Suite (RHCS) nodes will try to fence each other if they lose multicast cluster communication with one another.

With this type of problem, the culprit is almost always a failure or configuration issue with multicast at the network switch.
Environment
Red Hat Cluster Suite
Diagnosing the problem
1. On all cluster nodes, allow the kernel to answer ICMP echo requests sent to broadcast and multicast addresses:
echo 0 >/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
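
A minimal sketch of the same step that also remembers the previous setting, so it can be put back once testing is done. The helper name set_icmp_bcast_ignore and its optional second argument are hypothetical, added here only so the function can be tried against a scratch file; writing the real /proc file requires root.

```shell
# Hypothetical helper: write a new value to the ignore-broadcasts flag and
# print the value it held before, so the caller can restore it afterwards.
#   $1 = new value (0 = answer broadcast/multicast pings, 1 = ignore them)
#   $2 = file to write (assumption: defaults to the real kernel setting)
set_icmp_bcast_ignore() {
    new=$1
    file=${2:-/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts}
    old=$(cat "$file") || return 1      # remember the current value
    echo "$new" > "$file" || return 1   # apply the new value
    echo "$old"                         # report the previous value
}

# Usage (as root, on every cluster node):
#   previous=$(set_icmp_bcast_ignore 0)
#   ... run the multicast ping test ...
#   set_icmp_bcast_ignore "$previous" >/dev/null
```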

2. On all cluster nodes run "cman_tool status" and note:

A. Number of cluster nodes.
B. The cluster multicast address:

[root@node5 ~]# cman_tool status
Version: 6.1.0
Config Version: 8
Cluster Name: rhel_cluster
Cluster Id: 31412
Cluster Member: Yes
Cluster Generation: 180
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Quorum: 1
Active subsystems: 9
Flags: 2node Dirty
Ports Bound: 0 11 177
Node name: node5
Node ID: 5
Multicast addresses: 239.192.122.47
Node addresses: 9.47.65.165
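
The two values to note can also be pulled out of the status output automatically. The following is a sketch only, assuming the output keeps the "Label: value" layout shown above; parse_cman_status is a hypothetical helper name.

```shell
# Hypothetical helper: reads `cman_tool status` output on stdin and prints
# "<number of nodes> <multicast address>" on one line.
parse_cman_status() {
    awk -F': *' '
        { sub(/[ \t\r]+$/, "") }                  # drop trailing whitespace
        $1 == "Nodes"               { nodes = $2 }
        $1 == "Multicast addresses" { mcast = $2 }
        END { print nodes, mcast }
    '
}

# Usage on a cluster node:
#   cman_tool status | parse_cman_status
```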

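3. Do a 2-count multicast ping to the multicast address noted in step 2, sending from the interface that carries cluster traffic (bond0 in this example):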

ping -c 2 -I bond0 -b -L 239.192.122.47

4. If there are no problems then you should see multicast responses from all other cluster nodes:

PING 239.192.122.47 (239.192.122.47) from 9.47.65.165 bond0: 56(84) bytes of data.
64 bytes from 9.47.65.166: icmp_seq=1 ttl=64 time=1.23 ms
64 bytes from 9.47.65.166: icmp_seq=2 ttl=64 time=2.28 ms

--- 239.192.122.47 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 1.230/1.755/2.281/0.527 ms
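
To check that every other node answered, the distinct responder addresses can be counted and compared with the node count from step 2 minus one. This is a rough sketch that assumes reply lines look like the "64 bytes from ..." lines above; count_ping_responders is a hypothetical helper name.

```shell
# Hypothetical helper: reads ping output on stdin and prints how many
# distinct hosts sent at least one reply.
count_ping_responders() {
    awk '
        / bytes from / {
            host = $4
            sub(/:$/, "", host)          # "9.47.65.166:" -> "9.47.65.166"
            if (!(host in seen)) { seen[host] = 1; n++ }
        }
        END { print n + 0 }              # prints 0 when nobody replied
    '
}

# Usage: with "Nodes: 2" in the status output, expect 1 responder.
#   ping -c 2 -I bond0 -b -L 239.192.122.47 | count_ping_responders
```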

5. If you don't get the proper responses, then:

A. Make sure that you previously ran the following on ALL the cluster nodes:

echo 0 >/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts

B. If you confirmed that Step 5A was done, then either a node or the network
switch is misconfigured, or there is a hardware problem such as a bad
network cable.

6. When testing is finished, on all cluster nodes make the kernel ignore broadcast and multicast ICMP echo requests again:

echo 1 >/proc/sys/net/ipv4/icmp_echo_ignore_broadcasts

Alternative Method

'omping' can be started on both nodes, taking the IP of the remote node and the node's own IP as parameters.

Reference: How do I test multicast?

Run the same command on both nodes at the same time, as described in the reference above, to properly check multicast between the nodes.
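
As a rough illustration of the omping approach, the wrapper below refuses to start unless at least two addresses are given, then passes the full list to omping. The name run_omping is hypothetical, and the sketch assumes that omping is installed and that it accepts the list of node addresses (local one included) as plain arguments.

```shell
# Hypothetical wrapper: require the local address plus at least one remote
# address, then hand the whole list to omping unchanged.
run_omping() {
    if [ "$#" -lt 2 ]; then
        echo "usage: run_omping <node1> <node2> [...]" >&2
        return 2
    fi
    omping "$@"
}

# Usage (same command on both nodes, started at about the same time),
# with the two node addresses from the example cluster:
#   run_omping 9.47.65.165 9.47.65.166
```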
