Monitoring JBoss
The Firepond application uses JBoss groups to cluster services within JBoss across servers. Occasionally a server disconnects from a group, and if it does not reconnect, the application runs into problems. Firepond asked Opsource for assistance in detecting when a member of a JBoss group fails to reconnect.
Requirements
Be able to detect when a server has not rejoined a JBoss group and generate an alert.
Deliverables
Develop a script that would:
- detect when a group member has disconnected from the group
- alert C3 of the failure
Materials
JBoss comes with a script called probe.bat, which communicates with the groups via multicast. The output of the script shows the groups, the members each group believes are connected, and the actual connections to the group.
ActivePerl was installed on two staging boxes (iadfirewa01s and iadfirewa02s). The monitoring script was written in Perl.
Workflow
- Log on to the appropriate server and install ActiveState Perl. No additional Perl modules are required.
- Copy the probe.pl and probewrapper.bat scripts (attached to this page) to the C:\NOCpulse\bin directory on the server.
- Run the probe.pl command from the command prompt to verify it works.
- Create a "General : Remote Program with data" type check in NOCpulse. Set the command to "probe.pl" and set the "critical if above" threshold to 0 (zero).
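A "critical if above" threshold of 0 implies that the check reports a number that is zero when all is well, presumably the count of missing group members. The following is a minimal sketch of that comparison, not the actual probe.pl logic. It assumes the probe output has already been parsed into a hash of group name to 'IP:port' entries. The %observed sample data and the missing_members name are hypothetical.

```perl
use strict;
use warnings;

# Known-good membership, in the same shape the script's arrays use.
my %expected = (
    'DefaultPartition' => [ qw(10.100.44.21:7800 10.100.44.22:7800) ],
    'Tomcat-Cluster'   => [ qw(10.100.44.21:7810 10.100.44.22:7810) ],
);

# Hypothetical result of parsing one probe run (not real probe output).
my %observed = (
    'DefaultPartition' => [ qw(10.100.44.21:7800 10.100.44.22:7800) ],
    'Tomcat-Cluster'   => [ qw(10.100.44.21:7810) ],
);

# Return one "group member" string for every expected member not observed.
sub missing_members {
    my ($expected, $observed) = @_;
    my @missing;
    for my $group (sort keys %$expected) {
        my %seen = map { $_ => 1 } @{ $observed->{$group} || [] };
        for my $member (@{ $expected->{$group} }) {
            push @missing, "$group $member" unless $seen{$member};
        }
    }
    return @missing;
}

my @missing = missing_members(\%expected, \%observed);

# The count is the value compared against the threshold:
# 0 means every expected member is present, anything above 0 is critical.
print scalar(@missing), "\n";
print STDERR "missing: $_\n" for @missing;
```

With the sample data above, one member of Tomcat-Cluster is absent, so the printed count is 1 and the check would go critical.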
Modifying the Script
Adding or removing servers from monitored groups is a simple script edit. Each "super group" (Production servers and Symantec servers) has two data structures. The first is an array listing the IP addresses of all servers that belong to the super group. The second is a hash keyed by sub-group name (DefaultPartition, Tomcat-Cluster, OSCacheBus, etc.) whose values list the IP addresses and ports of the servers that belong to that sub-group. For example:
    # these arrays hold the known-good make-up of each group
    my @production_servers = qw(10.100.44.21 10.100.44.22 10.100.45.21
                                10.100.45.22 10.100.45.23 10.100.45.24);
    my %production_members = (
        'DefaultPartition'   => [ qw(10.100.44.21:7800 10.100.44.22:7800) ],
        'Tomcat-Cluster'     => [ qw(10.100.44.21:7810 10.100.44.22:7810) ],
        'OSCacheBus'         => [ qw(10.100.44.21:7851 10.100.44.22:7851
                                     10.100.45.21:7851 10.100.45.23:7851
                                     10.100.45.24:7851) ],
        'OnDemandProduction' => [ qw(10.100.44.21:7850 10.100.44.22:7850
                                     10.100.45.21:7850 10.100.45.21:7858
                                     10.100.45.22:7858 10.100.45.23:7850
                                     10.100.45.24:7850) ],
    );
    my @symantec_servers = qw(10.100.44.24 10.100.44.25 10.100.45.29 10.100.45.30
                              10.100.45.32 10.100.45.33 10.100.45.35);
    my %symantec_members = (
        'DefaultPartition'    => [ qw(10.100.44.24:7800 10.100.44.25:7800) ],
        'Tomcat-Cluster'      => [ qw(10.100.44.24:7810 10.100.44.25:7810) ],
        'OSCacheBus'          => [ qw(10.100.44.24:7851 10.100.44.25:7851
                                      10.100.45.30:7851 10.100.45.30:7853
                                      10.100.45.33:7851) ],
        'Symantec-Production' => [ qw(10.100.44.24:7850 10.100.44.25:7850
                                      10.100.45.29:7858 10.100.45.30:7850
                                      10.100.45.30:7852 10.100.45.32:7858
                                      10.100.45.33:7850 10.100.45.35:7858) ],
    );
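When removing a server, it helps to see every sub-group entry that mentions its IP address before editing. The sketch below walks a members hash of the shape shown above and lists the matches. The entries_for_ip helper is a hypothetical name and is not part of probe.pl. The hash is shortened to three of the Production sub-groups.

```perl
use strict;
use warnings;

# Subset of the Production sub-groups, copied from the script's layout.
my %production_members = (
    'DefaultPartition' => [ qw(10.100.44.21:7800 10.100.44.22:7800) ],
    'Tomcat-Cluster'   => [ qw(10.100.44.21:7810 10.100.44.22:7810) ],
    'OSCacheBus'       => [ qw(10.100.44.21:7851 10.100.44.22:7851
                               10.100.45.21:7851 10.100.45.23:7851
                               10.100.45.24:7851) ],
);

# Return a "group (port N)" string for every entry that uses the given IP.
sub entries_for_ip {
    my ($members, $ip) = @_;
    my @found;
    for my $group (sort keys %$members) {
        for my $entry (@{ $members->{$group} }) {
            my ($entry_ip, $port) = split /:/, $entry;
            push @found, "$group (port $port)" if $entry_ip eq $ip;
        }
    }
    return @found;
}

# Prints: OSCacheBus (port 7851)
print "$_\n" for entries_for_ip(\%production_members, '10.100.45.21');
```

Each line printed is one 'IP:port' entry to delete from the hash, in addition to removing the address from the server array.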
To add a server to the Symantec groups, first add its IP address to the @symantec_servers array: place it between the parentheses, separated from the other addresses by a space. Next, run the probe.bat command and find the port that the server uses to connect to each sub-group it belongs to. Then add the IP address and port, in 'IP:port' format, between the parentheses of the matching sub-group, again separated from the other entries by a space.
If we wanted to add the IP address 999.999.999.999 to the Symantec 'Tomcat-Cluster' group, and we found that it uses port 5555 by looking at the output of the probe command, the resulting arrays would look like this:
    my @symantec_servers = qw(10.100.44.24 10.100.44.25 10.100.45.29 10.100.45.30
                              10.100.45.32 10.100.45.33 10.100.45.35
                              999.999.999.999);
    my %symantec_members = (
        'DefaultPartition'    => [ qw(10.100.44.24:7800 10.100.44.25:7800) ],
        'Tomcat-Cluster'      => [ qw(10.100.44.24:7810 10.100.44.25:7810
                                      999.999.999.999:5555) ],
        'OSCacheBus'          => [ qw(10.100.44.24:7851 10.100.44.25:7851
                                      10.100.45.30:7851 10.100.45.30:7853
                                      10.100.45.33:7851) ],
        'Symantec-Production' => [ qw(10.100.44.24:7850 10.100.44.25:7850
                                      10.100.45.29:7858 10.100.45.30:7850
                                      10.100.45.30:7852 10.100.45.32:7858
                                      10.100.45.33:7850 10.100.45.35:7858) ],
    );
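Because the edit touches two structures, it is easy to update one and forget the other. The sketch below is a sanity check that could be run after editing. It flags any 'IP:port' entry whose IP address is missing from the server array. The unlisted_members helper is a hypothetical name, and whether probe.pl performs a similar check itself is not stated here. The data is shortened, and the placeholder address is deliberately left out of the server array to show a failure.

```perl
use strict;
use warnings;

# Return "group entry" for every 'IP:port' whose IP is not in the server list.
sub unlisted_members {
    my ($servers, $members) = @_;
    my %known = map { $_ => 1 } @$servers;
    my @unlisted;
    for my $group (sort keys %$members) {
        for my $entry (@{ $members->{$group} }) {
            my ($ip) = split /:/, $entry;
            push @unlisted, "$group $entry" unless $known{$ip};
        }
    }
    return @unlisted;
}

# Shortened example: the new server was added to the sub-group
# but not to the server array.
my @symantec_servers = qw(10.100.44.24 10.100.44.25);
my %symantec_members = (
    'Tomcat-Cluster' => [ qw(10.100.44.24:7810 10.100.44.25:7810
                             999.999.999.999:5555) ],
);

# Prints: not in server list: Tomcat-Cluster 999.999.999.999:5555
print "not in server list: $_\n"
    for unlisted_members(\@symantec_servers, \%symantec_members);
```

Once the address is also added to @symantec_servers, the helper returns an empty list and nothing is printed.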