The following settings can be used for a general cluster configuration. Be EXTREMELY careful about setting high values here: values that are too tolerant can cause a non-detection event in which the cluster does not fail over as expected. Likewise, turning up the sensitivity can trigger false failover events when none is needed:
- If any nodes are currently paused, unpause them all using the failover cluster MMC
- Set the following cluster properties on the passive node(s) from the command line:
cluster /prop SameSubnetDelay=1000
cluster /prop SameSubnetThreshold=10
cluster res ClusterName /priv HostRecordTTL=300
cluster res AnyOtherClusterName /priv HostRecordTTL=300
- Also set the following cluster properties for the individual node via the command line:
cluster /priv HeartBeatLostInterfaceTicks=10:DWORD
cluster /priv HeartBeatLostNodeTicks=10:DWORD
- Stop and start the cluster service on each node to confirm the settings have taken effect
The general settings, and how to set them, are described in the following article:
What has changed for Windows 2008 and the heartbeat network?
Special Note: Dynamic DNS and Windows 2008 Failover Clusters
If DDNS registration still fails intermittently after the changes above, you can try the alternate route and make the following changes:
- Change the cluster DNS registration TTLs to reflect a one-day (86400-second) expiration:
cluster res ClusterName /priv HostRecordTTL=86400
cluster res AnyOtherClusterName /priv HostRecordTTL=86400
- Uncheck "Register this connection's addresses in DNS" in the advanced DNS properties tab on all NICs for all nodes of the cluster
- Choose the "Disable NetBIOS over TCP/IP" option in the advanced WINS properties tab on all NICs for all nodes of the cluster
NOTE - This change makes the DNS registration of the cluster's network name valid for the prescribed time frame. If the IP address is later changed on purpose, you may need to restart the cluster and/or clear the DNS cache so that the latest IP address is picked up. Test this option properly to confirm it is valid in a particular customer's environment.
Fixing 'too many open files' error
Once in a while we stumble upon a 'too many open files' error in log files, especially for Java application servers. It means the process has hit the limit on the number of open file descriptors (fd's, or file handles) assigned to the process or to the user the application server runs as.
Let's use Tomcat as our example.
To get the number of fd's used by Tomcat, we run:
$ pgrep tomcat
12345
$ sudo ls /proc/12345/fd | wc -l
2040
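The same check can be wrapped in a small helper. This is a minimal sketch, assuming a Linux-style /proc layout; the function name `count_fds` and its optional second argument (an alternate proc root, useful for testing) are illustrative and not part of any standard tool.

```shell
# count_fds PID [PROC_ROOT]
# Print the number of entries in PROC_ROOT/PID/fd (PROC_ROOT defaults to /proc).
count_fds() {
    pid=$1
    proc_root=${2:-/proc}
    fd_dir="$proc_root/$pid/fd"
    if [ ! -d "$fd_dir" ]; then
        echo "no such fd directory: $fd_dir" >&2
        return 1
    fi
    # When piped, ls prints one entry per line, so wc -l counts the entries
    ls -A "$fd_dir" | wc -l | tr -d ' '
}

# Example (reading another user's process usually needs root):
#   sudo sh -c '. ./count_fds.sh; count_fds 12345'
```

Comparing this number with the limit reported by `ulimit -n` shows how close the process is to running out of descriptors.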
Don't use `lsof | wc -l` to get the number of fd's. `lsof` lists much more than the fd's in use: its FD column also shows memory-mapped files, the current working directory, the root directory, the text segment of the running process, and so on, none of which consume file descriptors.
To see the fd limit, run `ulimit` as the application user:
$ sudo -u tomcat sh -c 'ulimit -n'
2048
You can also check the soft limit (-S) and the hard limit (-H) separately:
$ sudo -u tomcat sh -c 'ulimit -Hn'
2048
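`ulimit` reports the limits of the shell it runs in, which may differ from those of a server process that is already running. On Linux, the limits actually applied to a process are listed in /proc/<pid>/limits. The helper below is a sketch that pulls the "Max open files" row out of such a file; the name `nofile_limits` is made up for this example.

```shell
# nofile_limits [LIMITS_FILE]
# Print the soft and hard open-file limits from a /proc/<pid>/limits style file.
# Defaults to the limits of the calling process.
nofile_limits() {
    limits_file=${1:-/proc/self/limits}
    # The row looks like: "Max open files   1024   4096   files"
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$limits_file"
}

# Example: nofile_limits /proc/12345/limits
```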
To see the system-wide fd limit:
$ cat /proc/sys/fs/file-max
148922
Or
$ sysctl fs.file-max
fs.file-max = 148922
To find the current system-wide fd usage:
$ cat /proc/sys/fs/file-nr
1184 0 148922
Or
$ sysctl fs.file-nr
fs.file-nr = 1184 0 148922
The first field shows the number of allocated fd's, the second field gives the number of allocated but unused fd's, and the last field shows the maximum number of fd's. In the above example, 1184 fd's are allocated and all of them are in use.
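The arithmetic on the three fields can be sketched as follows: descriptors in use are the allocated count minus the unused count, and dividing by the maximum gives a rough utilisation figure. The function name `fd_usage` and its output format are illustrative only.

```shell
# fd_usage "ALLOCATED UNUSED MAX"
# Take the three fields of /proc/sys/fs/file-nr as one string and print
# the number of fd's in use, the maximum, and the integer percentage used.
fd_usage() {
    # Unquoted on purpose: split the argument on spaces or tabs
    set -- $1
    allocated=$1
    unused=$2
    max=$3
    in_use=$((allocated - unused))
    echo "in_use=$in_use max=$max pct=$((in_use * 100 / max))"
}

# Example: fd_usage "$(cat /proc/sys/fs/file-nr)"
```

With the sample values from this page, `fd_usage "1184 0 148922"` prints `in_use=1184 max=148922 pct=0`, since integer division rounds the roughly 0.8% usage down to 0.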
Once you hit a limit, the next step, obviously, is to increase it a bit. So let's do that.
For example, to change the limit for the tomcat user, append the lines below to /etc/security/limits.conf:
tomcat soft nofile 1024
tomcat hard nofile 3072
We can also put
ulimit -n 3072
in the shell script (startup wrapper script or rc script) that invokes the application server. This changes the limit for that process, because every process spawned from the shell inherits the limit.
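A wrapper script could guard that call so it does not fail when the requested value is above the hard limit. The sketch below is one way to do it, assuming a shell whose `ulimit` builtin accepts `-S`, `-H` and `-n` (bash and dash do); the function name `set_soft_nofile` and the startup path in the comment are placeholders.

```shell
# set_soft_nofile N
# Set the soft open-files limit for this shell and its children,
# clamping N to the hard limit if N is larger.
set_soft_nofile() {
    want=$1
    hard=$(ulimit -Hn)
    # An unprivileged process cannot raise its soft limit past the hard limit
    if [ "$hard" != "unlimited" ] && [ "$want" -gt "$hard" ]; then
        echo "requested $want exceeds hard limit $hard; using $hard" >&2
        want=$hard
    fi
    # -S changes only the soft limit and leaves the hard limit untouched
    ulimit -Sn "$want"
}

# Typical use in a wrapper script (the server path is a placeholder):
#   set_soft_nofile 3072
#   exec /path/to/app-server/bin/startup.sh
```

Using `-S` is a deliberate choice here: in bash, a plain `ulimit -n N` sets both the soft and the hard limit, and an unprivileged process cannot raise a hard limit again once it has been lowered.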
To increase the system-wide limit, change the kernel runtime parameter fs.file-max and save it in /etc/sysctl.conf:
# echo 'fs.file-max = 200000' >> /etc/sysctl.conf
# sysctl -p
# sysctl fs.file-max
fs.file-max = 200000
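Appending with `echo ... >>` adds a second fs.file-max line every time the step is repeated. A minimal sketch of an idempotent alternative follows: it rewrites the key if it is already present and appends it otherwise. The helper name `set_conf_value` is hypothetical, and only dots in the key are escaped, which is enough for typical sysctl names.

```shell
# set_conf_value KEY VALUE FILE
# Ensure FILE contains exactly the setting "KEY = VALUE": rewrite an
# existing KEY line if there is one, otherwise append a new line.
set_conf_value() {
    key=$1
    value=$2
    file=$3
    # Escape dots so "fs.file-max" is matched literally by grep and sed
    pattern=$(printf '%s' "$key" | sed 's/[.]/\\./g')
    if grep -q "^[[:space:]]*$pattern[[:space:]]*=" "$file" 2>/dev/null; then
        # Go through a temp file instead of relying on non-portable sed -i
        tmp=$(mktemp) || return 1
        sed "s|^[[:space:]]*$pattern[[:space:]]*=.*|$key = $value|" "$file" > "$tmp" \
            && cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        echo "$key = $value" >> "$file"
    fi
}

# Example (as root): set_conf_value fs.file-max 200000 /etc/sysctl.conf && sysctl -p
```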
Steps to create a new user:
- Log in to the target server that has access to modify Active Directory
- Shared AD Environment => Log in to the appropriate admin station
- Dedicated AD Environment => Log in to the customer's domain controller
- Click the Start button, navigate to Programs | Administrative Tools, and click Active Directory Users and Computers
- In the left-hand pane of the console tree, right-click the folder in which you want to add a user account. Navigate to New and then click User
- In First name, type the user's first name
- In Last name, type the user's last name
- In Full name, type in the first initial of the first name and the full last name
- In User login name, retype the full name used above
- Click on Next
- Type in a suitable password using a password generator where possible
- Do not check the User must change password at next login check box
- Click on Next and then Finish
- Right-click the newly created user and click Properties
- Click on the Member Of tab
- Click on Add
- Add any security groups that are needed by typing in the security group into the space provided and clicking on Check Names
- Click OK and OK to exit out
Note - The user creation process does not allow UNIX attributes to be edited unless you are in the Domain Admins security group. If UNIX attributes are needed and you are not in that group, you MUST use the command line script on the admin stations in IAD03 OR open a ticket with Operations Engineering. Otherwise, if the account will be a service account or if no UID is required, you are done. If this is for a dedicated domain, use the MMC as described above to create the new user.