Wednesday, November 5, 2008

Calculating a Multicast Address for Cluster Partitioning

Say you want to deploy multiple instances of an application on multiple machines, all using a common data store.  Further, you want them all to talk to one another to maintain some sort of shared state.  But you also want to have development and test clusters in the same subnet environment.  One common mechanism for cluster membership announcements is IP multicast.  This is one method that can be used by JGroups, and is the default used by Ehcache.

But what if you don't want to have to bother remembering to configure the same/different addresses on each node to keep your test and development clusters independent?  That is just overhead grunt work that is prone to error, and can lead to data corruption if done incorrectly.

A simpler, automated approach is to calculate the multicast address on the fly, based on the configuration information you already have to configure by hand for the common data store.

For example, my company uses the JDBC URL for a MySQL database, combined with a user name and password for accessing it.  These are pieces of information necessary for each application node anyway, so using them to generate the multicast address removes the need to synchronize another configuration parameter.  In the case of Ehcache with Hibernate, the multicast configuration parameters would even need to be in a different file than your main application configuration.  Too messy.
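
For instance (the variable names here are illustrative, not from any particular API), every node might build the hash input like this:
// Illustrative sketch: any stable combination of the shared settings works,
// as long as every node assembles the string exactly the same way.
String config = jdbcUrl + "|" + dbUser + "|" + dbPassword;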

My assumption here will be that this is a cluster inside a private network, so the multicast address will come from the Administratively Scoped address space.  

This technique only varies the final two segments of the IP address.  I'll use the site-local scoped range of 239.255.*.* as my starting point, but any initial two-segment value in the administratively scoped range will work as a base address.

In Java, a String hash code is represented by an int, positive or negative.  The maximum positive value of an int is 2,147,483,647 (Integer.MAX_VALUE).  The address space available for site-local multicast (two variable segments plus a port) is
255 * 255 * 65535 + 255 * 65535 + 65535 = 4,278,190,335 > 2,147,483,647
  (segment 3)        (segment 4)  (port)
So, taking the absolute value of a String hash (with a sanity check for Integer.MIN_VALUE, as Math.abs(Integer.MIN_VALUE) returns Integer.MIN_VALUE - see the JavaDoc) gives a fairly unique value.  Hash codes aren't truly unique, of course - if you examine the String.hashCode() implementation you'll see collisions are possible - but a collision between strings as distinct as two different database configurations is very unlikely.  So unless two of your clusters end up with configuration strings that happen to hash to the same value, this value should work just fine for ensuring unique multicast addresses.

Now for the mapping math.
int hash = "config-with-DBHost/Port/User/Pwd".hashcode();

// hash modulo max port
int port = hash % 65535

//hash divided by max port modulo max segment
int segment4 = hash / 65535 % 255

//hash divided by max port divided by max segment
int segment 3 = hash / 65535 / 255
Put the address together like this:
base.address.segment3.segment4
and pair it with the computed port.
See how easy that was?  The key is to make sure the string you are using for your hash is the same on all nodes you want to share a given multicast address/port combination.  That means if something like database host is part of the string, make sure it isn't "localhost" on one node and a DNS name on all the others - that obviously will result in different hashes.
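
To pull it all together, here is a minimal sketch of the whole calculation as a standalone class (the class and method names, and the "address:port" return format, are my own illustrative choices, not part of any library API):
public final class MulticastConfig {

    // Derives a site-local multicast address and port from a shared
    // configuration string; every node passing the same string gets
    // the same result. Returned as "address:port" for simplicity.
    public static String multicastAddressFor(String config) {
        int hash = config.hashCode();
        // Math.abs(Integer.MIN_VALUE) is still negative, so special-case it
        if (hash == Integer.MIN_VALUE) {
            hash = Integer.MAX_VALUE;
        }
        hash = Math.abs(hash);

        int port = hash % 65535;            // port
        int segment4 = hash / 65535 % 255;  // fourth address segment
        int segment3 = hash / 65535 / 255;  // third address segment

        return "239.255." + segment3 + "." + segment4 + ":" + port;
    }

    public static void main(String[] args) {
        System.out.println(multicastAddressFor("jdbc:mysql://dbhost:3306/app|user|pwd"));
    }
}
Split the result on ':' wherever your cache or group-communication library wants the address and port separately.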

Thursday, October 16, 2008

Daily Remote Backups with Rsync

There are tons of pages on the web devoted to rsync scripts, but I had to pull together info from several sites to do what I wanted, so I'm adding another tutorial for a different basic use case.

My requirements:

  • Hands-off/lights-out operation - no human interaction needed to run it.
  • At least once a day, back up all changed files in a list of directories to a remote server.
  • Be able to install it on multiple machines easily.
  • Don't show a command window while backing up - it might scare my wife.
  • Run on Windows XP Home.
My solution combines rsync, Cygwin/bash, a DOS batch file, and the Windows Task Scheduler. It runs whenever scheduled, in a minimized command window, and synchronizes an arbitrary list of configured directories to a specified root location on a remote server.

Install Cygwin (from cygwin.com), selecting the rsync and openssh packages in the installer. I know, the installer is confusing, but it is free.

On your server, presumably running some flavor of Unix/Linux with rsync available (many online hosting sites that provide shell accounts make this part of their default setup), do the following:
ssh-keygen -t rsa #type Enter to accept the default file name/location
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys
This allows any machine with the newly generated key to log into the current shell account on the remote server without a password. So guard the id_rsa and id_rsa.pub files carefully!

Copy the id_rsa and id_rsa.pub files to the "source" computer - the one with the files you want to back up. Put them in ~/.ssh. You can do this with scp, the ssh-based secure remote copy command in Bash/Cygwin:
mkdir ~/.ssh
scp [user]@[host.domain]:~/.ssh/id_rsa* ~/.ssh
Now you should be able to log in from the local computer to the remote server without entering the password:
ssh [user]@[host.domain]
Create a directory named backup in the primary user's home directory (NOT in My Documents - one level above, in c:\Documents and Settings\[username]).

In this directory, create a file backup.sh with the following contents:
#!/bin/bash
# DOS command: c:\cygwin\bin\bash.exe -c "~/backup/backup.sh"
export PATH=$PATH:/usr/bin/
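# -arzv: archive mode, recurse, compress, verbose; -e ssh: transfer over ssh
# --delete: mirror local deletions on the server
# --no-p/--no-g/--chmod: skip local permissions/group, set sane server-side modes
# --modify-window=1: tolerate Windows/Unix timestamp rounding differences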
rsync -arzve ssh --delete --no-p --no-g --chmod=u+rw,g+r,o-rwx,Dug+rwx --modify-window=1 --files-from ~/backup/files.txt ~/ [user]@[host.domain]:~/[server_backup_root_path]

then make it executable:
chmod a+x backup.sh
This is the heart of the backup. See the rsync man page for option descriptions. Hopefully the bracketed placeholders are self-explanatory - these are the pieces specific to your backup server.

I needed the chmod parameter for my hosting service - it may not be needed for all servers. The --modify-window=1 parameter came from somewhere on the web (I can't find the source now), and was important for seeing the right list of changed files.

The ~/backup/files.txt referenced above is just a plain text file with one relative path per line - relative to the path following the --files-from parameter, in this case the user home directory. The simplest case might be a files.txt simply containing:
My Documents/
If you want to back up your backup script, add "backup/" to the list. If you don't want all of your "My Documents" folder, you can prune the selections by specifying just specific subfolders:
My Documents/My Music/
My Documents/My Pictures/
My Documents/personal/
In all cases, the trailing slash is important - I didn't bother figuring out why, it just is, and that was all I needed to know.

Now the scheduling part. First, we need a DOS batch file we can run minimized. Create "backup.bat" with these contents:
title Backing up files...
C:\cygwin\bin\bash.exe -c "~/backup/backup.sh > ~/backup/backup.log 2>&1"
This is just a cryptic command line to start the bash script from a DOS command prompt, sending all output to backup.log.

Now create a shortcut in the backup directory to the backup.bat file. I called mine "backupShortcut". In its properties, set it to run minimized.

Go to Control Panel > Scheduled Tasks. Right-click the white space in the window, and select New > Scheduled Task. Rename it to "daily backup" or something.

In the properties for the new task, use the following on the "Task" tab:
Run: "c:\Documents and Settings\[user]\backup\backupShortcut.lnk"
Start in:
"c:\Documents and Settings\[user]"
"Run as:" should be set to the current user. Click "Set Password..." to enter the user's login password. This lets the job run even if the user is not logged in or the computer is locked. Yes, this means you can't do this on a computer running Windows XP Home with no password on the primary account. But everyone should have a password, right?

Set the schedule to whatever you want. On my home computers, I do it at 2 AM. On my work laptop, I do it at 2 PM, because it is often suspended at night.

On the "Settings" tab, you may want to change the default length of time before the task is stopped, as the first backup may take a day or two, depending on your upload bandwidth. Comcast cable, with it's pokey 768K upload limit, can be really a dog the first time you back up pictures and music.

That should do it! Run the task manually to make sure a minimized window comes up and stays up. If there are problems, check the backup.log file for error messages.

I left out lots of DOS/bash basics, as this was a quick and dirty post to capture my work. I'll elaborate if anyone needs help.

Saturday, October 4, 2008

Ehcache RMI Address Determination is Error Prone

I had to fix this problem with Ehcache at work. We use it with Hibernate in a clustered environment (more on that in another post). Recently we noticed that sometimes not all peers would discover each other.

The problem exists in all versions from 1.2.4 on, and is still unchanged in the current 1.5 release, in RMICacheManagerPeerListener at line 166. Can you spot the problem?
return InetAddress.getLocalHost().getHostAddress();
Turns out Java doesn't specify exactly what this means on a machine with multiple network interfaces, and the result can vary between machines with similar hardware and operating system (as we found).
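
You can see what your own machine reports with a tiny test program:
public class WhereAmI {
    public static void main(String[] args) throws Exception {
        // prints whatever address Java happens to pick for "the" local host
        System.out.println(java.net.InetAddress.getLocalHost().getHostAddress());
    }
}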

One of our cluster nodes had a setup that returned an IPv6 interface first, the loopback interface second, and an external IPv4 interface third. Java returned the loopback address from the above method.

Discussion of this Java issue can be found in many threads on the web. The solution wasn't available until Java 1.4, but since almost no one has to use anything older any more, it should be available to most applications. You need to implement your own subclass of net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory, overriding doCreateCachePeerListener to return an instance of your own subclass of net.sf.ehcache.distribution.RMICacheManagerPeerListener, which in turn overrides calculateHostAddress to discover the local host address in a more flexible manner. Then register the factory class in your ehcache.xml configuration file.
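
The registration in ehcache.xml then points at your factory instead of the stock one - something like this, where the factory class name is a hypothetical example and the properties are whatever your factory supports:
<cacheManagerPeerListenerFactory
    class="com.example.cache.SmartRMICacheManagerPeerListenerFactory"
    properties="port=40001, socketTimeoutMillis=2000"/>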

The new code can look something like this:
// Imports needed: java.net.InetAddress, java.net.Inet6Address,
// java.net.NetworkInterface, java.net.SocketException,
// java.net.UnknownHostException, java.util.Enumeration
Enumeration<NetworkInterface> interfaces;
try {
    interfaces = NetworkInterface.getNetworkInterfaces();
} catch (SocketException e) {
    throw new UnknownHostException("Error getting network interfaces: ".concat(e.getLocalizedMessage()));
}

if (interfaces == null) {
    throw new UnknownHostException("No network interfaces found");
}

InetAddress addrToUse = null;

while (interfaces.hasMoreElements()) {
    NetworkInterface i = interfaces.nextElement();
    Enumeration<InetAddress> addresses = i.getInetAddresses();
    if (addresses == null) continue;

    while (addresses.hasMoreElements()) {
        InetAddress a = addresses.nextElement();
        // take the first IPv4, non-loopback address we find
        if (addrToUse == null && !(a instanceof Inet6Address) && !a.isLoopbackAddress()) {
            addrToUse = a;
        }
    }
}
if (addrToUse == null) {
    throw new UnknownHostException("No IPv4 non-loopback address found for any interface.");
}
return addrToUse.getHostAddress();
This returns the String representation of the first IPv4, non-loopback address found. The assumption here is that Java will always return the interfaces and addresses in the same order, as long as the machine's network settings haven't changed.
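
If you want to try the lookup logic outside of Ehcache first, it drops easily into a standalone helper (the class and method names here are just for illustration):
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.Enumeration;

public final class HostAddressFinder {

    // Same logic as above: the first IPv4, non-loopback address found.
    public static String findHostAddress() throws UnknownHostException {
        Enumeration<NetworkInterface> interfaces;
        try {
            interfaces = NetworkInterface.getNetworkInterfaces();
        } catch (SocketException e) {
            throw new UnknownHostException("Error getting network interfaces: " + e.getLocalizedMessage());
        }
        while (interfaces != null && interfaces.hasMoreElements()) {
            Enumeration<InetAddress> addresses = interfaces.nextElement().getInetAddresses();
            while (addresses.hasMoreElements()) {
                InetAddress a = addresses.nextElement();
                if (!(a instanceof Inet6Address) && !a.isLoopbackAddress()) {
                    return a.getHostAddress();
                }
            }
        }
        throw new UnknownHostException("No IPv4 non-loopback address found for any interface.");
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(findHostAddress());
    }
}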

If you want to fall back on the loopback address, if found, modify the above code accordingly. I'll leave that as an exercise for the reader (oops, my math background is showing).

With this updated host lookup code, our app didn't need different configuration files for each instance, and all nodes automatically discovered each other, even between machines with multiple network interfaces, such as modern rack servers with multiple NICs and automatic failover between them.

Eventually I'll post about all the pieces we use to run a single Java web application in multiple Tomcat instances on multiple servers with Hibernate and an L2 Ehcache on each one, without the overhead of Tomcat clustering or a full-blown application server. All nodes sharing the same database automatically discover each other and keep their cache contents in sync without any special configuration on individual nodes. Pretty cool.