Monday, December 27, 2010

Java DNS Cache

This one is for the record - an issue I had faced and meant to document, so here goes.

From our app (running on Java 1.4.2_19), we connect to a GSS load-balanced URL which operates in a round-robin fashion. For example, if the URL is http://www.twitter.com, which maps to 128.242.240.148/116/20, the app should get a different IP each time, or at least most of the time. Instead, it sticks to the first IP it discovers. The reason is Java's DNS cache.

Java caches the result of a DNS lookup after the app first accesses a URL, so that any subsequent operation uses the cached value. In our case, if www.twitter.com resolves to 128.242.240.20 the first time, that value is stored in the DNS cache, and every time twitter is accessed it resolves to the same IP.

Sounds daft, right? Not quite. This cache is maintained in the InetAddress class - there is one each for successful and unsuccessful host name resolutions. Positive caching is there to guard against DNS spoofing attacks, while negative caching is used to improve performance. By default (at least on the 1.4.2 JDK we use), the results of positive host name resolutions are cached forever, because there is no general rule to decide when it is safe to remove cache entries. The result of an unsuccessful host name resolution is cached for a very short period of time (10 seconds) to improve performance. Refer to the InetAddress javadoc for more on the above.
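
To see the caching for yourself, something like the following can help. It is only a rough sketch: the class and method names (DnsCacheProbe, distinctAddresses) are made up for illustration, and it uses Java 5+ syntax, so the generics would need to go on 1.4.2. It resolves a host a few times and collects the distinct first addresses returned.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical probe: resolves a host repeatedly and records which IPs come back.
final class DnsCacheProbe {

    // Returns the distinct addresses seen across the given number of lookups.
    static Set<String> distinctAddresses(String host, int attempts, long pauseMillis)
            throws UnknownHostException, InterruptedException {
        Set<String> seen = new LinkedHashSet<String>();
        for (int i = 0; i < attempts; i++) {
            // getByName goes through InetAddress's internal cache.
            seen.add(InetAddress.getByName(host).getHostAddress());
            if (pauseMillis > 0) {
                Thread.sleep(pauseMillis);
            }
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Host is an assumption; pass your own load-balanced name as the first argument.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(distinctAddresses(host, 5, 1000L));
    }
}
```

If a round-robin name keeps printing a single address no matter how long the pauses are, the JVM cache is a likely suspect, though an OS-level resolver cache could produce the same picture.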

In cases like ours, where we don't want DNS caching forever, the solution is to set the TTL value for positive caching to a very low value. This much is well documented. There are 3 ways to set this.
- Set the value from command line using the setting -Dnetworkaddress.cache.ttl=x where x is the number of seconds for which the value is to be cached.
- Use the Sun-specific system property. It works the same as above, only with -Dsun.net.inetaddr.ttl=x in lieu of -Dnetworkaddress.cache.ttl=x.
- Modify the java.security file (Path is $JAVA_HOME/jre/lib/security/java.security) to set the value of networkaddress.cache.ttl to a low value. By default the setting is -1 (cache forever). Note that this setting would affect any application which uses this JDK.

Tried Option 1 - added the -Dnetworkaddress.cache.ttl=60 setting to the weblogic startup file, which should have expired cached entries after 60 seconds. But it did not have the intended effect. There is a related bug, 6247501, reported, but that's Windows specific and we were on Unix, so it shouldn't have affected us. Looked through the InetAddress caching mechanism, but could not quite figure out why it failed. Gave up after an hour.

Modified the JVM to use the sun setting -Dsun.net.inetaddr.ttl=60 and restarted the domain. Worked like a charm.

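Can only conclude that -Dnetworkaddress.cache.ttl does not work from the command line but only from within the java.security file. That fits with networkaddress.cache.ttl being a security property (read through java.security.Security) rather than a system property, so a -D flag never reaches it, whereas sun.net.inetaddr.ttl is an ordinary system property. If you do not wish to change the java.security setting, which would be the case if multiple domains reference the same JDK, a safer option is to use the -Dsun.net.inetaddr.ttl setting from the command line.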
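
For completeness, networkaddress.cache.ttl can also be set from code through java.security.Security.setProperty, which avoids touching the shared java.security file. I did not try this route on our setup, so treat the following as a minimal sketch; the class and helper names (DnsTtlConfig, setPositiveTtlSeconds) are made up for illustration. As far as I can tell, it has to run before the first host name lookup, because the cache policy is read when the InetAddress machinery is first used.

```java
import java.security.Security;

// Hypothetical helper: sets the positive DNS cache TTL from code.
final class DnsTtlConfig {

    // Security property that controls positive DNS caching.
    static final String TTL_PROPERTY = "networkaddress.cache.ttl";

    // Call this early at startup, before any host name is resolved.
    static void setPositiveTtlSeconds(int seconds) {
        Security.setProperty(TTL_PROPERTY, Integer.toString(seconds));
    }

    public static void main(String[] args) {
        setPositiveTtlSeconds(60);
        // Read the value back to confirm the security property is in place.
        System.out.println(TTL_PROPERTY + "=" + Security.getProperty(TTL_PROPERTY));
    }
}
```

This only affects the JVM it runs in, so other domains sharing the same JDK keep their own behaviour.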

Happy caching.

Eclipse Memory Issues

JDK 5 was the default Java version in use. Upgraded to JDK 1.6.0_21-b06 today and Eclipse started tanking. It starts up fine, but after a few minutes fails with the error 'Internal plug-in action delegate error on creation. PermGen space'. Right, that seemed to indicate that I wasn't allocating enough memory for permgen. But the same program works fine with the very same memory settings under JDK 5, so this was definitely an upgrade problem. So much for my support of auto updates :-|

The default setting in eclipse.ini was
--launcher.XXMaxPermSize
256m
-vmargs
-Dosgi.requiredJavaVersion=1.5
-Xms40m
-Xmx256m

And 256m is generally sufficient for permgen, so what gives? Checked the configuration to confirm whether the settings had taken effect (click on Help -> About Eclipse -> Installation Details -> Configuration). It shows the Xms and Xmx, but no MaxPermSize, hmmm... interesting.
eclipse.vmargs=-Dosgi.requiredJavaVersion=1.5
-Xms40m
-Xmx256m

Googled a bit. The Eclipse wiki indicated that it was a bug with Oracle/Sun JDK 1.6.0_21 (had to be my version, duhhh), being tracked as bug 319514. The bug report is a pretty interesting read.

Apparently, as part of Oracle's rebranding of Sun's products, the Company Name property of the java.exe file, the executable file containing Oracle's JRE for Windows, was updated from "Sun Microsystems" to "Oracle" in Java SE 6u21. Now on Windows, the Eclipse launcher identifies a Sun VM using the GetFileVersionInfo API, which reads the company name (present under version details) from jvm.dll or the java executable and compares it against the string "Sun Microsystems". The --launcher.XXMaxPermSize option is only turned into -XX:MaxPermSize when the VM is recognized as a Sun VM. Post update 21, the company name was "Oracle" and the launcher does not recognize this string, hence the -XX:MaxPermSize setting is never passed to the VM.

The workarounds are
- Switch back to '1.6.0_20'
- Change the command line for launching, or add the following line after "-vmargs" in your eclipse.ini file: -XX:MaxPermSize=256m
- For 32-bit Helios, download the fixed eclipse_1308.dll and place it into (eclipse_home)/plugins/org.eclipse.equinox.launcher.win32.win32.x86_1.1.0.v20100503
- Download and install any of the upgraded versions, i.e. version 1.6.0_21-b07 or higher, from the Java site (alternative link is http://java.sun.com/javase/downloads/index.jsp). Make sure you have b07 or higher by running java -version.

Went with the last option and so far life's good. The MaxPermSize setting now shows up under the VM arguments:
eclipse.vmargs=-Dosgi.requiredJavaVersion=1.5
-Xms40m
-Xmx256m
-XX:MaxPermSize=256m
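
If you would rather check from code which arguments a JVM actually received, the standard RuntimeMXBean exposes them (Java 5 and later). The sketch below is just an illustration with made-up names (VmArgsCheck, findArg), and it reports on the JVM it runs in, so to say anything about Eclipse it would have to be launched with the same VM and arguments as Eclipse.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical check: was -XX:MaxPermSize actually passed to this JVM?
final class VmArgsCheck {

    // Returns the first argument starting with the given prefix, or null if none does.
    static String findArg(List<String> vmArgs, String prefix) {
        for (String arg : vmArgs) {
            if (arg.startsWith(prefix)) {
                return arg;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Arguments given to the JVM itself (not the program's main arguments).
        List<String> vmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String permGen = findArg(vmArgs, "-XX:MaxPermSize=");
        System.out.println(permGen != null ? permGen : "-XX:MaxPermSize not passed to this JVM");
    }
}
```

For Eclipse itself, the Configuration page mentioned above remains the simplest place to look.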

But I've got this bad feeling that when I move to JDK 7, there'll be a new set of problems cropping up. (From the Oracle site: In consideration to Eclipse and other potentially affected users, Oracle has restored the Windows Company Name property value to "Sun Microsystems" in further JDK 6 updates. This value will be changed back to "Oracle" in JDK 7.) Sigh, so much for compatibility.