JMS and SAP, avoid this combination?

August 14th, 2008

I told a colleague, “I’m going to use JMS topics to solve this problem.” His answer was: “SAP in combination with JMS… you don’t want to go there.” He based this on the fact that several SAP applications which use JMS seem to have complications. Is JMS and SAP a bad combination? Or at least a combination to avoid as a developer?

SAP about their JMS implementation:

Full compliance with the JMS 1.0 specification. Software-based high availability solution.

Problem/task description

A short description of the problem/task:

The application’s caching mechanism caches retrieved data for 1 hour. We want it to cache until we tell it to clear the cache. If we change something in the back-end now, we want the cache to be cleared now. Don’t change the logic of the application too much.

Logic at that moment

So what was the caching logic at that moment? A stateless session bean with static fields (class variables) which contained the cached objects. The same cache object was also used in a normal class (pojo). A property file determined which of the two was to be used. None of the environments (development, quality, production) was configured to use the pojo. The cache was stored for one hour, and there was no reset method.

Desired logic

Now we’re going to cache the entries… forever! Or at least until someone decides to reset it. Static objects exist in each JVM (actually per classloader… but for the sake of simplicity we assume per JVM), so per node there is a cache. The fact that there are multiple cache instances doesn’t matter, it has been like this for ages and was always sufficient. ‘Create a method to clear the cache, deploy and you’re done’. Not really, what will happen to a call to reset the cache? User clicks ‘Reset Cache’ (where does not really matter), the load balancer determines to which node the request for reset is headed. The load balancer decides node 1 is the lucky one… the cache on node 1 is cleared, and the caches on all other nodes stay as they were. How to make sure every node gets the request for reset? Sounds like something for JMS topics!

Why JMS topics?

How does it work? Each cache object is subscribed to the “cache reset”-topic. The application running on (for example) node 1 receives the request for reset and posts a message on the “cache reset”-topic. All cache objects, on every node, get the message (even the posting object) and clear their cache. Sounds good. It does not only sound good, it works as described. But it cost a bit more time than anticipated…
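To make the fan-out concrete, here is a minimal sketch of the idea. It deliberately does not use the JMS API: `ResetTopic` and `NodeCache` are hypothetical in-memory stand-ins, assumed only for illustration, so the example runs without an application server. The point it shows is that one published reset reaches every subscribed cache, including the one on the publishing node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the "cache reset" topic; not the JMS API.
class ResetTopic {
    private final List<Runnable> subscribers = new ArrayList<>();

    synchronized void subscribe(Runnable onReset) {
        subscribers.add(onReset);
    }

    // Delivers the reset to every subscriber, including the publisher's own cache.
    synchronized void publishReset() {
        for (Runnable subscriber : subscribers) {
            subscriber.run();
        }
    }
}

// Hypothetical per-node cache that clears itself when a reset arrives.
class NodeCache {
    private final Map<String, String> entries = new HashMap<>();

    NodeCache(ResetTopic topic) {
        topic.subscribe(this::clear);
    }

    synchronized void put(String key, String value) { entries.put(key, value); }
    synchronized int size() { return entries.size(); }
    synchronized void clear() { entries.clear(); }
}

class CacheResetDemo {
    public static void main(String[] args) {
        ResetTopic topic = new ResetTopic();
        NodeCache node1 = new NodeCache(topic);
        NodeCache node2 = new NodeCache(topic);
        node1.put("customer", "cached on node 1");
        node2.put("customer", "cached on node 2");

        // The load balancer routes the reset request to one node only;
        // that node publishes to the topic instead of clearing just itself.
        topic.publishReset();

        System.out.println(node1.size() + " " + node2.size()); // prints "0 0"
    }
}
```

In the real application the topic lives in the JMS provider and each subscription costs a connection, which is exactly where the trouble described next comes from.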

Problems

What problems did we run into? Mainly an OutOfMemoryException. It’s hard, but not impossible, to write a memory leak in Java. Research showed the JVM had tons of free memory, so where the hell does this error come from then? It did not take long to find out that JMS had a limited number of connections available (100 per node on our installation), and somehow we ran out of connections (and therefore it threw an OutOfMemoryException…?). Could it be that the number of connections used till that moment was close to that 100? And that the connections from the cache object were just enough to fill it? A listing of JMS connections with their topics/queues quickly showed the caching objects were taking up 60 connections… on one node! My expectation was to have 2 connections, as there were 2 instances of the object cache.

JMS connections listing

Example of displaying JMS connections

Analysis

Analysis showed the following behaviour:

  1. Connections were created over time (not all at once)
  2. Starting/stopping the application does not close the connections, neither does a redeployment
  3. The objects belonging to the connections still responded to topic updates (even after stopping the application)

Clearly there was something wrong. What was the logic at that moment? The connection was closed in the finalize of the cache object.

protected void finalize() throws Throwable {
	if(topicConn != null){
		try {
			topicConn.stop();
		} catch (JMSException e) {
			e.printStackTrace();
		}
		try {
			topicConn.close();
		} catch (JMSException e1) {
			e1.printStackTrace();
		}
	}
	super.finalize();
}

So the connection is only closed when the garbage collector decides to clean up the object. You never know when the garbage collector is going to do its round… my idea was that when the JVM is running low on memory it would start and collect. That apparently does not hold for our cache object, even when it’s throwing an OutOfMemoryException.
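One plausible explanation, offered here as an assumption rather than something verified on the original installation: an object that still receives topic messages must still be referenced by whatever delivers those messages, so it is never eligible for collection and `finalize` never runs. The sketch below illustrates that with plain Java. `FakeProvider` and `ListeningCache` are hypothetical stand-ins, not JMS classes.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a provider that keeps a reference to every registered listener.
class FakeProvider {
    private static final List<Object> listeners = new ArrayList<>();

    static synchronized void register(Object listener) { listeners.add(listener); }
    static synchronized int listenerCount() { return listeners.size(); }
}

// Hypothetical cache that registers itself as a listener on construction.
class ListeningCache {
    ListeningCache() {
        FakeProvider.register(this);
    }
}

class ReachabilityDemo {
    // Creates a cache, drops the local reference, and returns only a weak reference to it.
    static WeakReference<ListeningCache> createAndForget() {
        return new WeakReference<>(new ListeningCache());
    }

    public static void main(String[] args) {
        WeakReference<ListeningCache> ref = createAndForget();
        System.gc(); // only a hint; the outcome below does not depend on it
        // The provider's list still references the cache, so it cannot be collected.
        System.out.println("still reachable: " + (ref.get() != null)); // prints "still reachable: true"
    }
}
```

If this is what happened, no amount of memory pressure would have triggered the cleanup: the application had dropped its own references, but the open subscription had not.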

Where did all those connections come from?

As described above, the cache objects were used in a stateless session bean, and in a pojo. The session bean is created upon start of the application, the EJB container keeps at least one instance of this bean active for requests. If stopping/redeploying the application does not clean up the cache objects (and therefore the connections) I would expect just a few connections (should still be fixed!). So somehow the connections come from the pojo… but the application isn’t even configured to use that cache!
How was that implemented?

public class Mapping {
	private static ObjectCache metaObjectCache = new ObjectCache();
	private static ObjectCache dataObjectCache = new ObjectCache();
	....

We didn’t pay much attention to this class, until now. Every time the class was loaded, a static instance of ObjectCache was created! As long as the class isn’t loaded everything is fine, but as you can guess: it didn’t matter how the cache was configured in the properties file… a request always goes through this class. And thus a cache was created and a connection was opened. Fixing this was easy: initialize the cache fields to null and create the cache on first read, only if it does not exist yet.

if(metaObjectCache==null){
	metaObjectCache = new ObjectCache();
}
...

This should fix the problem of more and more connections being opened. It still leaves open the problem that connections aren’t closed.
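One caveat with a plain null check: if two requests arrive at the same time, both can see null and both create a cache, which would leak one connection. A minimal sketch of a guarded variant follows. `CountingCache` is a hypothetical stand-in for ObjectCache that only counts constructor calls, and the `metaCache()` accessor is an assumed name, not part of the original code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for ObjectCache; counts how many instances get created.
class CountingCache {
    static final AtomicInteger created = new AtomicInteger();

    CountingCache() {
        created.incrementAndGet(); // the real constructor would open the JMS connection here
    }
}

class LazyMapping {
    private static CountingCache metaObjectCache; // null until first use

    // synchronized so that two concurrent first requests cannot both see null
    static synchronized CountingCache metaCache() {
        if (metaObjectCache == null) {
            metaObjectCache = new CountingCache();
        }
        return metaObjectCache;
    }
}

class LazyInitDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread[] requests = new Thread[8];
        for (int i = 0; i < requests.length; i++) {
            requests[i] = new Thread(LazyMapping::metaCache);
            requests[i].start();
        }
        for (Thread request : requests) {
            request.join();
        }
        System.out.println("caches created: " + CountingCache.created.get()); // prints "caches created: 1"
    }
}
```

The synchronized accessor costs a lock on every read. For a cache lookup that is usually acceptable, and it keeps the number of connections per cache at exactly one.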

Cleaning up

You cannot rely on the garbage collector, I should have thought of that…
Documentation on the JMS Connection object:

Since a provider typically allocates significant resources outside the JVM on behalf of a connection, clients should close these resources when they are not needed. Relying on garbage collection to eventually reclaim these resources may not be timely enough.

Not timely enough, certainly not. After a week all connections were still open, and the objects still responded to topic posts. Maybe they aren’t even eligible for garbage collection while the connection is still open. Solution? Close the connection as soon as it is no longer needed, which sounds pretty obvious actually. How do we know it’s no longer needed? In our case it is whenever the stateless session bean gets removed (ejbRemove). The J2EE container will keep at least one instance of this bean running as long as the application is active. The bean will be removed when stopping/redeploying the application, and at that moment we want to close the JMS connection. If we don’t close it at that moment we’re too late and the objects get lost in cyberspace. As the J2EE container can create multiple instances of the bean when it feels the need to (many requests), we cannot just kill the connection as soon as ejbRemove is called. A simple solution for that is to keep track of the number of ‘clients’ in the cache.

	public void ejbRemove() {
		metaObjectCache.removeClient();
		dataObjectCache.removeClient();
	}

A similar solution was created for the pojo that contained the cache objects.
Closing the connection is the responsibility of the cache itself: the client tells the cache object that it’s no longer interested, and the cache object decides whether the connection can be closed.

	public void addClient(){
		numberOfClients++;
	}

	public void removeClient(){
		numberOfClients--;
		if(numberOfClients < 1){
			//no more clients... close everything
			this.cleanUp();
		}
	}

	private void cleanUp(){
		if(topicConn != null){
			try {
				topicConn.stop();
			} catch (JMSException e) {
				e.printStackTrace();
			}
			try {
				topicConn.close();
			} catch (JMSException e1) {
				e1.printStackTrace();
			}
		}
	}
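Putting the pieces together, the client counting can be sketched as a small runnable example. `FakeConnection` is a hypothetical stand-in for the JMS TopicConnection, and calling `addClient()` from the bean’s creation callback is an assumption, since the original only shows the `ejbRemove` side. The methods are synchronized here because the container may create and remove bean instances from different threads.

```java
// Hypothetical stand-in for the JMS TopicConnection; records whether close() was called.
class FakeConnection {
    private boolean closed;

    void close() { closed = true; }
    boolean isClosed() { return closed; }
}

class CountedCache {
    private final FakeConnection topicConn = new FakeConnection();
    private int numberOfClients;

    // synchronized because bean instances may be created and removed concurrently
    synchronized void addClient() {
        numberOfClients++;
    }

    synchronized void removeClient() {
        if (numberOfClients > 0) {
            numberOfClients--;
        }
        if (numberOfClients == 0) {
            topicConn.close(); // last client gone: release the connection
        }
    }

    synchronized boolean isConnectionClosed() {
        return topicConn.isClosed();
    }
}

class ClientCountDemo {
    public static void main(String[] args) {
        CountedCache cache = new CountedCache();
        cache.addClient();    // assumed: called when bean instance 1 is created
        cache.addClient();    // assumed: called when bean instance 2 is created
        cache.removeClient(); // instance 1 removed: one client left
        System.out.println(cache.isConnectionClosed()); // prints false
        cache.removeClient(); // instance 2 removed: no clients left
        System.out.println(cache.isConnectionClosed()); // prints true
    }
}
```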

Conclusion

  • Take really, really, really good care of your JMS connections. Do NOT trust the garbage collector to collect the object containing the connection. Assume the object with the connection will NOT be garbage collected for the next couple of years.
  • Create the connection only when you will use it. Yes… sounds obvious, but read on. I created a connection in the constructor of the cache object, but missed a static field declaration of this cache object in another object! The result was connections being created at places/times where you don’t want/expect them.
  • Find a place to close the JMS connection. In my case it was easiest to close it when the last stateless session bean is removed.

In other words, handle it like you would any (database) connection.
