Wednesday, April 25, 2007

Understanding the performance impact of shareable JDBC connections in Struts-based web applications deployed on WebSphere Application Server

The purpose of this article is to highlight the root cause of JDBC-related performance issues that I have seen in multiple J2EE applications using the Struts framework. In many cases the root cause is the use of "shareable" connections, which is the default configuration in WebSphere.

The difference between shareable and unshareable connections is very nicely described in the following article.
An important point to note, quoting from that article: -
"When the application closes a shareable connection, the connection is not truly closed, nor is it returned to the free pool. Rather, it remains in the Shared connection pool, ready for another request within the same LTC for a connection to the same resource."

This has a serious impact if we are not using global (JTA) transactions in our Struts application. In that situation, WebSphere will create a local transaction containment (LTC) for each action forward. Let us take the following example of a simple search-results use case: -

1. A search request comes to the SearchAction.

2. SearchAction invokes "SearchHistoryDAO" to save the search parameters.
2.a) SearchHistoryDAO retrieves a connection from the pool, executes an insert and returns the connection back to the pool. (This takes 20ms)
2.b) SearchAction forwards the control to "SearchResultsAction"
3. SearchResultsAction invokes the search web service to perform the search and get the results. (This takes 3 seconds)
4. SearchResultsAction finally forwards the control to results.jsp to display the results to the user.
5. Control returns to SearchResultsAction
6. Control returns to SearchAction

In the above use case, if the connections are unshareable, then the connection fetched from the pool in step 2.a will be returned to the pool in around 20 ms in the same step. Thus it will be available for other threads to use in no time. "Connection use time" in this case will be 20 ms.

However, if the connections are shareable, then the connection, even though it has been closed, will not be returned to the pool until the LTC of SearchAction completes. That happens in step 6. So, the connection will not be available to other threads for around 3 seconds. The "connection use time" in this case will be more than 3 seconds.
This severely limits the number of connections available in the pool at any given time and prevents the application from scaling, because the connection pool maxes out. With this configuration, many connections sit idle while other threads (or even the same one) wait for a connection.
The problem is compounded if each action in the chain performs a DB operation and there are 3 or more actions in a chain. In this scenario, each HTTP request needs more than 1 connection at the same time to be processed.
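
The effect of the longer "connection use time" can be estimated with simple arithmetic: a pool can serve at most (pool size) / (hold time x connections per request) requests per second. The sketch below is only an upper-bound estimate, not a measurement. The pool size of 10 is an assumed value, and the class and method names are made up for illustration.

```java
// Rough upper bound on request throughput imposed by a connection pool.
// Assumption: every request holds its connection(s) for the full hold time.
final class PoolCapacity {

    /**
     * @param poolSize              maximum connections in the pool
     * @param holdMillis            time each connection stays checked out
     * @param connectionsPerRequest connections one request holds at once
     * @return maximum sustainable requests per second
     */
    static double maxRequestsPerSecond(int poolSize, long holdMillis,
                                       int connectionsPerRequest) {
        if (poolSize <= 0 || holdMillis <= 0 || connectionsPerRequest <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        return poolSize * 1000.0 / (holdMillis * connectionsPerRequest);
    }

    public static void main(String[] args) {
        // Unshareable: connection goes back to the pool after the 20 ms insert.
        System.out.println("unshareable: " + maxRequestsPerSecond(10, 20, 1));
        // Shareable: connection is held until the 3 second chain finishes.
        System.out.println("shareable:   " + maxRequestsPerSecond(10, 3000, 1));
    }
}
```

With these assumed numbers the ceiling drops from 500 requests per second to a little over 3, and it halves again if each request holds two connections at once.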

This issue manifests in the following ways when "shareable" connections are used:-
1. Hung threads & deadlocks in WebSphere
This situation occurs if the "ConnectionWaitTimeOut" parameter is left at its default in WebSphere and more than 1 action in an action chain requests a DB connection.
This parameter defines how long a thread will wait to obtain a connection from the pool before timing out, and the default is half an hour. Let us take an example where there are 10 connections in a pool. If we send 10 simultaneous requests to the application server, each request will grab one connection and the pool is exhausted. The first action of each request then forwards control to the next action in the chain. That action tries to get a connection, but none is available, so it waits for up to half an hour. Meanwhile no connection is released, because each one is held by the LTC of the previous action, which completes only when the action waiting for a connection completes. The result is a deadlock.

2. JDBC Connection Wait Timeout exceptions
If the default value is changed from 30 minutes to a more reasonable 5-10 seconds, the deadlock is broken as soon as the wait times out. In this case, at least one transaction will fail with a “ConnectionWaitTimeoutException”.
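
The two situations above can be reproduced in miniature without an application server. The following is a toy model only: a java.util.concurrent.Semaphore stands in for the connection pool, each thread stands in for a request whose first action keeps its connection until the chain ends, and the wait timeout plays the role of the connection wait timeout. The class name and parameters are illustrative assumptions, not WebSphere APIs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedPoolStarvation {

    /** Returns how many simulated requests timed out waiting for a second connection. */
    static int simulate(int poolSize, long waitMillis) {
        Semaphore pool = new Semaphore(poolSize);
        CountDownLatch poolExhausted = new CountDownLatch(poolSize);
        AtomicInteger timedOut = new AtomicInteger();
        ExecutorService requests = Executors.newFixedThreadPool(poolSize);

        for (int i = 0; i < poolSize; i++) {
            requests.execute(() -> {
                try {
                    pool.acquire();                 // first action takes a connection
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    poolExhausted.countDown();
                    poolExhausted.await();          // every request now holds one connection
                    // Second action in the chain asks for another connection.
                    if (pool.tryAcquire(waitMillis, TimeUnit.MILLISECONDS)) {
                        pool.release();             // second action finished its work
                    } else {
                        timedOut.incrementAndGet(); // the "wait timeout" case
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    pool.release();                 // chain ends, first connection is freed
                }
            });
        }
        requests.shutdown();
        try {
            requests.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return timedOut.get();
    }
}
```

Calling SharedPoolStarvation.simulate(10, 100) always reports at least one timeout, because no connection can be freed until some waiting request gives up. With a very long timeout the same threads would simply sit blocked, which is the hung-thread case.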

3. Non-existent JDBC connection leaks
As described above, with shareable connections the connection pool generally maxes out under load, and we may start getting an SQLException with the error "Unable to get a PooledConnection from the DataSource". This exception generally indicates that either the pool is too small or there is a connection leak. Assuming that the pool size is set to a reasonable number, developers often conclude that a connection leak is to blame and start reviewing code to find it, wasting a lot of time and resources when there is no leak at all.

If you are facing any of the above mentioned issues, then it is highly recommended that you switch to unshareable connections by setting the sharing scope of the data source resource reference in web.xml.
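
A minimal sketch of such a resource reference is given here, using the standard res-sharing-scope element of the servlet deployment descriptor. The reference name jdbc/MyDS is an assumption taken from the lookup example in this article; substitute the name your application actually uses.

```xml
<!-- Fragment of WEB-INF/web.xml; jdbc/MyDS is an assumed reference name -->
<resource-ref>
    <res-ref-name>jdbc/MyDS</res-ref-name>
    <res-type>javax.sql.DataSource</res-type>
    <res-auth>Container</res-auth>
    <!-- Return the connection to the free pool as soon as it is closed -->
    <res-sharing-scope>Unshareable</res-sharing-scope>
</resource-ref>
```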


Also ensure that you look up the data source through a local JNDI reference under java:comp/env. A global lookup by /jdbc/MyDS is deprecated, and WebSphere will fall back to the default in that case, which is shareable connections.
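
A local lookup could look roughly like the sketch below. The helper class and its method names are hypothetical, and jdbc/MyDS is assumed to be the res-ref-name declared for the web module. The lookup itself only works inside a container that provides the java:comp/env namespace.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical helper: resolves a data source through the component's local
// JNDI namespace so that the settings of its resource reference apply.
final class LocalDataSourceLookup {

    static final String ENV_PREFIX = "java:comp/env/";

    /** Turns a resource reference name such as "jdbc/MyDS" into its local JNDI name. */
    static String localName(String resRefName) {
        return resRefName.startsWith(ENV_PREFIX) ? resRefName : ENV_PREFIX + resRefName;
    }

    /** Looks up the data source; must run inside the application server. */
    static DataSource lookup(String resRefName) throws NamingException {
        Context ctx = new InitialContext();
        return (DataSource) ctx.lookup(localName(resRefName));
    }
}
```

A DAO would then call LocalDataSourceLookup.lookup("jdbc/MyDS") instead of looking up the global name directly.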

Saturday, April 14, 2007

Oracle Text Search Performance Tuning - From Minutes to Milliseconds

Recently I was asked to look into the stability of a J2EE application. It had started to crash during peak production load, taking the Oracle DB server down with it. An analysis of the Java stack traces indicated that almost all the hung threads were performing searches. The application could not handle a load of 12 search requests per minute. As search requests came into the system they kept accumulating and degraded its performance even further. Some of the searches ran for as long as 20 minutes.

The search queries used Oracle Text with a context index; the number of documents was around 900,000 and the size of the context index was around 900MB. After a few load tests and an analysis of the Statspack report, we found that the search queries were spending more than 95% of their time in I/O. So I focused the tuning exercise on reducing disk I/O by reducing the buffer gets and increasing the buffer hit rate. This essentially means: -

  1. Reduce the number of rows on which Oracle will operate to execute the query
  2. Reduce the size of each row
  3. Keep the most frequently accessed rows in memory

To achieve this, I made the following optimizations: -

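1. Added the "FIRST_ROWS(n)" hint to the query.
This hint asks the optimizer to favor returning the first n rows quickly rather than minimizing the cost of producing the whole result set. The Oracle documentation suggests that it should improve performance.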
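
From the Java side, a hinted Oracle Text query could be issued roughly as sketched below. The table name documents and the columns doc_id and doc_text are placeholders, not the schema of the application discussed here, and the class itself is hypothetical. The hint value is built from an int because optimizer hints cannot take bind variables.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class FirstRowsSearch {

    /** Builds the hinted query; table and column names are placeholders. */
    static String buildQuery(int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        return "SELECT /*+ FIRST_ROWS(" + n + ") */ doc_id "
             + "FROM documents "
             + "WHERE CONTAINS(doc_text, ?) > 0";
    }

    /** Runs the search and returns at most n matching document ids. */
    static List<Long> search(Connection con, String textQuery, int n) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(buildQuery(n))) {
            ps.setString(1, textQuery);
            ps.setMaxRows(n); // stop fetching once the first n rows are read
            try (ResultSet rs = ps.executeQuery()) {
                List<Long> ids = new ArrayList<>();
                while (rs.next()) {
                    ids.add(rs.getLong(1));
                }
                return ids;
            }
        }
    }
}
```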

2. Generated a report on the context index to discover any scope for optimization
The report can be generated as described in the following section:-
This report gave very useful insights, particularly:

  1. Number of $I rows, which was very large
  2. Most frequent tokens
  3. Fragmentation level of the index

Many of our queries searched on codes within the documents that contain "/" as part of the code. From this report, I found that Oracle by default breaks such a code into two tokens. For example, "A/B", which we intended to be stored as a single token, was broken into the two tokens "A" and "B". As a result, Oracle had to work on many more $I rows whenever we searched on "A/B". Taking a lead from this, I defined a custom lexer with "/" as a printjoin character and rebuilt the index with it.
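
The effect of a printjoin character on tokenization can be illustrated with a toy tokenizer. This is not Oracle's lexer and ignores details such as case folding and stopwords. It only shows how treating "/" as part of a token changes the tokens that end up in the index.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a lexer: letters and digits form tokens, everything else
// separates them unless it is listed as a printjoin character.
final class PrintjoinTokenizer {

    static List<String> tokenize(String text, String printjoins) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetterOrDigit(c) || printjoins.indexOf(c) >= 0) {
                current.append(c);               // character belongs to the token
            } else if (current.length() > 0) {
                tokens.add(current.toString());  // separator ends the token
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("code A/B found", ""));   // "/" splits the code
        System.out.println(tokenize("code A/B found", "/"));  // "/" kept inside the token
    }
}
```

Without the printjoin the code is indexed as the separate tokens A and B, so a search for "A/B" has to combine the postings of both. With "/" as a printjoin it is a single token.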

3. Disabled "storage in row" for the CLOB
By default Oracle stores CLOB data in-line, i.e. in the row, if its size is less than 4K. So, whenever Oracle fetches the row it fetches all the CLOB data as well. In our case most of the search text was smaller than 4K, so most of it was stored in-line. This was one of the major contributors to the I/O. I disabled storage in row, which greatly reduced the size of each row as well as the amount of I/O required. This is suggested in the following section of the Oracle Text documentation.

4. Partitioned the table by range and created a partitioned context index (search index)
I partitioned the table and the index by range as suggested by the following documentation.
This reduced the number of rows on which Oracle operates by leveraging partition pruning. It also reduced the effective size of the search index, as we limited searches to a single partition.

5. Defined a keep pool equal to the size of the documents table & configured the documents table to be kept in the keep pool.
The previous optimizations greatly reduced the amount of I/O by reducing the size of the table and by breaking the table & search index into multiple partitions. With a much smaller working set, caching all of it in memory can improve performance significantly.
This was achieved by defining a keep pool big enough to store the latest partitions of the table & the index. More details on how this can be done, as well as how the index can be preloaded, can be found in the following article.

The results were impressive. We ran a load test with twice the production load, i.e. 24 search requests per minute. Before the optimizations the system crashed within 500 search requests, and the successful requests had an average response time of 5 minutes. After the final optimizations, searches completed with an average response time of 350 ms.