I Came,I Learned,I Experienced....: 06 April 2014

Perform Proper Load Testing

Properly load testing your application is the most critical thing you can do to ensure a rock solid runtime in production.
Replicating your production environment isn’t always 100% necessary; most times you can get the same bang for your buck with a single representative machine in the environment.

Calculate expected load across the cluster and divide down to single machine load
Drive load and perform the usual tuning loop to determine the set of parameters you need to tweak and tune.
Look at the load on the database system, network, etc. and extrapolate whether they will support the full system’s load; if they will not, or if there are questions, test.
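The "divide down" step above is simple arithmetic, and a minimal sketch of it might look like the following. The class name, method names, and the example numbers are all illustrative assumptions, and the failure case is only one possible way to pick an overstress target.

```java
// Sketch: derive the load a single representative machine must sustain.
// All names and numbers here are illustrative assumptions.
final class SingleMachineLoad {

    // Expected cluster-wide load divided evenly across the cluster members.
    static double perMachine(double clusterRequestsPerSecond, int machines) {
        if (machines <= 0) {
            throw new IllegalArgumentException("machines must be positive");
        }
        return clusterRequestsPerSecond / machines;
    }

    // Load each surviving machine carries when some members are down.
    // This can serve as a target when deliberately overstressing one machine.
    static double perMachineWithFailures(double clusterRequestsPerSecond,
                                         int machines, int failedMachines) {
        return perMachine(clusterRequestsPerSecond, machines - failedMachines);
    }

    public static void main(String[] args) {
        System.out.println(perMachine(1200.0, 8));                // 150.0
        System.out.println(perMachineWithFailures(1200.0, 8, 2)); // 200.0
    }
}
```

An even split assumes the load balancer distributes requests uniformly; if it does not, the single machine target should be taken from the busiest member instead.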

Performance testing needs to be representative of patterns that your application will actually be executing
Proper performance testing keeps track of and records key system level metrics as well as throughput metrics for reference later when changes to hardware or application are needed.
Always overstress your system. Push the hardware and software to the max and find the breaking points.
Only once you have done real world performance testing can you accurately size the complete set of hardware required to execute your application to meet your demand.

Correctly Tune The JVM

Correctly tuning the JVM in most cases will get you nearly 80% of the possible max performance of your application
The big area to focus on for JVM tuning is heap size

Monitor -verbose:gc output and target garbage collecting no more than once every 10 seconds, with a maximum GC pause of one second or less.
Incremental testing, run with the expected customer load on the system, is required to get this area right.
Only after you have met the above GC targets should you start to experiment with different garbage collection policies.
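The two GC targets above (at most one collection every 10 seconds, pauses of a second or less) can be checked mechanically once the GC events have been pulled out of the verbose GC log. The sketch below assumes some other step has already parsed the log into start times and pause durations; the class and method names are hypothetical.

```java
// Sketch: check a series of GC events against the targets described above.
// Parsing the verbose GC log into these arrays is assumed to happen elsewhere.
final class GcTargetCheck {
    static final long MIN_INTERVAL_MILLIS = 10_000; // no more than one GC every 10 seconds
    static final long MAX_PAUSE_MILLIS = 1_000;     // pause of one second or less

    // startMillis[i] and pauseMillis[i] describe the i-th GC event, in time order.
    static boolean meetsTargets(long[] startMillis, long[] pauseMillis) {
        if (startMillis.length != pauseMillis.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        for (int i = 0; i < startMillis.length; i++) {
            if (pauseMillis[i] > MAX_PAUSE_MILLIS) {
                return false; // a single pause was too long
            }
            if (i > 0 && startMillis[i] - startMillis[i - 1] < MIN_INTERVAL_MILLIS) {
                return false; // two collections were too close together
            }
        }
        return true;
    }
}
```

Running a check like this after each incremental heap size change gives a simple pass or fail signal before moving on to GC policy experiments.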

Beyond the heap size settings, most other parameters are there either to extract the maximum possible performance or to ensure that the JVM cooperates nicely with other JVMs on the system it is running on.
The Garbage Collection and Memory Visualizer (GCMV) is an excellent tool for diagnosing GC issues or refining JVM performance tuning.

Provided as a downloadable plug-in within the IBM Support Assistant

Garbage Collection and Memory Visualizer (GCMV)

Ensure Uniform Configuration Across Like Servers

Uniform configuration of software parameters and even operating systems is a common stumbling block
It most often manifests itself as a single machine or process that is burning more CPU or memory, or garbage collecting more frequently, than its peers.
The easiest way to manage this is to have a “dump configuration” script that runs periodically.
Store the script’s results and, after each configuration change or application upgrade, track the differences.
Leverage the Visual Configuration Explorer (VCE) tool available within ISA
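Tracking differences between two configuration dumps can be sketched as a key by key comparison. The example below assumes each dump has already been loaded into a map of setting name to value; the class name, the setting names, and the "baseline -> other" output format are all illustrative.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Sketch: compare two configuration dumps that were loaded into maps.
final class ConfigDiff {

    // Returns setting -> "baseline -> other" for every setting that differs.
    // A setting missing on one side shows up as "null" on that side.
    static Map<String, String> diff(Map<String, String> baseline,
                                    Map<String, String> other) {
        Set<String> keys = new TreeSet<>(baseline.keySet());
        keys.addAll(other.keySet());
        Map<String, String> differences = new TreeMap<>();
        for (String key : keys) {
            String before = baseline.get(key);
            String after = other.get(key);
            if (!Objects.equals(before, after)) {
                differences.put(key, before + " -> " + after);
            }
        }
        return differences;
    }
}
```

The same comparison works between two servers that are supposed to be alike, or between the dumps taken before and after an upgrade on one server.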

Visual Configuration Explorer (VCE)

Create Cells To Group Like Applications

Create Cells and Clusters of application servers with an express purpose that groups them in some manner
Large cells (400, 500, or even 1000 members), while supported, for the most part don’t make sense
Group applications that need to replicate data to each other or talk to each other via RMI, etc and create cells and clusters around those commonalities.
Keeping cell size smaller leads to more efficient resource utilization due to less network traffic for configuration changes, DRS, HAManager, etc.

For example, core groups should be limited to no more than 40 to 50 instances

Smaller cells and logical grouping make migration forward to newer versions of products easier and more compartmentalized.

Tune JDBC Data Sources

Correct database connection pool tuning can yield significant gains in performance
This pool is highly contended in heavily multithreaded applications, so ensuring that sufficient connections are available in the pool leads to superior performance.
Monitor PMI metrics via TPV or other tools to watch for threads waiting on connections to the database, as well as their wait time.

If threads are waiting, increase the number of pooled connections in conjunction with your DBA OR decrease the number of active threads in the system
In some cases, a one-to-one mapping between DB connections and threads may be ideal

Frequently database deadlocks or bottlenecks first manifest themselves as a large number of threads from your thread pool waiting for connections
Always use the latest database driver for the database you are running, as performance optimizations in this space between versions are significant
Tune the Prepared Statement Cache Size for each JDBC data source

Can also be monitored via PMI/TPV to determine ideal value

Correctly Tune Thread Pools

Thread pools and their corresponding threads control all execution on the hardware threads.
Understand which thread pools your application uses and size all of them appropriately based on utilization you see in tuning exercises

Thread dumps, PMI metrics, etc will give you this data
The IBM Thread and Monitor Dump Analyzer (TMDA) and Tivoli Performance Viewer (TPV) will help in viewing this data.

Think of the thread pool as a queuing mechanism to throttle how many active requests you will have running at any one time in your application.
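The throttling idea can be seen in miniature with a plain java.util.concurrent pool. This is only a sketch of the queuing behaviour, not how WAS thread pools are configured: the class name is hypothetical and the 5 ms sleep is a stand-in for real request work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a fixed-size pool queues extra requests instead of running them all at once.
final class ThrottleDemo {

    // Submits `requests` tasks to a pool of `poolSize` threads and returns
    // the highest number of tasks observed running at the same time.
    static int maxObservedConcurrency(int poolSize, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for request work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    active.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }
}
```

With a pool of 4 and 40 submitted requests, the observed peak never exceeds 4; the other requests simply wait in the queue until a thread frees up.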

Apply the funnel based approach to sizing these pools

Example: IHS (1000) -> WAS (50) -> WAS DB connection pool (30) ->
Thread numbers above vary based on application characteristics
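One way to sanity check a set of pool sizes against the funnel idea is to confirm that no tier allows more concurrency than the tier in front of it. This is a minimal sketch with a hypothetical class name; the sizes used in the usage line are the ones from the example above.

```java
// Sketch: verify that pool sizes narrow (or stay equal) from the front tier to the back.
final class FunnelCheck {

    // tierSizes is ordered from the outermost tier to the innermost one.
    static boolean isFunnel(int... tierSizes) {
        for (int i = 1; i < tierSizes.length; i++) {
            if (tierSizes[i] > tierSizes[i - 1]) {
                return false; // a downstream tier is wider than the one feeding it
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // IHS (1000) -> WAS (50) -> WAS DB connection pool (30)
        System.out.println(isFunnel(1000, 50, 30)); // true
    }
}
```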

Since you can throttle active threads you can control concurrency through your codebase

Thread pools need to be sized with the total number of hardware processor cores in mind

If sharing a hardware system with other WAS instances, thread pools have to be tuned with that in mind.
You will more than likely need to cut back on the number of active threads in the system to ensure good performance for all applications, because the OS must context switch between every thread in the system.
Sizing or restricting the maximum number of threads an application can have can sometimes be used to prevent rogue applications from impacting others.

Default sizes for WAS thread pools on v6.1 and above are actually a little too high for best performance

Two to one ratio (threads to cores) typically yields the best performance but this varies drastically between applications and access patterns
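As a starting point for a tuning exercise, the two-to-one ratio can be turned into a number from the core count. The sketch below is an assumption-heavy illustration: the class name is hypothetical, and splitting the thread budget evenly between JVMs that share the host is just one simple policy, not a rule from the text.

```java
// Sketch: derive a starting thread pool size from the core count.
final class PoolSizing {

    // cores * threadsPerCore gives the host-wide budget, which is then split
    // evenly between the JVMs sharing the host (an assumed policy).
    static int startingPoolSize(int cores, int threadsPerCore, int jvmsSharingHost) {
        if (cores <= 0 || threadsPerCore <= 0 || jvmsSharingHost <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        return Math.max(1, (cores * threadsPerCore) / jvmsSharingHost);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Two threads per core, one JVM on the host.
        System.out.println(startingPoolSize(cores, 2, 1));
    }
}
```

The result is only where load testing should begin; the right value still has to come from the utilization seen under load.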

TPV & TMDA tool snapshots

Minimize HTTP Session Content

High performance data replication for application availability depends on correctly sized session data

Keep it under 1MB in all cases if possible

Only store information critical to that user’s specific interaction with the server
If composite data is required, build it progressively as the interaction occurs
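A rough way to see how close a session attribute is to the 1MB guideline is to measure its size under standard Java serialization. This is a sketch with hypothetical names, and it assumes the replication mechanism serializes objects in a comparable way, so treat the number as an estimate.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Sketch: estimate how large an object is when serialized.
final class SessionSizeCheck {
    static final int MAX_BYTES = 1024 * 1024; // the 1MB guideline

    // Size in bytes of the object when written with standard Java serialization.
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    static boolean withinGuideline(Serializable value) {
        return serializedSize(value) <= MAX_BYTES;
    }
}
```

Calling `withinGuideline` on each attribute in a test environment is a cheap way to spot the one object that has quietly grown too large.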

Configure Session Replication in WAS to meet your needs
Use different configuration options (async vs. sync) to give you the availability your application needs without compromising response time.
Select the replication topology that works best for you (DB, M2M, M2M Server)
Keep replication domains small and/or partition where possible

Understand and Tune Infrastructure (databases & other interactive server systems)

WebSphere Application Server and the system it runs on are typically only one part of the datacenter infrastructure, and they rely heavily on other areas performing properly. Think of your infrastructure as a plumbing system: optimal drain performance only occurs when no pipes are clogged.
On the WAS system itself you need to be very aware of:

What other WAS instances (JVMs) are doing and their CPU / IO profiles
How much memory other WAS instances (or other OSes in a virtualized case) are using
Network utilization of other applications coexisting on the same hardware

In the supporting infrastructure

Varying network latency can drastically affect split cell topologies, cross site data replication and database query latency

Ensure network infrastructure is repeatable and robust
Don’t take bandwidth or latency for granted before going into production; always test, as labs vary

Firewalls can cause issues with data transfer latencies between systems

On the database system

Ensure that proper indexes and tuning are in place for the application’s request patterns
Ensure that the database supports the number of connected clients your WAS runtime will have
Understand the CPU load and impact of other applications (batch, OLTP, etc., all competing with your application)

On other application server systems or interactive server systems

Ensure the performance of connected applications is up to the load being requested of them by the WAS system
Verify that developers have coded specific handling mechanisms for when connected applications go down (You need to avoid storm drain scenarios)
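One possible handling mechanism for a connected application that has gone down is a bounded wait with a fallback, so that calling threads do not pile up behind a dead dependency. The sketch below uses plain java.util.concurrent and hypothetical names; creating an executor per call keeps the example short, and in a Java EE container you would normally rely on a managed executor or the client library's own timeout settings instead.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: give up on a slow or failed remote call and return a fallback value.
final class GuardedCall {

    // Runs remoteCall on a separate thread and waits at most timeoutMillis.
    // On timeout or failure the fallback is returned instead.
    static <T> T callWithTimeout(Callable<T> remoteCall, long timeoutMillis, T fallback) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> result = executor.submit(remoteCall);
        try {
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            result.cancel(true); // interrupt the stuck call
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            result.cancel(true);
            return fallback;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A call such as `GuardedCall.callWithTimeout(() -> lookupPrice(id), 200, cachedPrice)` (where `lookupPrice` and `cachedPrice` are hypothetical) keeps the request thread from waiting indefinitely on the other system.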

Keep Application Logging to a Minimum

Information outside of error cases should never be written to SystemOut.log
If using logging, build your log messages only when needed

Good

// The message is built only when it will actually be logged
if (loggingEnabled) { errorMsg = "This is a bad error" + " " + failingObject.printError(); System.out.println(errorMsg); }

Bad

// The message is built even when logging is disabled
errorMsg = "This is a bad error" + " " + failingObject.printError();
if (loggingEnabled) { System.out.println(errorMsg); }

Keep error and log messages to the point and easy to debug
If using Apache Commons Logging, Log4j, or other frameworks, ensure performance on your system is as expected
If you must log information for audit purposes or other reasons, ensure that you are writing to a fast disk
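The "build your log messages only when needed" advice can also be followed without an explicit flag check. As a sketch, java.util.logging (Java 8 and later) accepts a message supplier that is only evaluated when the level is enabled; the logger name, the counter, and `expensiveDetail()` below are hypothetical stand-ins for something like `failingObject.printError()`.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch: defer building a log message until the logger actually needs it.
final class LazyLogging {
    static final Logger LOG = Logger.getLogger("example.lazy"); // hypothetical logger name
    static final AtomicInteger messagesBuilt = new AtomicInteger();

    // Stand-in for an expensive call made while building the message.
    static String expensiveDetail() {
        messagesBuilt.incrementAndGet();
        return "details";
    }

    static void report() {
        // The supplier runs only if FINE is enabled for this logger.
        LOG.fine(() -> "This is a bad error " + expensiveDetail());
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // FINE is disabled
        report();
        System.out.println(messagesBuilt.get()); // 0, the message was never built
    }
}
```

Other logging frameworks offer comparable guards or lazy message forms; whichever is used, the point is that the string concatenation and the expensive call are skipped when the message would be discarded.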

Properly Tune the Operating System

The operating system is consistently overlooked for functional tuning as well as performance tuning.
Understand the hardware infrastructure backing your OS: processor counts, speed, shared/unshared, etc.
ulimit values need to be set correctly. The main player here is the number of open file handles (ulimit -n). Other limits, such as process size and memory, may need to be set based on the application.
Make sure NICs are set to full duplex and correct speeds
Large pages need to be enabled to take advantage of the -Xlp JDK parameter
If enabled by default, check the RAS settings on the OS and tune them down
Configure TCP/IP timeouts correctly for your application’s needs
Depending on the load being placed on the system, look into advanced tuning techniques such as pinning WAS processes via RSET or TASKSET as well as pinning IRQs
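Whether the open file handle limit is close to being reached can be checked from inside the JVM as well as with ulimit. The sketch below assumes a HotSpot/OpenJDK-style JVM on a Unix-like OS that exposes `com.sun.management.UnixOperatingSystemMXBean`; other JVMs or platforms may not provide it, in which case the method reports -1. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Sketch: report how much of the per-process open file limit is in use.
final class FileHandleCheck {

    // Fraction of the open file limit currently in use (0.0 to 1.0),
    // or -1.0 when this JVM does not expose the information.
    static double openFileUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            long max = unix.getMaxFileDescriptorCount();
            if (max > 0) {
                return (double) unix.getOpenFileDescriptorCount() / max;
            }
        }
        return -1.0;
    }

    public static void main(String[] args) {
        System.out.println(openFileUsage());
    }
}
```

A value that climbs toward 1.0 under load suggests the limit set with ulimit -n is too low for the application.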

WAS Throughput with processor pinning


Thursday, April 10, 2014

WebSphere Application Server Performance Tuning Recommendations