Introduction
The event manager is the central component in IBM Business Process
Manager that is responsible for scheduling a number of different tasks.
If it's not working correctly, you might run into severe problems that
need to be resolved quickly.
This blog entry describes some of the most common symptoms and shows how to resolve them.
In Part I, some common event manager problems will be shown.
Part II explains how to analyze and fix these problems.
Part III lists the available APARs that are related to the event manager.
Special thanks to Mark Filley and Bill Wentworth for their technical input and in-depth review!
Part I - Common event manager symptoms
Symptom A - Event manager is not processing any work
The event manager is responsible for scheduling various jobs, such as:
- Executing undercover agents (UCAs)
- Executing system lane tasks
- Triggering business process definition (BPD) timers
- Scheduling BPD notifications, which are essential to move the process flow forward through the business process diagram (BPMN)
If your process instances are stuck, timers are not firing, and UCAs are no
longer executed, then the event manager is either not running or
blocked.
The Process Admin Console gives you a comprehensive view that shows the status of the event manager. When the status light is red, the event manager is paused or did not start, and its jobs accumulate with a Scheduled Time well behind the current time. In the Process Admin Console, these jobs show a 'Job Status' of 'Scheduled' and no jobs are in the 'Executing' state.
For example:
The following screen shot, which was taken from the Process Admin
Console, shows a last event manager heartbeat expiration time stamp
of "12/10/2014 2:00:15 PM". This time stamp is normally ahead of the
current time. The Scheduled Time of the event manager jobs (UCAs and BPD notifications)
shows an earlier time stamp, and no job is currently executing. In
this example, the event manager is shown as inactive (red light), which
explains the situation.
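The pattern described in this example can be sketched as a small check. This is only an illustration, not a BPM API: the `Job` record, the `looks_stalled` function, and the sample "current time" of 2:30 PM are assumptions made for this sketch. Only the heartbeat time stamp comes from the example above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Job:
    # Hypothetical record mirroring the console columns; not a BPM type.
    name: str
    status: str               # for example "Scheduled" or "Executing"
    scheduled_time: datetime


def looks_stalled(jobs, heartbeat_expiration, now):
    """Return True when the heartbeat has expired, nothing is executing,
    and at least one scheduled job is already overdue."""
    heartbeat_expired = heartbeat_expiration < now
    nothing_executing = all(job.status != "Executing" for job in jobs)
    overdue = [
        job for job in jobs
        if job.status == "Scheduled" and job.scheduled_time < now
    ]
    return heartbeat_expired and nothing_executing and bool(overdue)


# Heartbeat value from the example; "now" and the job times are assumed.
heartbeat = datetime(2014, 12, 10, 14, 0, 15)
now = datetime(2014, 12, 10, 14, 30, 0)
jobs = [
    Job("UCA", "Scheduled", datetime(2014, 12, 10, 13, 55, 0)),
    Job("BPD notification", "Scheduled", datetime(2014, 12, 10, 13, 58, 0)),
]
print(looks_stalled(jobs, heartbeat, now))  # True
```

A healthy event manager fails the first condition, because its heartbeat expiration stays ahead of the current time.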
Note: Even if the event manager
is not running, it is possible to start new process instances, but they
will not move forward. Because services are not scheduled by the event
manager, they can still be executed as well.
Symptom B - Event manager shows jobs with a scheduled date of 2099
The Process Admin Console can show event manager jobs scheduled for 2099 as shown here:
Symptom C - Event manager is active, but long running system lane tasks block the event manager throughput
There can be situations where the event manager is actively working,
but you experience throughput problems. For example, the flow in the
process instances is not moving forward or the execution of timers is
delayed.
The following screen shot shows five system lane activities being
executed while a couple of BPD notifications are waiting to be executed.
These BPD notifications are overdue because their 'Scheduled Time' is earlier
than the current time. This situation can indicate that the event
manager configuration needs to be tuned, that the execution time of the
system lane tasks needs to be optimized (if possible), or both.
Symptom D - UCAs are not processing at the desired rate
Depending on their definition in the process application, UCAs are bound
either to one of a couple of synchronous queues or to a single asynchronous queue managed
by the event manager. The capacity of these queues is defined by the
following parameters in the
80EventManager.xml configuration file:
<sync-queue-capacity> or
<async-queue-capacity>
Symptom E - Many BPD timers wake up at the same time
When the event manager processes a timer, it loads the applicable task
into the "BPD async queue," whose capacity is defined by the <bpd-queue-capacity> setting from the 80EventManager.xml
configuration file. If the application design has hundreds or thousands
of timers that start at the exact same time, then this setting might
need to be increased beyond the default of forty (40).
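To get a rough feeling for the effect of this capacity, the following back-of-the-envelope sketch estimates how many full queue loads are needed when many timers become due at once. It is a deliberate simplification: it assumes the queue is filled and drained in complete batches, which is not necessarily how the event manager loads work, and the function name `waves_needed` is made up for this illustration.

```python
import math


def waves_needed(timer_count, queue_capacity=40):
    """Estimate how many full queue loads are needed for timer_count timers.

    Assumption for this sketch: the queue is filled to capacity and
    drained completely before the next batch is loaded.
    """
    if queue_capacity <= 0:
        raise ValueError("queue_capacity must be positive")
    return math.ceil(timer_count / queue_capacity)


print(waves_needed(1000))       # 25 loads with the default capacity of 40
print(waves_needed(1000, 100))  # 10 loads with a capacity of 100
```

Under these assumptions, raising the capacity from 40 to 100 cuts the number of loads for 1000 simultaneous timers from 25 to 10. Whether that leads to a real throughput gain also depends on the available threads and database connections.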
Symptom F - Event Manager warning messages CWLLG2156W, CWLLG2236W occur
If the BPM run time detects that the database connection pool is too
small, it will dynamically reduce the queue sizes and you will see
entries in SystemOut.log like the following messages:
"CWLLG2156W: The database connection pool size xxx of the Process Server data source might be too small." and/or
"CWLLG2236W: The configured <%%%%%%-queue-capacity> parameter of xxx has been changed to yyy."
These messages indicate that there is a mismatch between the event manager queue capacity and the JDBC data source pool size.
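A quick way to find out whether an environment is affected is to scan SystemOut.log for the two message IDs. The following sketch counts their occurrences in a list of log lines. The function name `count_pool_warnings` and the sample lines are assumptions for illustration, and the values in the sample lines (20, 40, 10) are made up.

```python
from collections import Counter

# Message IDs taken from the warnings quoted above.
WARNING_IDS = ("CWLLG2156W", "CWLLG2236W")


def count_pool_warnings(lines):
    """Count how many log lines mention each of the two warning IDs."""
    counts = Counter()
    for line in lines:
        for message_id in WARNING_IDS:
            if message_id in line:
                counts[message_id] += 1
    return counts


# Sample lines with invented values; real entries carry more fields.
sample = [
    "CWLLG2156W: The database connection pool size 20 of the Process "
    "Server data source might be too small.",
    "CWLLG2236W: The configured parameter of 40 has been changed to 10.",
    "Some unrelated log entry",
]
counts = count_pool_warnings(sample)
print(counts["CWLLG2156W"], counts["CWLLG2236W"])  # 1 1
```

For a real log file, the same function can be fed with the open file object, for example `count_pool_warnings(open(path, encoding="utf-8", errors="replace"))`, where `path` points to your SystemOut.log.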
Symptom G - Event manager tasks fail when LombardiEventEmitterInputQueue reached max threshold
When you have your IBM Business Process Manager environment configured
to forward monitoring events to a Business Monitor server, the execution
of event manager tasks involves sending a message to the local queue
called "LombardiEventEmitterInputQueue." This queue maps to the JNDI name jms/com.ibm.lombardi/EventEmissionQueue.
If the queue depth of the LombardiEventEmitterInputQueue reaches
the configured maximum threshold, no more messages can be put on this
queue, and the execution of event manager tasks ends in an
exception like the following text:
J2CA0027E: An exception occurred while invoking prepare on an XA Resource Adapter from DataSource jms/com.ibm.lombardi/EventEmissionQueueFactory, within transaction ID {XidImpl: formatId(57415344), gtrid_length(36), bqual_length(54),
data(0000014ac680b3dd000000010c3c5a4c30653f6b06f16c1e5782cea7f4fce4b60a8f48d30000014ac680b3dd000000010c3c5a4c30653f6b06f16c1e5782cea7f4fce4b60a8f48d3000000010000000000000000000000000002)} : javax.transaction.xa.XAException:
CWSIC8007E: An exception was caught from the remote server with Probe
Id 3-013-0010. Exception: CWSIC2029E: This transaction cannot commit as
an operation that was performed within the transaction boundary failed.
The first operation that failed generated the following exception: com.ibm.ws.sib.processor.exceptions.SIMPLimitExceededException: CWSIK0025E: The destination LombardiEventEmitterInputQueue on messaging engine ProcessServerProdDepEnv.SupCluster.000-MONITOR.Cell01.Bus is not available because the high limit for the number of messages for this destination has already been reached...
at com.ibm.ws.sib.comms.common.CommsByteBuffer.parseSingleException(CommsByteBuffer.java:1753)
at com.ibm.ws.sib.comms.common.CommsByteBuffer.getException(CommsByteBuffer.java)
at com.ibm.ws.sib.comms.common.CommsByteBuffer.checkXACommandCompletionStatus(CommsByteBuffer.java:1218)
at com.ibm.ws.sib.comms.client.OptimizedSIXAResourceProxy.prepare(OptimizedSIXAResourceProxy.java:749)
at com.ibm.ws.sib.comms.client.SuspendableXAResource.prepare(SuspendableXAResource.java:386)
at com.ibm.ws.sib.api.jmsra.impl.JmsJcaRecoverableSiXaResource.prep