Introduction
An IBM Integration Bus node will use a varying amount of virtual and
real memory. Observed sizes for the real memory required by an
Integration Server vary from a few hundred megabytes to many
gigabytes. Exactly how much is used depends on a number of
factors. Key items are
- Integration flow coding – includes complexity and coding style
- Messages being processed – size, type and structure of the messages
- DFDL schema /Message sets deployed to the Integration Server
- Level of use of Java within the Integration flows and setting of the JVM heap
- Number of Integration flows that are deployed to the Integration Server
- Number of Integration Servers configured for the Integration Node.
This article contains some practical recommendations to follow in order
to reduce memory usage, often with dramatic results. They are split
into two sections: Integration Flow Development Recommendations and
Configuration Recommendations. The
article is written with IBM Integration Bus as the focus. However all of
the techniques and comments apply equally to WebSphere Message Broker.
Integration Flow Development Recommendations
Introduction
At the core of all processing within an Integration flow is the message
tree. A message tree is a structure that is created, either by one or
more parsers when an input message bit stream is received by an
Integration flow, or by the action of an Integration flow node. A new
message tree is created for every Integration flow invocation. It is
cleared down at Integration flow termination.
The tree representation of a message is typically bigger than the input
message, which is received as a bitstream into the Integration flow.
The raw input message is of very limited value while it is in bitstream
format, so parsing it into an in-memory structure is an essential step
that makes subsequent processing easier to specify, whether that is
done using a programming language such as ESQL, Java or .NET, or a
mapping tool such as the Graphical Data Mapper.
The shape and contents of a message tree will change over the course of
execution of an Integration flow as the logic within the Integration
flows executes.
The size of a message tree can vary hugely; it is driven by both the
size of the messages being processed and the logic that is coded within
the Integration flow. Both factors therefore need to be considered:
how messages are parsed, and how they are then processed by the
Integration flow logic.
Message Processing Considerations
When a message is small, such as a few kilobytes in size, the
overhead of fully parsing it is not that great. However, when a
message is tens of kilobytes or larger, the cost of fully parsing it
becomes more significant. When the size grows to megabytes it becomes
even more important to keep memory usage to a minimum. There are a
number of techniques that can be used to keep memory usage down and
these are described below.
Parsing of a message will always commence from the start of the message
and proceed as far along the message as is required in order to access
the element that is being referred to through the processing
logic (ESQL, Java, XPath or a Graphical Data Mapper map, etc.). Depending
on the field being accessed, it may not be necessary to parse the
whole message. Only the portion of the message that has been parsed
will be populated in the message tree. The rest will be held as an
unprocessed bitstream. It may be parsed subsequently in the flow if
there is logic that requires it, or it may never be parsed if it is not
required as part of the Integration flow processing.
If you know the Integration flow needs to process the whole of the
message then it is most efficient to parse the whole
message on first reference to a field within the message. To ensure a
full parse, specify Parse Immediate or Parse Complete on the input node. Note that the default option is Parse on Demand.
Always use a compact parser where possible. XMLNSC for XML and DFDL for
non-XML data are both compact parsers. The XML and XMLNS parsers are
not compact parsers. The MRM is also a compact parser but this has now
been superseded by DFDL. The benefit of compact parsers is that they
discard white space and comments in a message and so those portions of
the input message are not populated in the message tree, so keeping
memory usage down.
For XML messages you can also consider using the opaque parsing
technique. This technique allows named subtrees of the incoming message
to be minimally processed. What this means is that they will be checked
for XML completeness but the subtree will not be fully expanded and
populated into the message tree. Only the bit stream for the subtree
will appear in the message tree. This technique reduces memory usage.
However when using this technique you should not refer to any of the
contents of the subtree that has been opaquely parsed.
If you are designing messages to be used with applications that route
or process only a portion of the message then place the data that those
applications require at the front of the message. This means less of the
message needs to be parsed and populated into the message tree.
When the size of the messages is large, that is hundreds of kilobytes
upwards, use large message processing techniques where possible. There
are a couple of techniques that can be used to minimize memory usage,
but their success will depend on the message structure.
Messages which have repeating structures and where the whole message
does not need to be processed together lend themselves very well to the
use of these techniques. Messages which are megabytes or gigabytes in
size and which need to be treated as a single XML structure for example
are problematic as the whole structure needs to be populated in the
message tree and there is typically much less capacity to optimize
processing.
The techniques are
- Use the DELETE statement in ESQL to delete already processed portions of the incoming message
- Use Parsed Record Sequence for record detection with stream based processing such as with files (FileInput, FTEInput, CDInput) and TCPIP processing (TCPIPClientInput, TCPIPServerInput) where the records are not fixed length and there is no simple record delimiter.
A summary of these techniques is provided here to give you an idea of
what they consist of but for the full details you should consult the
links to the IBM Integration Bus Knowledge Centre that are given below.
DELETE Statement
Use of this technique requires that the message contains a repeating
structure where each repetition or record can be processed individually
or as a small group. This allows the broker to perform limited parsing
of the incoming message at any point in time. In time the whole message
will be processed but not all at once and that is the key to the success
of the technique.
The technique is based on the use of partial parsing and the ability to
parse specified parts of the message tree from the corresponding part
of the bit stream.
The key steps in the processing are:
- A modifiable copy of the input message is made but not parsed (Note InputRoot is not modifiable). As the copy of the input message is not parsed it takes less memory then it would if it were parsed and populated into the message tree.
- A loop and reference variable are then used to process the message one record at a time.
- For each record the contents are processed and a corresponding output tree is produced in a second special folder.
- The ASBITSTREAM function is used to generate a bit stream for the output subtree. This is held in a BitStream element in a position that corresponds to its position in the final message.
- The DELETE statement is used to delete both the current input and output record message trees when the processing for them has been completed.
- When all of the records in the message have been processed the special holders used to process the input and output streams are detached so that they do not appear in the final message.
If needed then information can be retained from one record to another through the use of variables to save state or values.
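The steps above can be sketched in ESQL as follows. This is a simplified variant of the documented technique: the message structure (a Batch folder of repeating Record elements) and the field names Id, Qty and Price are hypothetical, the work folder is held in Environment rather than detached from the output tree, and header copying is assumed to come from the standard generated CopyMessageHeaders procedure. Consult the Knowledge Center link below for the full worked example.

```esql
CREATE COMPUTE MODULE SliceLargeMessage
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    CALL CopyMessageHeaders();

    -- Modifiable, initially unparsed copy of the input message
    -- (InputRoot itself cannot be modified or pruned).
    SET Environment.SourceTree = InputRoot;

    -- Folders for the final result and for building each record.
    CREATE FIELD OutputRoot.XMLNSC.Result;
    CREATE LASTCHILD OF Environment DOMAIN('XMLNSC') NAME 'WorkTree';

    DECLARE inRef REFERENCE TO Environment.SourceTree.XMLNSC.Batch.Record[1];
    WHILE LASTMOVE(inRef) DO
      -- Process one record into the work folder.
      SET Environment.WorkTree.Record.Id    = inRef.Id;
      SET Environment.WorkTree.Record.Total = inRef.Qty * inRef.Price;

      -- Serialize the finished record; only its bit stream is kept
      -- in the final tree, in a field of type XMLNSC.BitStream.
      CREATE LASTCHILD OF OutputRoot.XMLNSC.Result
        TYPE XMLNSC.BitStream NAME 'Record'
        VALUE ASBITSTREAM(Environment.WorkTree.Record
                          ENCODING InputRoot.Properties.Encoding
                          CCSID InputRoot.Properties.CodedCharSetId);

      -- Free the storage for both the processed input record
      -- and the work record before moving on.
      DECLARE doneRef REFERENCE TO inRef;
      MOVE inRef NEXTSIBLING REPEAT TYPE NAME;
      DELETE FIELD doneRef;
      DELETE FIELD Environment.WorkTree.Record;
    END WHILE;

    RETURN TRUE;
  END;
END MODULE;
```

At any point in the loop only one input record and one output record are fully expanded in memory, which is what keeps the working set small.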
For more detail including a worked example see http://www.ibm.com/support/knowledgecenter/api/content/SSMKHH_9.0.0/com.ibm.etools.mft.doc/ac67176_.htm
Parsed Record Sequence
This technique uses the parser to split incoming data from non-message-based
input sources, such as the File and TCPIP nodes, into messages or
records that can be processed individually. These messages are smaller
in size than the whole file or record, which again allows overall memory
usage to be reduced, substantially in some cases. It allows very large
files, gigabytes in size, to be processed without requiring
gigabytes of memory.
The technique requires the parser to be one of the XMLNSC, DFDL or
MRM(CWF or TDS) parsers. It will not work with the other parsers.
In order to use this technique the Record Detection property on the input node needs to be set to Parsed Record Sequence.
With this technique the input node will use the parser to determine the
end of a logical record, for situations where the end of record cannot
be determined simply by a fixed length or a simple delimiter.
When a logical record has been detected by the parser then it will be
propagated through the Integration flow for processing in the usual way.
For more information on this technique see http://www.ibm.com/support/knowledgecenter/api/content/SSMKHH_9.0.0/com.ibm.etools.mft.doc/ac55425_.htm
Coding Recommendations
This section contains some specific ESQL and node usage coding
recommendations that will help to reduce memory usage during processing.
Message Tree Copying
· Minimize the number of times that a tree copy is done. This is the same consideration for any of the transformation nodes.
In ESQL this is usually coded as
SET OutputRoot = InputRoot;
In Java it would be
MbMessage inMessage = inAssembly.getMessage();
MbMessage outMessage = new MbMessage(inMessage); // copy the input message tree
MbMessageAssembly outAssembly = new MbMessageAssembly(inAssembly, outMessage);
In an Integration flow, combine adjacent ESQL Compute nodes where
possible. Note that two ESQL Compute nodes cannot be combined into a
single node when an external call sits between them, for example a call
to SAP with a SAPRequest node.
Watch for this same issue of multiple adjacent Compute nodes when using
subflows. An inefficient subflow can cause many problems within a flow
and across flows. If an inefficient subflow is used repeatedly in an
application, the inefficiency is replicated many times within the
same Integration flow or group of Integration flows.
It is not so easy to combine an adjacent ESQL Compute and a Java
Compute node. Often the Java Compute node will contain code that cannot
be implemented directly in ESQL. It may be possible to do it the other
way and implement the ESQL code as Java in the Java Compute node. An
alternative approach would be to invoke the Java methods through a
Procedure in ESQL if you are looking to combine into a Compute node.
Consider using the Environment correlation name to hold data rather
than using InputRoot and OutputRoot or LocalEnvironment in each
Compute/Java Compute node.
This way a single copy of the data can be held. Be aware, however,
that there are recovery implications. With the traditional approach of
SET OutputRoot = InputRoot;
to copy message data in the course of an Integration flow, the message
tree is copied at each point at which this statement is coded. Should
there be a failure condition and processing is rolled back within the
Integration flow, so is the message tree. When data is held in
Environment there is a single copy of the data, which will not be backed
out in the event of a failure condition. The application code in the
Integration flow needs to be aware of this difference in behaviour and
allow for it.
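As a sketch, the pattern looks like this (the Order structure and the Status field are hypothetical names for illustration):

```esql
-- First Compute node: parse once and keep the single copy in Environment.
SET Environment.Variables.Order = InputRoot.XMLNSC.Order;

-- Downstream Compute nodes: read and update the same single copy in
-- place; no tree copy is made at each node.
SET Environment.Variables.Order.Status = 'VALIDATED';

-- Final Compute node: build the output message from Environment.
SET OutputRoot.XMLNSC.Order = Environment.Variables.Order;
```

Remember that, unlike the message tree, the Environment copy is not restored if the flow rolls back, so any compensation logic must account for partially updated data.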
ResetContentDescriptor Node
· Where there is a sequence of Compute node -> ResetContentDescriptor
node -> Compute node, this can be replaced by one Compute node that
uses the ESQL ASBITSTREAM function and CREATE with PARSE in place of
the RCD node.
· A common node usage pattern is a combination of Filter nodes and ResetContentDescriptor
nodes to determine which owning parser to assign. This can be optimised
into a single Compute node using IF statements and CREATE with PARSE
statements.
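A sketch of the second pattern is shown below. It assumes, purely for illustration, that the input arrives unparsed in the BLOB domain and that checking the first byte for '<' is an adequate test for XML content; a real flow would use whatever discriminator the Filter node used.

```esql
CREATE FUNCTION Main() RETURNS BOOLEAN
BEGIN
  CALL CopyMessageHeaders();
  DECLARE body BLOB InputRoot.BLOB.BLOB;

  -- Decide the owning parser in one Compute node instead of a
  -- Filter node -> ResetContentDescriptor node combination.
  IF SUBSTRING(body FROM 1 FOR 1) = X'3C' THEN  -- '<', looks like XML
    CREATE LASTCHILD OF OutputRoot DOMAIN('XMLNSC')
      PARSE(body ENCODING InputRoot.Properties.Encoding
                 CCSID InputRoot.Properties.CodedCharSetId);
  ELSE
    -- Not XML: leave the data unparsed in the BLOB domain.
    SET OutputRoot.BLOB.BLOB = body;
  END IF;
  RETURN TRUE;
END;
```

Because the parse happens only on the branch that needs it, no intermediate tree is built and discarded as it would be across a Filter/RCD node pair.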
Trace nodes
· Be aware of the use of Trace nodes in non-development environments.
References to ${Root} will cause a full parse of the message if
the Trace node is active. Ideally any such processing is not on the
critical path of the Integration flow and is only executed during
exception processing.
Node Level Loops
· Although loops for processing at the node level are permitted within
the IBM Integration Bus programming model, be careful about the
situations in which they are used or there can be a rapid growth in
memory usage.
For example consider a schematic of an Integration flow which is
intended to read all of the files in a directory and process them.
As each file is read and processed and processing completes then the End of File Terminal is driven and the next file is read to be processed.
This is indeed what is needed but the implementation is such that all
of the state associated with the node (message tree, parsers, variables
etc.) is placed on the stack and heap as the node is repeatedly
executed. In one particular case where several hundred files were being
read in the same Integration flow execution the Integration Server rose
in size to be around 20 GB of memory.
Using this alternate design the required memory size was substantially
less, at around 2 GB compared with the previous 20 GB.
This technique uses the Environment to pass state from the File Read
node (that is, whether the file has been completely read) back to the
Compute node Setting_File_Read_Properties, so that another PROPAGATE to the File Read node can take place for the next file.
Optimise ESQL Code
· Write efficient ESQL which reduces the number of statements required. This
will increase the efficiency of the Integration flow by reducing the
number of ESQL objects which are parsed and created in the
DataFlowEngine. A simple example of this is to initialize variables on a
DECLARE instead of following the DECLARE with a SET. Use the ROW and
LIST constructors to create lists of fields instead of one at a time.
Use the SELECT function to perform message transformations instead of
mapping individual fields.
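For example, the following fragments contrast the recommended forms; the Order, Item, Tag and Customer fields are hypothetical names used only to illustrate the constructors:

```esql
-- Initialize on the DECLARE rather than a DECLARE followed by a SET.
DECLARE total INTEGER 0;

-- Build a list of like-named fields in one statement with LIST.
SET OutputRoot.XMLNSC.Order.Tag[] = LIST{'urgent', 'export', 'insured'};

-- Build a group of named fields in one statement with ROW.
SET OutputRoot.XMLNSC.Order.Customer =
  ROW('Jane' AS FirstName, 'Smith' AS LastName);

-- Transform a repeating input structure with a single SELECT rather
-- than mapping each Item field individually in a loop.
SET OutputRoot.XMLNSC.Order.Item[] =
  (SELECT I.Id, I.Qty * I.Price AS Cost
   FROM InputRoot.XMLNSC.Order.Item[] AS I);
```

Each of these replaces several individual SET statements with one, reducing the number of ESQL objects the DataFlowEngine has to create and navigate.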
· When using the SQL SELECT statement, use WHERE clauses efficiently to
help minimize the amount of data retrieved from the database operation.
It is better to process a small result set than to have to filter and
reduce a large one within the Integration flow.
· Use the PROPAGATE statement where possible. For example, if multiple
output messages are being written from a single input message, consider
using ESQL PROPAGATE to drive the loop. With each iteration of the
propagation, the storage for the output tree is reclaimed and reused,
reducing the storage usage of the Integration flow.
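A sketch of this pattern, emitting one output message per repeating input record, is shown below. The Batch/Record structure and the Notification output are hypothetical, and CopyMessageHeaders is assumed to be the standard generated procedure in the Compute module.

```esql
CREATE FUNCTION Main() RETURNS BOOLEAN
BEGIN
  DECLARE recRef REFERENCE TO InputRoot.XMLNSC.Batch.Record[1];
  WHILE LASTMOVE(recRef) DO
    CALL CopyMessageHeaders();
    SET OutputRoot.XMLNSC.Notification.Id = recRef.Id;

    -- Emit one message now; the output tree storage is reclaimed and
    -- reused on the next iteration. DELETE NONE preserves the input
    -- message tree for the remaining iterations.
    PROPAGATE TO TERMINAL 'out' DELETE NONE;

    MOVE recRef NEXTSIBLING REPEAT TYPE NAME;
  END WHILE;
  RETURN FALSE;  -- everything already propagated; no final propagate
END;
```

Returning FALSE at the end is important: returning TRUE would propagate the (now empty) output tree one more time.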
Configuration Recommendations
Introduction
Whilst the major influence on the amount of memory used in processing
data is the combination of the messages being processed and the way in
which the flow is coded, the configuration of the broker runtime can
also have a significant effect and should not be ignored. This section
discusses the key considerations.
The Integration flow logic, and indeed the product code within an
Integration Server, all use virtual memory. The message tree, variables,
and input and output messages are all held at locations within virtual
memory as they are processed. Virtual memory can be held in central, or
real, memory. It can also be held in auxiliary memory, or swap space, if
it is paged or swapped out, although it cannot be used in processing
while it is there. Where a page of virtual memory sits exactly is
determined by the operating system and not by IBM Integration Bus.
Note that Integration Servers which are configured differently and
which have different Integration flows deployed to them will use
different amounts of virtual and real memory. There is not one amount of
virtual and real memory for every Integration Server.
For processing to take place a subset of the virtual memory allocated
to an Integration Server will need to be in real memory so that data can
be processed (read and/or updated) with ESQL, Java or Graphical Data
Mapping nodes etc. This set of essential memory is what forms the real
memory requirement, or working set, of the Integration Server. The
contents of the working set will change over time.
For processing to continue smoothly, all of the working set needs to be
in real memory. If required pages have to be moved into real memory
first, this causes delays in processing. In a heavily loaded system the
operating system has to manage the movement of pages very quickly. When
the demand for real memory is much higher than what is actually
available, processing for one or more components can be slowed down,
and in extreme cases systems can end up thrashing and performing no
useful work. It is therefore important to ensure that there is
sufficient real memory available to allow processing to run smoothly.
The first part of this is to make sure that all processing is optimized,
which is what the Integration flow coding recommendations are intended
to help with.
The real memory requirements of all of the Integration Servers that are
running, plus that required by the bipservice, bipbroker and
biphttplistener processes, form the total real memory requirement of the
Integration Node. As Integration flows and/or Integration Servers are
started and stopped this real memory requirement will change. For a
given configuration and fixed workload the overall real memory
requirement should remain within a narrow band of usage with only slight
fluctuations. If the real memory requirement continues to grow there is
most likely a memory leak, either in the Integration flow logic or, in
some rare cases, in the IBM Integration Bus product itself.
Now that IBM Integration Bus uses 64-bit addressing there are no virtual
memory constraints on the deployment of Integration flows to
Integration Servers in the way that there were when 32-bit addressing
was the only option. This means that Integration flows can be freely
deployed to Integration Servers as required.
The memory requirement for an Integration node will be entirely
dependent on the processing that needs to be performed. If all
Integration Servers were stopped the real memory requirement would be
very small. If 50 Integration Servers were active, running inefficient
Integration flows to process large messages or large files, then the
real memory requirement could be tens of gigabytes. Again, there is no
fixed memory requirement for an Integration node. It will very much
depend on which Integration flows are running and the configuration of
the Integration node.
Broker Configuration
The following configuration factors affect the memory usage of an integration server.
- The deployment of Integration flows to an Integration Server
- The size of the JVM for each Integration Server
- The number of Integration Servers
We will now look at each of these in more detail.
Deployment of Message Flows to Integration Servers
Each Integration flow has a virtual memory requirement that is dependent on the routing/transformation logic that is executed and the messages to be processed.
Using additional instances for a flow will result in additional memory
usage, but not by as much as deploying the same flow again to another
Integration Server.
The more, different, Integration flows that are deployed to an
Integration Server then the more virtual and real memory will be used by
that Integration Server.
There will be an initial requirement of memory for deployment and a
subsequent higher requirement once messages have been processed.
Different Integration flows will use different amounts of additional
memory. It is not possible to accurately predict the amount of virtual
and real memory that will be needed for any Integration flow or message
to be processed. So for planning purposes it is best to run the
Integration flow and measure the requirement once it has processed
messages. After some minutes of processing, memory usage should
stabilise, providing the mix of messages is not constantly changing,
such as the message size continually increasing.
If multiple flows have been deployed to the Integration Server then
ensure they have all processed messages before observing the peak
memory usage of the Integration Server.
When looking at the memory usage of an Integration Server then focus on
the real memory usage. That is the RSS value on Unix systems or Working
Set in Windows. This is the key to understanding how much memory is
being used by a process at a point in time. There will be other pages
that may be sitting in swap space that were used at some point in the
processing possibly at start-up or when a different part of the message
flow was executed for example. But due to the demand for memory they may
well no longer be in real memory.
To understand how much memory processes on the system are using, use the following:
- AIX: The command ps -e -o vsz=,rss=,comm= will display virtual and real memory usage along with the command for each process
- Linux: The command ps -e -o vsz,rss,cmd will display virtual and real memory usage along with the command for each process
- Windows: Run Task Manager then select View -> Select Columns -> Memory (Private Working Set)
Integration Server JVM Settings
In IBM Integration Bus V9 the default JVM settings are a minimum heap
of 32 MB and a maximum of 256 MB. For most situations these settings
are sufficient. The same defaults apply in WebSphere Message Broker V8.
The amount of Java heap required is dependent on the Integration flows
and in particular the amount of Java. This includes nodes which are
written in Java such as the FileInput, FileOutput, SAPInput, SAPRequest
nodes etc. Given this then different Integration Servers may well
require different Java heap settings so do not expect to always use the
same values for every Integration Server.
A larger Java heap may be needed if there is a heavy requirement from
the Integration flow logic or the nodes used within it. The best way
to determine whether there is sufficient Java heap is to look at the
Resource Statistics for the Integration Server and observe the level of
Garbage Collection.
For batch processing low GC overhead is the goal. Low would be of the order of 1%.
For real-time processing, low pause times are the goal. Low in this context is less than 1 second.
As a guide, a GC overhead of 5-10% can usually be tuned down, as can pause times of 1-2 seconds.
The Integration Server JVM heap settings can be changed with the mqsichangeproperties command. For example the command:
mqsichangeproperties PERFBRKR -o ComIbmJVMManager -e IN_OUT -n jvmMaxHeapSize -v 536870912
will increase the maximum JVM heap to a value of 536870912 bytes (512
MB) for the Integration Server IN_OUT in the Integration node PERFBRKR.
Numbers of Integration Servers
A variable number of Integration Servers are supported for an
Integration node. No maximum value is specified by the product.
Practical limits, particularly the amount of real memory or swap space
will limit the number than can be used on a particular machine.
Typically large systems might have 10’s of Integration Servers. A
broker with 100 Integration Servers would be very large and at point we
would certainly recommend reviewing the policy used to create
Integration Servers and assign Integration flows to them.
The number of Integration Servers in use does not impact the amount of
virtual memory or real memory used by an individual Integration Server
but it does have a very direct effect on the amount of real memory that
is required by the Integration node as a whole and so for this reason it
is good to think about how many Integration Servers are really
required.
An Integration Server is able to host many Integration flows. One
Integration Server could host hundreds of flows if needed. Think
carefully before doing this though: if the Integration Server fails, a
high number of applications will be lost for a time, and the restart
time for the Integration Server will be significantly longer than for
one with only a few Integration flows deployed. If the Integration
Server contains Integration flows that are seldom used then this is
much less of an issue.
It is a good idea to have a policy controlling the assignment of
Integration flows to Integration Servers. There are a number of schemes
commonly in use. Some examples are
- Only have Integration flows for one business application assigned to an individual Integration Server. [Large business applications may require multiple Integration Servers dependent on the number of messages flows for the business area].
- Assigning flows that have the same hours of availability together. It is no good assigning flows for the on-line day and for the batch at night into the same Integration Server as it will not be possible to shutdown the Integration Server without interrupting the service of one set of flows.
- Assign flows for high profile or high priority applications/services to their own Integration Servers, and have general utility Integration Servers which host many less important flows.
- Assign Integration flows across Integration Servers so that processing volume is spread evenly.
In a multi-node deployment avoid the tendency to deploy all integration
flows to all nodes. Have some nodes provide processing for certain
applications only. Otherwise with everything deployed everywhere real
memory usage will be large.
One additional point. Assign Integration flows that are known to be
heavy on memory usage to the minimum number of Integration Servers. This
will reduce the number of Integration Servers that become large in size
and require large amounts of real memory.
Whichever policy you arrive at, think ahead and ensure it will still
work if there are hundreds more services added.
Sometimes people assign one flow or service to one Integration Server
when there are only a few services in existence. This is something that
you can live with in the early days. But by the time there are another
400 services it simply does not scale and the demand for real memory
usage becomes an issue.