Introduction
An IBM Integration Bus node uses a varying amount of virtual and real memory. Observed real memory sizes for an Integration Server vary from a few hundred megabytes to many gigabytes. Exactly how much is used depends on a number of factors. Key items are:
- Integration flow coding – includes complexity and coding style
- Messages being processed – size, type and structure of the messages
- DFDL schema /Message sets deployed to the Integration Server
- Level of use of Java within the Integration flows and setting of the JVM heap
- Number of Integration flows that are deployed to the Integration Server
- Number of Integration Servers configured for the Integration Node.
This article contains some practical recommendations to follow in order to reduce memory usage, often with dramatic results. They are split into two sections: Integration Flow Development Recommendations and Configuration Recommendations. The article is written with IBM Integration Bus as the focus; however, all of the techniques and comments apply equally to WebSphere Message Broker.
Integration Flow Development Recommendations
Introduction
 At the core of all processing within an Integration flow is the message
 tree. A message tree is a structure that is created, either by one or 
more parsers when an input message bit stream is received by an 
Integration flow, or by the action of an Integration flow node.  A new 
message tree is created for every Integration flow invocation. It is 
cleared down at Integration flow termination.
The tree representation of a message is typically bigger than the input message, which is received as a bit stream by the Integration flow. The raw input message is of very limited value while it is in bit stream format, so parsing it into an in-memory structure is an essential step that makes subsequent processing easier to specify, whether that is done with a programming language such as ESQL, Java or .NET, or with a mapping tool such as the Graphical Data Mapper.
The shape and contents of a message tree will change over the course of execution of an Integration flow as the logic within it executes.
The size of a message tree can vary hugely, and it is driven both by the size of the messages being processed and by the logic coded within the Integration flow. Both factors therefore need to be considered: how messages are parsed, and how they are then processed by the Integration flow logic.
Message Processing Considerations
When a message is small, such as a few kilobytes in size, the overhead of fully parsing it is not that great. However, when a message is tens of kilobytes or larger, the cost of fully parsing it grows. When the size reaches megabytes it becomes even more important to keep memory usage to a minimum. There are a number of techniques that can be used to do this, and they are described below.
Parsing of a message always commences at the start of the message and proceeds as far along it as is required to reach the element that is referred to by the processing logic (ESQL, Java, XPath, a Graphical Data Mapper map, and so on). Depending on the field being accessed, it may not be necessary to parse the whole message. Only the portion of the message that has been parsed is populated in the message tree; the rest is held as an unprocessed bit stream. It may be parsed later in the flow if there is logic that requires it, or it may never be parsed at all if it is not required by the Integration flow processing.
If you know that the Integration flow needs to process the whole of the message, then it is most efficient to parse the whole message on the first reference to a field within it. To ensure a full parse, specify Parse Immediate or Parse Complete on the input node. Note that the default option is Parse on Demand.
Always use a compact parser where possible. XMLNSC for XML and DFDL for non-XML data are both compact parsers; the XML and XMLNS parsers are not. The MRM parser is also compact, but it has now been superseded by DFDL. The benefit of compact parsers is that they discard white space and comments in a message, so those portions of the input message are not populated into the message tree, keeping memory usage down.
For XML messages you can also consider the opaque parsing technique, which allows named subtrees of the incoming message to be minimally processed: they are checked for XML completeness, but the subtree is not fully expanded and populated into the message tree. Only the bit stream for the subtree appears in the message tree, which reduces memory usage. When using this technique, however, you must not refer to any of the contents of a subtree that has been opaquely parsed.
 If you are designing messages to be used with applications that route 
or process only a portion of the message then place the data that those 
applications require at the front of the message. This means less of the
 message needs to be parsed and populated into the message tree.
When the size of the messages is large, that is hundreds of kilobytes upwards, use large message processing techniques where possible. There are a couple of techniques that can be used to minimize memory usage, but their success will depend on the message structure.
 Messages which have repeating structures and where the whole message 
does not need to be processed together lend themselves very well to the 
use of these techniques. Messages which are megabytes or gigabytes in 
size and which need to be treated as a single XML structure for example 
are problematic as the whole structure needs to be populated in the 
message tree and there is typically much less capacity to optimize 
processing.
 The techniques are
- Use the DELETE statement in ESQL to delete already processed portions of the incoming message
- Use Parsed Record Sequence for record detection with stream based processing such as with files (FileInput, FTEInput, CDInput) and TCPIP processing (TCPIPClientInput, TCPIPServerInput) where the records are not fixed length and there is no simple record delimiter.
 A summary of these techniques is provided here to give you an idea of 
what they consist of but for the full details you should consult the 
links to the IBM Integration Bus Knowledge Centre that are given below.
 DELETE Statement
 Use of this technique requires that the message contains a repeating 
structure where each repetition or record can be processed individually 
or as a small group.  This allows the broker to perform limited parsing 
of the incoming message at any point in time. In time the whole message 
will be processed but not all at once and that is the key to the success
 of the technique.
 The technique is based on the use of partial parsing and the ability to
 parse specified parts of the message tree from the corresponding part 
of the bit stream.
 The key steps in the processing are:
- A modifiable copy of the input message is made but not parsed (note that InputRoot is not modifiable). As the copy is not parsed, it takes less memory than it would if it were parsed and populated into the message tree.
- A loop and a reference variable are then used to process the message one record at a time.
- For each record the contents are processed and a corresponding output tree is produced in a second special folder.
- The ASBITSTREAM function is used to generate a bit stream for the output subtree. This is held in a BitStream element in a position that corresponds to its position in the final message.
- The DELETE statement is used to delete both the current input and output record message trees when the processing for them has been completed.
- When all of the records in the message have been processed, the special folders used to process the input and output streams are detached so that they do not appear in the final message.
 If needed then information can be retained from one record to another through the use of variables to save state or values.
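The steps above can be sketched in ESQL. This is a minimal outline only, not a complete implementation: the module, folder and element names (SliceLargeMessage, Batch, Record, Id) are illustrative assumptions, and CopyMessageHeaders is the procedure normally generated for a Compute node.

```esql
CREATE COMPUTE MODULE SliceLargeMessage
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    CALL CopyMessageHeaders();

    -- Modifiable, still largely unparsed, copy of the input body
    -- (InputRoot itself cannot be modified).
    SET Environment.Variables.InMsg = InputRoot.XMLNSC;

    -- Output body with a folder that collects one bit stream per record.
    CREATE LASTCHILD OF OutputRoot DOMAIN('XMLNSC') NAME 'XMLNSC';
    CREATE LASTCHILD OF OutputRoot.XMLNSC NAME 'Batch';

    -- Walk the repeating records one at a time with a reference variable.
    DECLARE inRec REFERENCE TO Environment.Variables.InMsg.Batch.Record[1];
    WHILE LASTMOVE(inRec) DO
      -- Transform the current record into a scratch output subtree.
      CREATE LASTCHILD OF Environment.Variables NAME 'Record';
      SET Environment.Variables.Record.Id = inRec.Id;

      -- Serialize the finished record and keep only its bit stream.
      DECLARE bits BLOB ASBITSTREAM(Environment.Variables.Record OPTIONS FolderBitStream);
      CREATE LASTCHILD OF OutputRoot.XMLNSC.Batch TYPE XMLNSC.BitStream NAME 'Record' VALUE bits;

      -- DELETE the processed input and output record trees to release memory.
      DELETE FIELD Environment.Variables.Record;
      DECLARE doneRec REFERENCE TO inRec;
      MOVE inRec NEXTSIBLING REPEAT NAME;
      DELETE FIELD doneRec;
    END WHILE;

    RETURN TRUE;
  END;
END MODULE;
```

The key point is that at any moment only one input record and one output record exist as expanded trees; everything already processed has been reduced back to a bit stream or deleted.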
For more detail, including a worked example, see the IBM Integration Bus Knowledge Center.
 Parsed Record Sequence
This technique uses the parser to split incoming data from non-message-based input sources, such as the File and TCPIP nodes, into messages or records that can be processed individually. These records are smaller than the whole file, which again allows overall memory usage to be reduced, substantially in some cases. It allows very large files, gigabytes in size, to be processed without requiring gigabytes of memory.
The technique requires the parser to be one of the XMLNSC, DFDL or MRM (CWF or TDS) parsers; it will not work with the other parsers.
 In order to use this technique the Record Detection property on the input node needs to be set to Parsed Record Sequence.
With this technique the input node uses the parser to determine the end of a logical record, in situations where the end of a record cannot be determined simply by a fixed length or a simple delimiter. When a logical record has been detected by the parser, it is propagated through the Integration flow for processing in the usual way.
For more information on this technique see the IBM Integration Bus Knowledge Center.
Coding Recommendations
 This section contains some specific ESQL and node usage coding 
recommendations that will help to reduce memory usage during processing.
 Message Tree Copying
·      Minimize the number of times that a tree copy is done. The same consideration applies to any of the transformation nodes.
In ESQL this is usually coded as
 SET OutputRoot = InputRoot;
In Java it would be
    MbMessage inMessage = inAssembly.getMessage();
    MbMessage outMessage = new MbMessage(inMessage);  // the copy constructor performs the tree copy
    MbMessageAssembly outAssembly = new MbMessageAssembly(inAssembly, outMessage);
In an Integration flow, combine adjacent ESQL Compute nodes where possible. For example, where two ESQL Compute nodes are separated by a SAPRequest node that makes an external call to SAP, the two Compute nodes cannot be combined into a single node.
Watch for this same issue of multiple adjacent Compute nodes when using subflows. An inefficient subflow can cause many problems within a flow and across flows; if an inefficient subflow is used repeatedly in an application, its inefficiency is replicated many times within the same Integration flow or group of Integration flows.
It is not so easy to combine an adjacent ESQL Compute node and a Java Compute node. Often the Java Compute node will contain code that cannot be implemented directly in ESQL. It may be possible to go the other way and implement the ESQL code as Java in the Java Compute node. An alternative approach, if you are looking to combine into a Compute node, is to invoke the Java methods through a PROCEDURE in ESQL.
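As a sketch of the ESQL PROCEDURE approach, an existing public static Java method can be declared and then called inline from ESQL. The class and method names here are assumptions for illustration:

```esql
-- Declare the Java method; it must be public static and use mappable types.
CREATE PROCEDURE formatAccount(IN accountId CHARACTER) RETURNS CHARACTER
  LANGUAGE JAVA
  EXTERNAL NAME "com.example.AccountUtils.formatAccount";

-- The method can then be called from within the Compute node logic:
SET OutputRoot.XMLNSC.Doc.Account = formatAccount(InputRoot.XMLNSC.Doc.Account);
```

This keeps the Java logic but removes the need for a separate Java Compute node, and so avoids an extra tree copy between nodes.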
Consider using the Environment correlation name to hold data, rather than using InputRoot and OutputRoot or LocalEnvironment in each Compute/Java Compute node.
This way a single copy of the data can be held. Be aware, however, that there are recovery implications: unlike the message tree, the Environment is not restored to its previous state if processing is rolled back. When using the traditional approach of SET OutputRoot = InputRoot; each node that performs the copy holds its own copy of the message tree.
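A minimal sketch of the Environment approach, using assumed element names (Order, Status):

```esql
-- First Compute node: place one copy of the working data in the Environment.
SET Environment.Variables.Order = InputRoot.XMLNSC.Order;

-- Later Compute nodes read and update that same single copy; no further
-- tree copy is made. Remember the Environment is not restored on rollback.
SET Environment.Variables.Order.Status = 'VALIDATED';
```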
ResetContentDescriptor Nodes
·       Where there is a sequence of Compute node -> ResetContentDescriptor node, be aware that the ResetContentDescriptor node causes the message to be re-parsed with the new parser, which creates an additional message tree and so increases memory usage.
·       A common node usage pattern is a combination of Filter nodes and ResetContentDescriptor nodes; keep the number of re-parses in such a pattern to a minimum.
 Trace nodes
·       Be aware of the use of Trace nodes in non-development environments. References to ${Root} will cause a full parse of the message if the Trace node is active. Ideally any such processing is not on the critical path of the Integration flow and is only executed for exception processing.
 Node Level Loops
·       Although loops for processing at the node level are permitted within the IBM Integration Bus programming model, be careful about the situations in which they are used, or there can be a rapid growth in memory usage.
For example, consider an Integration flow that is intended to read all of the files in a directory and process them. As each file is read and its processing completes, the End of File terminal is driven and the next file is read.
This gives the required behaviour, but the implementation is such that all of the state associated with the node (message tree, parsers, variables and so on) is placed on the stack and heap as the node is repeatedly executed. In one particular case where several hundred files were being read in the same Integration flow execution, the Integration Server grew to around 20 GB of memory.
Using an alternative design, the required memory was substantially less: around 2 GB compared with the previous 20 GB. This technique uses the Environment to pass state from the FileRead node (that is, whether the file has been completely read) back to a Compute node earlier in the flow.
 Optimise ESQL Code
·       Write efficient ESQL that reduces the number of statements required. This will increase the efficiency of the Integration flow by reducing the number of ESQL objects which are parsed and created in the DataFlowEngine. A simple example is to initialize variables on a DECLARE instead of following the DECLARE with a SET. Use the ROW and LIST constructors to create lists of fields instead of creating them one at a time. Use a SELECT expression to perform message transformations instead of mapping individual fields.
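A few of these points sketched in ESQL; the variable, folder and field names are assumptions for illustration:

```esql
-- Initialize on the DECLARE instead of a DECLARE followed by a SET.
DECLARE retryCount INTEGER 0;

-- Build several fields in one statement with the ROW constructor.
DECLARE addr ROW;
SET addr = ROW('10 Main Street' AS Street, 'Hursley' AS Town);

-- Transform repeating structures with a single SELECT rather than
-- mapping each field with its own SET statement.
SET OutputRoot.XMLNSC.Invoices.Invoice[] =
    (SELECT O.Id, O.Amount * 1.2 AS Gross
       FROM InputRoot.XMLNSC.Orders.Order[] AS O);
```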
·       When using the ESQL SELECT statement against a database, use WHERE clauses efficiently to minimize the amount of data retrieved by the database operation. It is better to process a small result set than to have to filter and reduce a large one within the Integration flow.
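For example, a database SELECT that filters in the WHERE clause so that only the rows needed are returned to the flow. The table, column and element names are assumptions, and the Database correlation requires a data source to be configured on the node:

```esql
-- Only this customer's rows are retrieved and populated into the tree;
-- the filtering happens in the database, not in the Integration flow.
SET OutputRoot.XMLNSC.Result.Order[] =
    (SELECT D.ORDER_ID, D.AMOUNT
       FROM Database.ORDERS AS D
      WHERE D.CUSTOMER_ID = InputRoot.XMLNSC.Request.CustomerId);
```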
·      Use the PROPAGATE statement where possible. For example, if multiple output messages are being written from a single input message, consider using ESQL PROPAGATE to drive the loop. With each iteration of the propagation, the storage from the output tree is reclaimed and re-used, reducing the storage usage of the Integration flow.
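A sketch of the PROPAGATE pattern, assuming a repeating Record element in the input; CopyMessageHeaders is the procedure normally generated for a Compute node:

```esql
DECLARE rec REFERENCE TO InputRoot.XMLNSC.Batch.Record[1];
WHILE LASTMOVE(rec) DO
  -- OutputRoot is cleared after each PROPAGATE, so rebuild the headers.
  CALL CopyMessageHeaders();
  SET OutputRoot.XMLNSC.Record = rec;
  -- Propagate this record now; the storage for the output tree is
  -- reclaimed and re-used on the next iteration.
  PROPAGATE TO TERMINAL 'out';
  MOVE rec NEXTSIBLING REPEAT NAME;
END WHILE;
-- Everything has already been propagated; suppress the final output.
RETURN FALSE;
```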
Configuration Recommendations
Introduction
Whilst the major influence on the amount of memory used in processing data is the combination of the messages being processed and the way in which the flow is coded, the configuration of the broker runtime can also have a significant effect and should not be ignored. This section discusses the key considerations.
The Integration flow logic, and indeed the product code within an Integration Server, all use virtual memory. The message tree, variables, and input and output messages are all held at locations within virtual memory as they are processed. Virtual memory can be held in central, or real, memory. It can also be held in auxiliary memory, or swap space, if it is paged or swapped out, although it cannot be used in processing while it is there. Where a page of virtual memory sits at any moment is determined by the operating system, not by IBM Integration Bus.
 Note that Integration Servers which are configured differently and 
which have different Integration flows deployed to them will use 
different amounts of virtual and real memory. There is not one amount of
 virtual and real memory for every Integration Server.
 For processing to take place a subset of the virtual memory allocated 
to an Integration Server will need to be in real memory so that data can
 be processed (read and/or updated) with ESQL, Java or Graphical Data 
Mapping nodes etc.  This set of essential memory is what forms the real 
memory requirement, or working set, of the Integration Server. The 
contents of the working set will change over time.
For processing to continue smoothly, all of the working set needs to be in real memory; if required pages have to be moved into real memory, this causes delays in processing. In a heavily loaded system the operating system has to manage the movement of pages very quickly. When the demand for real memory is much higher than what is actually available, processing for one or more components can be slowed down, and in extreme cases systems can end up thrashing and performing no useful work. It is therefore important to ensure that there is sufficient real memory available to allow processing to run smoothly. The first part of this is to make sure that all processing is optimized, which is what the Integration flow coding recommendations are intended to help with.
The real memory requirements of all of the running Integration Servers, plus that required by the bipservice, bipbroker and biphttplistener processes, form the total real memory requirement of the Integration node. As Integration flows and/or Integration Servers are started and stopped, this requirement will change. For a given configuration and fixed workload, the overall real memory requirement should remain within a narrow band with only slight fluctuations. If the real memory requirement continues to grow, there is most likely a memory leak; this could be in the Integration flow logic or, in some rare cases, in the IBM Integration Bus product itself.
Now that IBM Integration Bus uses 64-bit addressing, there are no virtual memory constraints on the deployment of Integration flows to Integration Servers in the way there were when 32-bit addressing was the only option. This means that Integration flows can be freely deployed to Integration Servers as required.
The memory requirement for an Integration node is entirely dependent on the processing that needs to be performed. If all Integration Servers were stopped, the real memory requirement would be very small. If 50 Integration Servers were active and running inefficient Integration flows that process large messages or large files, the real memory requirement could be tens of gigabytes. Again, there is no fixed memory requirement for an Integration node; it depends very much on which Integration flows are running and on the configuration of the Integration node.
Broker Configuration
 The following configuration factors affect the memory usage of an integration server.
- The deployment of Integration flows to an Integration Server
- The size of the JVM for each Integration Server
- The number of Integration Servers
 We will now look at each of these in more detail.
Deployment of Integration Flows to Integration Servers
Each Integration flow has a virtual memory requirement that is dependent on the nodes it contains and the processing that it performs.
Using additional instances for an Integration flow will result in additional memory usage, but not by as much as deploying the same flow to another Integration Server.
The more different Integration flows that are deployed to an Integration Server, the more virtual and real memory will be used by that Integration Server.
There will be an initial memory requirement at deployment and a subsequently higher requirement once messages have been processed. Different Integration flows will use different amounts of additional memory. It is not possible to accurately predict the amount of virtual and real memory that will be needed for any Integration flow or message to be processed, so for planning purposes it is best to run the Integration flow and measure the requirement once it has processed messages. After some minutes of processing, memory usage should stabilise, provided the mix of messages is not constantly changing (such as the size continually increasing).
If multiple Integration flows have been deployed to the Integration Server, ensure that they have all processed messages before observing the peak memory usage of the Integration Server.
When looking at the memory usage of an Integration Server, focus on the real memory usage: the RSS value on UNIX systems, or the Working Set on Windows. This is the key to understanding how much memory is being used by a process at a point in time. There may be other pages sitting in swap space that were used at some point in the processing, possibly at start-up or when a different part of the Integration flow was executed, but due to the demand for memory they may well no longer be in real memory.
To understand how much memory processes on the system are using, use the following:
- AIX: the command ps -e -o vsz=,rss=,comm= will display virtual and real memory usage along with the command for each process
- Linux: the command ps -e -o vsz,rss,cmd will display virtual and real memory usage along with the command for each process
- Windows: run Task Manager then select View -> Select Columns -> Memory (Private Working Set)
 Integration Server JVM Settings
In IBM Integration Bus V9 the default JVM settings for an Integration Server are a minimum heap of 32 MB and a maximum of 256 MB. For most situations these settings are sufficient. The same defaults apply in WebSphere Message Broker V8.
The amount of Java heap required depends on the Integration flows and in particular on the amount of Java they use. This includes nodes which are written in Java, such as the FileInput, FileOutput, SAPInput and SAPRequest nodes. Given this, different Integration Servers may well require different Java heap settings, so do not expect to always use the same values for every Integration Server.
A larger Java heap may be needed if there is a heavy requirement from the Integration flow logic or the nodes used within it. The best way to determine whether there is sufficient Java heap is to look at the Resource Statistics for the Integration Server and observe the level of garbage collection (GC).
For batch processing, low GC overhead is the goal; low would be of the order of 1%.
For real-time processing, low pause times are the goal; low in this context is less than 1 second.
As a guide, a GC overhead of 5-10% can be tuned, as can pause times of 1-2 seconds.
The Integration Server JVM heap settings can be changed with the mqsichangeproperties command. For example
 mqsichangeproperties PERFBRKR -e IN_OUT -o ComIbmJVMManager -n jvmMaxHeapSize -v 536870912
will increase the maximum JVM heap to a value of 536870912 bytes (512 MB) for the Integration Server IN_OUT in the Integration node PERFBRKR.
 Numbers of Integration Servers
A variable number of Integration Servers is supported for an Integration node; no maximum is specified by the product. Practical limits, particularly the amount of real memory and swap space, will restrict the number that can be used on a particular machine.
Typically, large systems might have tens of Integration Servers. A broker with 100 Integration Servers would be very large, and at that point we would certainly recommend reviewing the policy used to create Integration Servers and assign Integration flows to them.
 The number of Integration Servers in use does not impact the amount of 
virtual memory or real memory used by an individual Integration Server 
but it does have a very direct effect on the amount of real memory that 
is required by the Integration node as a whole and so for this reason it
 is good to think about how many Integration Servers are really 
required.
An Integration Server is able to host many Integration flows; one Integration Server could host hundreds of flows if needed. Think carefully before doing this though: if the Integration Server fails, a high number of applications will be lost for a time, and the restart time for the Integration Server will be significantly longer than for one with only a few Integration flows deployed. If the Integration Server contains Integration flows that are seldom used, then this is much less of an issue.
 It is a good idea to have a policy controlling the assignment of 
Integration flows to Integration Servers. There are a number of schemes 
commonly in use. Some examples are
- Only have Integration flows for one business application assigned to an individual Integration Server. [Large business applications may require multiple Integration Servers, dependent on the number of Integration flows for the business area.]
- Assign flows that have the same hours of availability together. It is no good assigning flows for the on-line day and flows for the overnight batch into the same Integration Server, as it will not be possible to shut down the Integration Server without interrupting the service of one set of flows.
- Assign flows for high-profile or high-priority applications/services to their own Integration Servers, and have general utility Integration Servers which host the many less important flows.
- Assign Integration flows across Integration Servers so that the message volume is balanced across them.
In a multi-node deployment, avoid the tendency to deploy all Integration flows to all nodes; have some nodes provide processing for certain applications only. Otherwise, with everything deployed everywhere, real memory usage will be large.
One additional point: assign Integration flows that are known to be heavy on memory usage to the minimum number of Integration Servers. This will reduce the number of Integration Servers that become large and require large amounts of real memory.
Whichever policy you arrive at, think ahead and ensure that it will still work if hundreds more services are added. Sometimes people assign one flow or service to one Integration Server when there are only a few services in existence. This is something you can live with in the early days, but by the time there are another 400 services it simply does not scale, and the demand for real memory becomes an issue.