
Dave Morris' Blog

BizTalk Time-Based Batching

I think I'm going to start to get a reputation for going on about the same sort of thing all the time, but hey, it's been bugging me for weeks so I'm going to tell everyone all about it.

There is a scenario I've been working on recently where IDOCs generated by an SAP system need to be batched together to form a single output flat file for an AS400 system.  If you want more information about integrating BizTalk 2004 with SAP, see my colleague Tamer Shaaban's blog on the subject.

The issue here was how multiple IDOCs from SAP were correlated.  The correlation was simply IDOCs of a particular type within a given time window, e.g. all “delivery” IDOCs received within a 10 minute window of the first one should go in the same batch file.
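
To make the rule concrete, here is a minimal Python sketch (not BizTalk code) of the grouping on its own. The function name `group_by_window`, the `(idoc_type, received_at)` tuple shape and the choice to treat a message arriving exactly 10 minutes after the first as belonging to the next batch are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def group_by_window(messages, window=timedelta(minutes=10)):
    """Group (idoc_type, received_at) pairs into time-windowed batches.

    A batch for a type opens when its first message arrives and accepts
    further messages of that type until `window` has elapsed.
    """
    open_batches = {}  # idoc_type -> (window_start, receive times in the batch)
    batches = []       # every batch, in the order it was opened

    for idoc_type, received_at in sorted(messages, key=lambda m: m[1]):
        current = open_batches.get(idoc_type)
        # Open a new batch if none exists or the current window has expired.
        if current is None or received_at - current[0] >= window:
            current = (received_at, [])
            open_batches[idoc_type] = current
            batches.append((idoc_type, current[1]))
        current[1].append(received_at)

    return batches
```

Each entry in the result is one output file's worth of messages: the type, plus the receive times of the IDOCs that fell inside that window.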

Easy really: correlate on the message type and use a sequential convoy to group them, looping until the time window completes.  This is OK and works most of the time, but it has a couple of issues that would make it hard to support:

1) If processing of a single IDOC fails for some reason, processing of the whole batch fails.

2) There is always a processing window (albeit small) where messages can be discarded (zombies).

The latter of these isn’t that obvious, but while the orchestration instance is running, all IDOCs of that type will be correlated to that instance, even once it has finished receiving and is heading towards its exit point.  The issue is that you cannot “turn off” a correlation set once you have received all the messages you need into an orchestration instance.  Normally this would not be an issue, as the number of correlated messages to be received is known in advance.  However, in this time-based scenario the number of messages is totally indeterminate, and there is no guarantee that more messages won’t arrive and be correlated to the orchestration instance after the last receive shape completes.
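
The gap is easier to see in a toy model. The Python sketch below is not the BizTalk API; `NaiveConvoyInstance`, `route` and `complete` are hypothetical names, and times are plain seconds. It only models the one behaviour described above: messages keep being routed to the instance until it has fully exited, whether or not a receive is still waiting for them.

```python
class NaiveConvoyInstance:
    """Toy stand-in for one orchestration instance running a sequential convoy."""

    def __init__(self, window_end):
        self.window_end = window_end  # time at which the receive loop exits
        self.completed = False        # True once the instance has fully exited
        self.batch = []               # messages picked up by the receive loop
        self.zombies = []             # messages routed here but never consumed

    def route(self, message, now):
        """Simulate the engine routing a correlated message to this instance.

        Returns False once the instance has completed, meaning the message
        would start a fresh instance instead.
        """
        if self.completed:
            return False
        if now < self.window_end:
            self.batch.append(message)
        else:
            # The receive loop has ended but the correlation is still live.
            self.zombies.append(message)
        return True

    def complete(self):
        """Mark the instance as exited; nothing more is routed to it."""
        self.completed = True
```

Anything that lands in `zombies` arrived between the end of the time window and the instance's exit, which is exactly the window that cannot be closed when the message count is unknown.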

The solution to this is to make the batch size determinate, so that correlation is on a known number of messages and zombies can never be created.  Also, by splitting the processing into 2 phases (a pre-process and a batch-process), processing failures for individual IDOCs can be isolated from the batch.

The following diagram shows the basic pattern for this processing:

 

The key to this is the “repository” in the middle.  It is there purely to assign each received message to a batch and to determine when a batch completes (and consequently its size).  With this mechanism, batch processing is determinate and we can guarantee no zombies.
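
As a rough illustration of what the repository has to do, here is a minimal in-memory Python sketch. It is an assumption-laden stand-in rather than the actual implementation: the names `Batch`, `BatchRepository`, `assign` and `close_expired` are hypothetical, times are plain seconds, and a real repository would presumably be a persistent store such as a database table.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    batch_id: int
    idoc_type: str
    opened_at: float
    size: int = 0
    closed: bool = False


class BatchRepository:
    """Assigns pre-processed messages to batches and fixes each batch's size."""

    def __init__(self, window_seconds=600.0):
        self.window_seconds = window_seconds
        self.batches = {}   # batch_id -> Batch (open and closed)
        self._open = {}     # idoc_type -> currently open Batch
        self._next_id = 1

    def assign(self, idoc_type, now):
        """Return the batch id for a message that finished pre-processing at `now`."""
        batch = self._open.get(idoc_type)
        if batch is not None and now - batch.opened_at >= self.window_seconds:
            self._close(batch)  # window over: this message belongs to a new batch
            batch = None
        if batch is None:
            batch = Batch(self._next_id, idoc_type, now)
            self._next_id += 1
            self.batches[batch.batch_id] = batch
            self._open[idoc_type] = batch
        batch.size += 1
        return batch.batch_id

    def close_expired(self, now):
        """Close every open batch whose window has elapsed; return (id, size) pairs."""
        closed = []
        for batch in list(self._open.values()):
            if now - batch.opened_at >= self.window_seconds:
                self._close(batch)
                closed.append((batch.batch_id, batch.size))
        return closed

    def _close(self, batch):
        batch.closed = True
        del self._open[batch.idoc_type]
```

Once a batch is closed its size never changes, so the batch-process can correlate on the batch id and receive exactly that many messages before it exits.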

 

Published 16 February 2005 15:04 by dave.morris
