Earlier this week, I started a series of discussions with the development team here at QuickLearn Training. These discussion included John Callaway, Rob Callaway, and Paul Pemberton, and centered around Best Practices for development of Microsoft Azure BizTalk Services integrations. The topic arose as we were working on our upcoming Microsoft Azure BizTalk Services course.
Best practices are always a strange topic to address for a technology that is relatively young, and at the same time rapidly evolving. However, the question comes up in both classes and consulting engagements – no matter what the technology.
With regards to BizTalk Server, we have years’ worth of real-world experience from which to pull both personally, and from our customers’ narratives. In addition, there are extensive writings in books, blog posts, white papers, and technical documentation. The same can’t be said, yet, of BizTalk Services.
This then represents an attempt at importing some known-good ideas and ideals into the world of MABS so that value can be realized. It is written not to codify practices and anoint them as “Best” without being proven, but instead to start a dialog on how best to use MABS both in present form, and with additional functionality that is yet to come.
NOTE: I am actively building out integrations to explore these ideas more fully, and expect to find that at least one thing I mention here either won’t work, or isn’t the best after all.
A Language of Patterns
For a brief period of my BizTalk life, I worked on the team that built the Enterprise Service Bus Toolkit 2.0. My job was to document not only the functionality as it was built, as well as sample applications, but also how existing Integration Patterns and SOA Patterns could be implemented using the Toolkit.
I explicitly recall that this last point was especially emphasized by the leader of that small team, Dmitri Ossipov. He wanted to communicate how integration patterns documented by Gregor Hophe and SOA patterns documented by Thomas Erl could be implemented using the ESB Toolkit.
Why would we spend time linking product documentation to patterns? Because Dmitri understood something important. He understood that buzzword compliance wasn’t enough to drive adoption of the platform (i.e., being able to say “this thing uses an ESB under the covers,” or nowadays, “this is cloud-based integration,” or “hybrid integration” [isn’t all integration a hybrid of something?]). Instead adoption would be driven as developers could see it solving a problem, and solving it elegantly – where each Itinerary Service, out-of-the-box or custom, implemented a specific integration pattern and composing them could solve any integration challenge.
So which problems were the best to try and solve? Probably the most common ones (i.e., the ones that are so common as to have vendor-agnostic industry-wide documented patterns for overcoming them). So what does that have to do with MABS? Everything.
The problem space hasn’t changed. The cloud has been quite disruptive in the overall industry – likely for the best. It has leveled the playing field to the point that the established players can be publicly challenged by marketing teams of smaller players that are brand-new to the scene because the scene itself is still new. At the same time, the art and science of integrating systems is the same.
The patterns for approaching this brave new world are remixes and riffs on the patterns that we’ve already had. As a result I’m going back to the fundamentals, and using it to understand MABS.
When I’m starting on a new integration with BizTalk Server, I first sit down and think of that integration in terms of the problems I’m trying to solve and well known patterns that can be used to overcome those problems. A nice visual example of such a pattern is reproduced here (from Gregor Hohpe’s list of Enterprise Integration Patterns):
Here we’re seeing the Scatter-Gather pattern, which is actually a composite of both the Recipient List pattern (1 message being sent to multiple recipients) and the aggregation pattern (multiple messages being aggregated into a single message). This is being presented in the light of a fictional RFQ scenario.
To see it further broken down, we could pull in just the Recipient List Pattern:
Or we could pull in just the Aggregator Pattern:
Once we’ve determined the patterns involved, and how they’re composed, it’s a fairly straight-forward exercise to envision the BizTalk components that would be involved and/or absolutely necessary in order to implement the solution.
For the purposes of this post, I’m going to see how specific patterns might be implemented in Microsoft BizTalk Server, as well as Microsoft Azure BizTalk Services, and Microsoft Azure Service Bus. Specifically, I will be examining the Canonical Data Model pattern, Normalizer pattern, Content-Based Router pattern, Publish-Subscribe Channel pattern, and even the Dynamic Router pattern.
This is not to say that these are somehow the only patterns possible in MABS, but will instead form a nice basis for discussion. Let’s not get ahead of ourselves though, as there’s still some groundwork to cover.
Seeing the Itinerary Through the Bridges
So many times as I’m reading through content discussing Microsoft Azure BizTalk Services (blogs, code samples, documentation, tutorials, books), I see lots of emphasis placed on a single bridge, but rarely on the itinerary as a whole that contains it. Essentially I’m seeing a lot of this:
What are we looking at here? A really boring itinerary that does not showcase the value of MABS, and makes it seem impotent in the process.
Inside the bridge though, there’s a lot going on: message format translation, message validation, message enrichment, message data model transformation, with extensibility points to boot – all put together in a pipes and filters style.
But what if I want to build something bigger than data-in/data-out where not everything has a one-to-one relationship?
Bridges Revisited In Light of BizTalk Server
In our BizTalk Server classes, we ultimately caution people against building these direct point-to-point integrations. Instead, we try to leverage the Canonical Data Model pattern when possible – with the Canonical Data Model being a Canonical Schema.
What does that look like in BizTalk Server? First of all, I personally like to extend a tad beyond this base pattern and mix in a little bit of the Datatype Channel pattern as well.
With that in mind, we would start with a Receive Port for each type of message (e.g., Order, Status Request), and a Receive Location for each source system and data type we expect to see (e.g., FTP with Flat-file, vs. WCF-BasicHttp with XML).
From there, we have a canonical schema for each of those types of messages. An internal Order message for example, or internal Status Request message. At the end of the Receive Port, the appropriate transform would be selected so that a message conforming to the canonical schema is published to the Message Box database.
At the Message Box, another pattern is at work, whether or not we ever wanted it. BizTalk Server is doing publish subscribe routing through the Message Box based on properties stored in the context of the message.
At this point each recipient of the message has their own Send Port through which a final transformation, to THEIR order format is performed along with a final pipeline before it is transmitted through the adapter.
What is all of this doing for us? It’s separating concerns and providing a loosely-coupled design.
I can have multiple sources for a message, and easily add a new source. To add a source is a matter of adding a Receive Location and a Map that translates the incoming message to internal message formats. To add a destination is just as simple and loosely coupled. It really is a beautiful place to live.
The Tightly Coupled Nightmare We’re Building for Ourselves
So why then, when presented with the same integration problems, do we use MABS like this? Is it just that what’s being published widely isn’t representative of the real-world live production solutions that are out there right now? Or is it that we’re blinded and can’t see the itinerary as a whole because we’re caught up with the bridge? I honestly don’t know the answer to this question. All I know for sure is that single bridge itineraries are not the answer for everything.
Why? Because this is a fairly tightly coupled design, and something that we might want to reconsider.
So let’s do just that. Let’s start though by following a message as it would flow through this system, while comparing how BizTalk Server would handle the same.
The whole thing starts with the Channel Adapter pattern.
In BizTalk Server, that’s the Adapter configured within each Receive Location that indicates from where the message will be transported. In MABS, it’s our Source.
Next we go through some filters:
In BizTalk Server, this is our Pipeline inside the Receive Location. In MABS, it’s most of our Bridge. WCF is also rocking this pattern (potentially earlier in the process) in the form of the channel stack, and both BizTalk Server and MABS give you hooks into deep customization of that WCF channel stack if you can’t get enough opportunities to touch the message.
For the BizTalk implementation of the Canonical Data Model pattern, we would then go through the Message Translator:
This is actually where things get interesting. In BizTalk Server, this is the Map that executes after the Pipeline, within the context of the Receive Port. For MABS, we’re still in the middle of the Bridge at this point. In fact for MABS, it’s one of the “Filters” to borrow a term from the pattern above, alongside other patterns such as message enrichment.
After this point, BizTalk enters a publish and subscribe step, and our simple, single-bridge, MABS itinerary breaks down. We won’t have another opportunity to prepare the message for its destination beyond our Route Actions – unless we’ve already prematurely made the leap and readied the message for only a single destination, ever.
Why does it break down? Because we’ve artificially limited ourselves to a single bridge. We’ve seen that bridges are highly capable, and have a lot going on, so we try to get it all done up-front. But is that a Best Practice even for BizTalk? Do we do all of our processing on receive just because pipelines are extensible and can host any custom code our hearts desire? Thankfully the model of BizTalk Server, centered around the Message Box, forces us to be a little bit smarter than that.
Standard best practices for BizTalk are helping us “Minimize dependencies when integrating applications that use different data formats.” Shouldn’t best practices for MABS allow us to do the same?
Crossing the Next Bridge
Consider the following Itinerary:
Here we have a situation where as I’m receiving my message, I don’t yet know the destination. In fact, I may have to perform all of the work in my first bridge before I know that destination. After I’m done with that first bridge though, I will have a standard internal representation of a “Trade” in this case.
From there, I am using Route Filters to determine my destination, and bridges to ensure the destination gets a message it can deal with.
More specifically, there are really four stages of the overall execution:
- The first stage is my source and first bridge. These are playing a role similar to that of a Receive Port in BizTalk Server, ending in a Map. I’m not necessarily taking advantage of every stage within the bridge, but I am certainly doing some message transformation, and maybe even enrichment. The main goal of this first stage is to get the message translated FROM whatever it looked like in the source system TO our internal representation
- The second stage of execution is between bridges. These are my routes, each with route filters that determine the next stage of the execution. They are playing a role similar to the Filters on my Send Ports in BizTalk, and yet at the same time they’re doing something completely different. Why? Because only one bridge will ever receive the message – even if both filters match, one will have priority. This is not true publish-subscribe, it is instead more akin to a Content-based Router of sorts. We’ll address in the next section how to get publish and subscribe happening for more even more loosely-coupled execution.
- The third stage is my second bridge, route actions, and my destination. These are playing a role similar to that of a Send Port in BizTalk Server beginning with a Map. I’m using this bridge to take our internal representation of the message and get it ready for the destination system.
- Sometimes we need dynamic routing to some degree even when we know the destination. That’s what might be happening through Route Actions in step 4. In this specific case, that’s not a true statement. However, if my destination were SFTP, for example, this would be my opportunity to set a destination file name based on some value inside the message.
True Publish-Subscribe In MABS
What if I were to tell you that MABS does publish-subscribe better than the Message Box database? Well it’s not true, so you shouldn’t believe it – like most click baiting headlines. In fact, MABS doesn’t do publish-subscribe at all. Rather, it relies on Microsoft Azure Service Bus to provide publish-subscribe capabilities – interestingly, a technology that we can also take advantage of through adapters first made available in BizTalk Server 2013.
So how do we do it? We publish to a Service Bus Topic, and receive through Service Bus Subscriptions (Queues that subscribe to messages published to Topics based on Filters). Consider the following itinerary:
We no longer have connections between the three seemingly independent sections of this itinerary. In fact, if we didn’t need to share any resources between them, they could be built as separate itineraries, in separate Visual Studio projects.
This is not new stuff, in fact Richard Seroter mused about the possibilities that Service Bus would provide in MABS all the way back in February.
Dynamic Sends and Receives
Another interesting scenario enabled here is the idea of a Dynamic Receive. What would that look like? Let’s say I have a subscription that I want to toggle or reconfigure at runtime. Maybe all trades for a specific company should result in a special alert message routed somewhere, based on conditions that are not known at design-time, and could change at any time. Let’s also say that I don’t want to re-deploy my existing bridge.
I can simply add another subscription to the topic for all trades related to that company, and deploy a new itinerary that handles the special alert message. But that doesn’t sound really dynamic in nature. In fact, it’s not bringing anything new to the table that BizTalk Server doesn’t do.
On the other hand, what if I had this alerting itinerary already built and deployed, and I want to dynamically route messages to it with conditions that change constantly at runtime (outside the scope of the messages themselves) – e.g., toggle a subscription for a given trade based on its current Ask price and the symbol in the trade message?
In that case, my special alert itinerary might be fed by a queue that receives messages forwarded from a subscription generated on the fly at runtime (because Service Bus can do that!), maybe even within a custom messaging inspector if I go fully insane.
Pain Points and Opportunities For Improvement
Ultimately we’re so close, and yet still so far away from having an ideal development experience with MABS. I’m happy that we have process visibility at design-time. Being able to visualize the whole message flow can be invaluable. On the other hand, I’m not happy with the cost at which this comes: sacrificing the ability to easily partition my resources on Visual Studio Project boundaries.
Separate Projects Per Interface
Ideally I would like to have a project per interface that I’m integrating with, and one common project that contains common schemas. I would also like to be able to split up the whole itinerary into smaller pieces (i.e., files) without losing the ability to visualize the itinerary as a whole.
Instance Subscription Equivalent for Two-Way Sources
In BizTalk Server, our two-way receives turn into two one-way exchanges as messages are published to the Message Box. There’s some magic built-in for correlation, but for the most part we aren’t forced into doing everything that follows in a purely request-reply message exchange pattern. Currently Microsoft Azure BizTalk Services does not offer the same capability. As a result, the nice publish/subscribe example itinerary above is only possible in the case of a one-way pattern.
So what are some practices that we can derive from this whole discussion? Here’s an incomplete list that I’m toying with at the moment:
- Build your cloud-based integrations with integration patterns in mind – don’t ignore the fundamentals, or throw away the lessons of the past
- Allow bridges to specialize on one portion of the process flow – don’t try to build your entire integration using a single bridge
- Use the canonical data model pattern when integrating between multiple source and destination systems when the list of those systems in subject to change
- Leverage the publish-subscribe capabilities of Service Bus when needed, don’t rely only on routes within MABS
- Push down to on-premise BizTalk Server to handle patterns that aren’t yet possible in MABS.
- Mediating mismatched of message exchange patterns, through BizTalk Server’s instance subscriptions for two-way receives (e.g., incoming synchronous request/response interface provided against asynchronous one-way request with one-way callback)
- Handling aggregation of messages, which will require the use of Orchestration
- Handling the O in the VETRO pattern (Validate, Enrich, Transform, Route, Operate), which again will require the use of Orchestration
- Handling the execution of Business Rules within BizTalk Server on-premise
I’m really looking forward to seeing how Workflow is going to fit into the picture. We’ll have to wait until Integrate 2014 in December to know for sure, but if we keep the fundamentals in mind, it shouldn’t be hard to figure out how they will fit into a concept of MABS itineraries rooted in the integration patterns that make them up.
I truly hope that you have enjoyed this post, and I thank you for reading to the end, if indeed you have and not simply skipped to the end to find out if I ever stopped typing. Remember that I want this to be a dialog, both retrospectively in terms of what is working and what isn’t working with your MABS integrations, as well as a forward-looking dialog as to what practices will likely emerge as generally applicable and ultimately most beneficial.