Wednesday, January 10, 2007

Aspects of a Platform Architecture: Part 2 - Evolution of a Platform

Again, the goal is to allow a small number of applications that share some core processes/entities to interact loosely, and to ship and evolve as independently as possible in the face of changing science, user needs, infrastructure, and developer/application allocation.

In the last section, I talked about what a platform architecture looks like as a static entity. However, the reason for having a platform architecture is to support the evolution of the application suite over time.

With stable identifiers and the architectural components discussed previously, I have found that middleware can be used as a tool for coherent platform development. Again, I’m basing everything on stable identifiers. Without stable identifiers everything is very hard; with them some things are possible, sometimes even easy. In the absence of stable identifiers it is hard for any common substrate to find a leverage point that adds clear value to all of the products under development.

My definition of middleware is a bit broader than the one in Wikipedia: “Middleware is the enabling technology of Enterprise application integration. It describes a piece of software that connects two or more software applications so that they can exchange data.”

For me, middleware is also the place to incorporate the cross-application business logic that allows users to interact with and see the data in a consistent manner. The middleware also shields custom applications from the hidden details of the database(s) or other persistent storage.
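
A minimal sketch of what that shielding can look like, assuming Java (the middleware described here shipped as a jar). The names `SubstanceCatalog`, `InMemorySubstanceCatalog`, and the identifier formats are hypothetical and chosen only for illustration; the point is that applications compile against the interface and never see how or where the data is stored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical middleware interface: the only thing an application sees. */
interface SubstanceCatalog {
    /** Returns the stable identifier for any known identifier of a substance. */
    Optional<String> stableId(String anyId);
}

/**
 * One possible backing store (an assumption for this sketch). A class backed
 * by a database could replace it without any change to the applications.
 */
final class InMemorySubstanceCatalog implements SubstanceCatalog {
    private final Map<String, String> stableIdByAnyId = new HashMap<>();

    /** Records that the alias refers to the substance with the given stable id. */
    void addAlias(String alias, String stableId) {
        stableIdByAnyId.put(stableId, stableId); // a stable id resolves to itself
        stableIdByAnyId.put(alias, stableId);
    }

    @Override
    public Optional<String> stableId(String anyId) {
        return Optional.ofNullable(stableIdByAnyId.get(anyId));
    }
}
```

An application holding either identifier gets the same answer, and an unknown identifier yields an empty `Optional` rather than a storage-specific error.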

Some real-life examples that I’ve seen in the life sciences: in one, the same substance may have two different identifiers; in another, data must be combined using a non-trivial algorithm, e.g., geometric mean with outlier removal. In both cases the middleware served as the foundation for consistency, since it was critical that all of the applications present the same information to users and that there be only one implementation of the retrieval/calculation methods (for any non-trivial application, the results given by two different implementations can drift apart over time).
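
To make the “only one implementation” point concrete, here is a rough sketch of a shared calculation in Java. The class name `SharedStatistics` and the outlier rule are assumptions, not the rule used in the systems described above: a value is discarded when it differs from the median by more than `maxFold`-fold. Whatever rule is chosen, the idea is that it lives in exactly one method that every application calls.

```java
import java.util.Arrays;

/** Hypothetical home for calculations that every application must share. */
final class SharedStatistics {
    private SharedStatistics() {}

    /**
     * Geometric mean of positive values after outlier removal.
     * Assumed rule: drop values more than maxFold-fold away from the median.
     */
    static double geometricMean(double[] values, double maxFold) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        if (!(maxFold >= 1.0)) {
            throw new IllegalArgumentException("maxFold must be at least 1");
        }
        // Work on the log scale so fold differences become plain differences.
        double[] logs = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            if (!(values[i] > 0) || Double.isInfinite(values[i])) {
                throw new IllegalArgumentException("values must be positive and finite");
            }
            logs[i] = Math.log(values[i]);
        }
        Arrays.sort(logs);
        int n = logs.length;
        double medianLog = (n % 2 == 1)
                ? logs[n / 2]
                : (logs[n / 2 - 1] + logs[n / 2]) / 2.0;
        double limit = Math.log(maxFold);

        double sum = 0.0;
        int kept = 0;
        for (double l : logs) {
            if (Math.abs(l - medianLog) <= limit) {
                sum += l;
                kept++;
            }
        }
        if (kept == 0) {
            throw new IllegalStateException("all values rejected as outliers");
        }
        return Math.exp(sum / kept);
    }
}
```

If two applications each carried their own copy of this method, a later change to the outlier rule in one copy would silently make them disagree, which is the drift described above.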

Some of the considerations around method naming, signatures, etc. are shared with library design and development. The best resource I know for addressing those considerations is Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries by Krzysztof Cwalina and Brad Abrams.

A conventional diagram of such a system appears below:

A more accurate diagram, given the goal of supporting rapid system evolution, is:

Here the red links show ad-hoc connections that exist to support rapid development. My preference is to have the middleware be the responsibility of a single person, as it is the key leverage point for the long-term evolution of the architecture.

This person is given the time not only to evaluate architectural ideas from other members of the team, who may have implemented solutions that should be made available to others in a slightly generalized fashion, but also to examine what’s going on in the industry (standards, toolkits, etc.) that will help long-term product evolution. In addition, for the ad-hoc connections and implementations to be capable of being moved into the middleware as described below, it is important for this person to have influence over the design of these interfaces. I’ve found it uniformly tempting for application owners to embed too much information in their interfaces to simplify their short-term development, thereby hindering long-term development.

Still, what I’ve described so far sounds a bit static. How does it play out over time?

Evolution proceeds as follows:

Even if we start with the Platonic “conventional diagram” shown above, it will quickly evolve into something along the lines of the more realistic version, which shows the ad-hoc connections that have evolved over time to give the individual applications the flexibility to meet their requirements.

The next release of the middleware (1.1) is picked up by the Ensemble Review application. In this case the release supports functionality that had required an “outside the box” access solution for the Ensemble Review application, so its access can now occur through the middleware. Green arrows show functionality that has moved to the middleware with the release shown.

The arrow from the Small Group Drill Down application to the Ensemble Review app is shown as now going through the middleware, since my practice was to ship the middleware as a labeled jar rather than a web service. Although this had the downside of increasing the footprint of each application, it did allow the interface to remain very transparent.

The next release of the middleware (1.2) is picked up by the New Data Requests application. We now have three versions of the middleware in production. Each application has shipped independently, but they are all moving in the same direction and there has been no forking for feature support (forking for bugs is of course possible).

And of course, as shown below, an application can pick up the latest version without requiring any improvements.

And then the cycle repeats. The only time that a “synchronized ship” (that is, when all applications ship to production simultaneously with the same version of the middleware) is required is when there is an incompatible structural change to the shared data structures or to a core business process/algorithm. At this point everyone picks up the same version of the middleware, the shared data store is migrated, and extensive testing occurs.
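
One way to make this rule checkable is to encode it in the jar’s label. The sketch below is an assumption rather than a description of the actual labeling scheme: it supposes labels of the form `major.minor` (matching the 1.1 and 1.2 releases above) and supposes that the major number changes only on an incompatible change to shared data structures or a core algorithm. The class names `MiddlewareVersion` and `ShipPolicy` are hypothetical.

```java
import java.util.Collection;

/** Hypothetical "major.minor" label for a middleware jar, e.g. "1.2". */
final class MiddlewareVersion {
    final int major;
    final int minor;

    MiddlewareVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    /** Parses a label such as "1.2"; rejects anything not of that shape. */
    static MiddlewareVersion parse(String label) {
        String[] parts = label.split("\\.");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected major.minor: " + label);
        }
        return new MiddlewareVersion(Integer.parseInt(parts[0]),
                                     Integer.parseInt(parts[1]));
    }
}

/** Assumed policy: versions may coexist in production only within one major. */
final class ShipPolicy {
    private ShipPolicy() {}

    /** True when every deployed label shares the same major number. */
    static boolean canShipIndependently(Collection<String> deployedLabels) {
        return deployedLabels.stream()
                .map(MiddlewareVersion::parse)
                .mapToInt(v -> v.major)
                .distinct()
                .count() <= 1;
    }
}
```

Under these assumptions, applications on 1.0, 1.1, and 1.2 can keep shipping on their own schedules, while the appearance of a 2.0 alongside any 1.x signals that a synchronized ship is due.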

The advantages of this sort of approach include:
The testing time for each application is reduced. An application need not pick up a version of the middleware if the new version doesn’t provide any functionality or bug fixes that it requires (an underlying assumption is that there is adequate regression test coverage to ensure that the functionality required by the application is not broken in the new release).

When an application needs the new functionality, it upgrades to the current revision.
This also means that an application that does not depend on the new middleware functionality is not governed by the middleware’s shipping timelines.

This has the additional benefit of allowing middleware releases to be more focused on the needs of a particular product, or to engage the product architect to support early testing of a feature that will help them.
