Selecting your Enterprise Architecture Repository

Intro

After several recurring forum conversations I thought it could be helpful to document the steps we took to procure and establish our Enterprise Architecture Repository (EAR).

Of course, this is by no means the only way to find the right EA Repository for you, but it worked for us and we are still very happy with our choice.

Context

Probably like many other organisations, we started off creating our EA assets in Microsoft Word, Excel and Visio and stored them on shared file servers and in our Document Management System, DocuShare.

The problem with this approach – as you know – is that there is no underlying content meta model which semantically links the architecture artefacts. The consequence is that analysis needs to be done manually. You can write macros in Visio, Word & Excel but I don’t think that is time and effort well spent for an Enterprise Architect.

The Team

To get a broader view of requirements across the business I assembled a team comprising the following:

  • 2 Enterprise Architects
  • 2 Solution Architects
  • 2 SOX Compliance Officers
  • 1 National Quality Assurance Officer

Due to many conflicting projects and activities, and only one Enterprise Architect being the ‘business owner’ of the EA Repository, we ran into several conflicting resource schedules. As you can only score objectively if you sit through the presentations and demos of all vendors, that was a challenge.

Fortunately, one of the Solution Architects and the National QA Officer were really dedicated, so we ended up with three different scores we could average. I also recommend involving an IT Operations representative, so that the requirements of the Application Portfolio Management component are represented, if that’s a use case for your EAR within your organisation.

The Process

You won’t get it 100% right. One year down the track we are using the EAR in ways we didn’t think of, but that’s only a good thing, as we are reaping rewards beyond what we had envisioned.

After high level requirements gathering, the process we followed was:

  1. Market Research & Product Shortlisting
  2. Requirements lock-down & Weighting
  3. Product Demonstration & Scoring

Market Research & Product Shortlisting

The company had an ‘all you can eat’ arrangement with Gartner and Forrester Research, which made it easy to conduct quick market research. We also talked to fellow Enterprise Architects and opted to include one product which wasn’t on the Gartner Magic Quadrant.

Gartner and Forrester have quite a comprehensive selection of papers on this topic. The documents we found most helpful were:

  • Gartner: Understanding the Eight Critical Capabilities of Enterprise Architecture Tools
  • Gartner: Select EA Tools Use Cases Are Not Optional
  • Gartner: Magic Quadrant for Enterprise Architecture

After reading through the documents, I scheduled a call with a Gartner analyst on the topic to clarify my understanding. I asked specifically why a tool like Orbus iServer was not mentioned in the Magic Quadrant paper, as it had been recommended to us by other Enterprise Architects and we knew that Cathay Pacific was using it, too, and was happy with it.
I learned that the Magic Quadrant selection process also includes things like disclosing the product roadmap to Gartner, Gartner-specific certifications and customer references. Not all of those had been satisfied by Orbus (trading as Seattle Software), hence it didn’t make it into the Magic Quadrant. For us that was not a strong enough reason not to look at the product, especially as it came with strong recommendations and was fully compatible with our existing EA assets, which had been created with the Microsoft Office suite.

At the time of evaluation the Magic Quadrant looked as per the screenshot below. I recommend getting the latest report from Gartner if you’d like the current view.

[Screenshot: Gartner Magic Quadrant for Enterprise Architecture tools at the time of evaluation]

The Product Shortlist

After a first high-level evaluation of the products in the market, the research papers and the recommendations, we shortlisted the following products (vendors):

  • ARIS (Software AG)
  • Abacus (Avolution)
  • iServer (Orbus)

At first, alfabet was not on the shortlist; Software AG had only just acquired the product, planningIT, through its acquisition of alfabet AG. The Software AG technical representative offered an introduction and demonstration at short notice which fitted our schedule, hence we agreed to have a look at it as well. After the demo it was clear that this product was not what we were looking for in an EA Repository, due to the rigidity of its prescribed process and the absence of a content meta model. I also downloaded iteratec’s iteraplan for a quick evaluation but found the tool not very user friendly.

Requirements Lock Down & Weighting

The evaluation group defined the evaluation criteria categories and weighting as follows:

ID Description AVG Weight
1 Repository & content meta model – capabilities & fit 8.8
2 Modeling – support for business process and EA modelling 9.4
3 Gap Analysis & Impact Analysis – ease of use, capabilities 8.4
4 Presentation – automatic generation & capability 7.2
5 Administration – system and user access administration 6
6 Configurability –  usage, access rights, output (not including content meta model) 6.8
7 Frameworks & Standards support – e.g. TOGAF, eTOM, BPMN, reference architectures 6.6
8 Usability – Intuitiveness of UI and administration 8.4
9 Adoption/Change Management – effort to roll-out and adopt 9
10 Fit for Purpose (Use case eval, risk, compliance, business requirements, customer centricity) 9
11 Extensibility / Integration ability with other systems 7.4
12 Vendor: Interactions – Responsiveness, quality, detail, professionalism, support 6.2
13 Supports Risk & Compliance (e.g. JSOX) tasks/initiatives 6.8
14 Supports Quality Management (ISO9001) tasks/initiatives 6.6
15 Gartner Research results & recommendations for suitability 4.6

The weight semantics were defined as: 0 – Irrelevant; 1 – Insignificant; 2 – Fairly unimportant; 3 – Somewhat unimportant; 4 – Nice to have (e.g. ease of use); 5 – Nice to have (increased productivity/efficiency); 6 – Somewhat important; 7 – Important; 8 – Fairly important; 9 – Very important (represents key requirements); 10 – Critical/extremely important (failure to satisfy requirements in this category will cause elimination of product)
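To make the mechanics concrete, here is a minimal sketch of how such a weighted scoring model can be computed. The category subset, the per-evaluator raw scores and the function name are hypothetical; only the weights come from the table above.

```python
# Weighted scoring sketch: each evaluator scores a category 0-10, the
# scores are averaged per category, multiplied by the category weight,
# and the weighted averages are summed into a product total.

weights = {
    "Repository & content meta model": 8.8,   # from the table above
    "Modeling": 9.4,
    "Usability": 8.4,
}

# Raw scores per evaluator for one product (hypothetical values)
scores = {
    "Repository & content meta model": [8, 7, 9],
    "Modeling": [6, 7, 6],
    "Usability": [9, 8, 8],
}

def product_total(weights, scores):
    total = 0.0
    for category, weight in weights.items():
        avg = sum(scores[category]) / len(scores[category])
        total += weight * avg
    return total

print(f"Weighted total: {product_total(weights, scores):.1f}")
```

Across all 15 categories and three evaluators, product totals in the thousands (as in the results further below) fall out of exactly this kind of sum.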

Our Requirements

ID Category Description
1 10 Repository must be shared and accessible by all EA practitioners, Solution Architects, Business Analysts and business stakeholders
2 1 Must allow for customised meta models
3 10 Existing assets (.ip – process files & Visio diagrams) need to be converted and linked into the meta-model
4 10 Built-in Version Control
5 11 Integration/Linkage with requirement system
6 11 Integration/Linkage with other systems WIKI, DocuShare, FileFolder
7 8 Must be able to deal with a large number of artefacts (10,000+) & performance tuning options
8 2 Must be able to understand & analyse complex (ontop, links, semantics of a relationship, 1:n, m:n) relationships between artefacts
9 2 Support Scenario (what-if) planning & scenario modelling
10 4 Support multiple/different stakeholder viewpoints & presentations
11 2 Facilitate implementation of business strategy, business outcomes and risk mitigation
12 2 Repository supports business, information, technology, solution and security viewpoints and their relationships. The repository must also support the enterprise context composed of environmental trends, business strategies and goals, and future-state architecture definition.
13 2 Modelling capabilities, which support all architecture viewpoints (business processes (BA), solution architecture (SA))
14 3 Decision analysis capabilities, such as gap analysis, impact analysis, scenario planning and system thinking.
15 4 Presentation capabilities, which are visual and/or interactive to meet the demands of a myriad of stakeholders. Presentations can be created via button click.
16 5 Administration capabilities, which enable security (access,read,write), user management and modeling tasks.
17 6 Configurability capabilities that are extensive, simple and straightforward to accomplish, while supporting multiple environments.
18 7 Support for frameworks (TOGAF, COBIT, eTOM), most commonly used while providing the flexibility to modify the framework.
19 8 Usability, including intuitive, flexible and easy-to-learn user interfaces.
20 2 Draft mode before publishing edited and new artefacts
21 1 Supports linking Business Motivation model ((Means)Mission, Strategy, tactics >>> (Ends) Vision, Goals, Objectives)
22 2 Needs to support multiple iterations of TOGAF (Architecture Capability, Development (AS-IS, TO-BE, Gap), Transition, Governance iterations)
23 2 Support for multiple notations(Archimate, UML) connecting semantics to the same content meta model
24 10 Repository Search and Browse capability for the entire organisation
25 3 Creation of Roadmaps
26 3 AS-IS, Transition & TO-BE state-based gap analysis across processes, information, technology, business reference models, application architectures and capabilities
27 10 Reverse-Engineering/Introspection Capabilities for Oracle eBusiness Suite/ERP
28 6 Ease of Editability of meta model relationships
29 2 Support for linking Strategic, Segment & Capability Architectures across architecture models, processes and roadmaps
30 6 Ease of Editability of meta model objects & attributes
31 3 Strategic, Segment & Capability architectures need to be referenceable across all models, building blocks and projects
32 8 Lock-down/Freezes on changes
33 8 Role based edit/view/read/write access restrictions
34 5 Administration & Configuration training will be delivered by vendor
35 10 Price within budget
36 3 Supports “is-aligned-with-roadmap” analysis via button click
37 7 Supports library concepts (lock down/freeze) for reference models, reference architectures, Architecture/Solution Building Blocks
38 9 Vendor has proven capabilities/support for change management efforts associated with the roll-out of an EA tool/repository
39 2 Supports multiple notation standards (Archimate, UML, TOGAF)
40 10 Preserves meta-data of existing FXA process model (mappings to software and applications are imported and semantically correctly mapped)
41 10 Preserves & understands meta-data of existing FXA Visio models (BRM, Capability, etc)
42 11 Integration with Portfolio Management tools and Project Management tools
43 12 Alignment of what FXA needs with Gartner analysis
44 12 Provides technical/customer support during Australian business hours
45 12 Vendor pays attention to FXA requirements and business environment and builds demos, questions & dialogues with FXA around them
46 13 Must have different role-based user access levels for read, write, administration and public (browse) for different types of assets
47 13 Must not allow users to sign up for inappropriate level of access without permission
48 13 Writes access logs for successful, failed logon and user profile/role change logs
49 10 Supports the modelling, documentation, query & analysis of Customer Touchpoints

The Result

Once we finally received a quote, we realised that it was beyond our budget, hence we had to remove ARIS from the shortlist.

After use case demonstrations from the remaining vendors, the evaluation team scored independently and came up with the following totals:

Abacus TOTAL 2399
iServer TOTAL 3582.2

This concluded the evaluation and made Orbus iServer a very clear choice for us.

Next Steps to Consider

  • Decide on a content meta-model (TOGAF vs ArchiMate)
  • Repository structure & library setup to support automated roadmaps and gap analysis and to support projects
  • Import Application catalogue (application & interfaces, live date & status) (AS-IS)
  • Import existing EA assets (AS-IS and TO-BE): processes, Business Functional Reference Model, data models

Things to be aware of – Before you jump

  • Resourcing: You will need people to administer, maintain and continuously update your Enterprise Architecture Repository. Whenever a large change which impacts your EAR is coming, you need to understand that this can be a full-time job for a little while.
  • Licensing: Make sure your business case caters for growth and additional licenses. In case of iServer you need Visio and iServer seat licenses.
  • Training: Ensure you have a team that you can work with to roll out training, especially across different domains: business (BPMN, process modelling guidelines) and meta model extensions (e.g. Interfaces, RICEFW) and the correlating relationships.
  • Publish guides and reference material (we found a WIKI most useful!)
  • Standards & Reference models: You will have to spend time and effort to define your own standards (e.g. a subset of BPMN 2.0 or the APQC PCF)

A Data Quality Firewall Architecture

Organisations know that Data Quality (DQ) is a key foundation for a successful business. Business Intelligence, Reporting, Analytics, Data Warehouses and Master Data Management are pretty much wasted effort if you cannot trust your data. To make matters worse, systems integration efforts could lead to ‘bad’ data spreading like a virus through your Enterprise Service Bus (ESB) across all systems, even into those which had fairly good data quality initially.

This article discusses the architectural concept of a Data Quality Firewall (DQF) which allows organisations to enforce their data quality standards to data in-flight and anywhere: Up-Stream, In-Stream and Down-Stream.

Data Quality Lifecycle

When data enters an organisation it is sometimes referred to as ‘at the beginning’ of the data quality lifecycle. Data can enter through all sorts of different means, e.g. emails, online portals, phone, paper forms or automated B2B processes.

Up-stream refers to new data entering, whereas in-stream means data being transferred between different systems (e.g. through your ESB or hub). Down-stream systems are usually data stores that already contain potentially unclean data, like repositories, databases or existing CRM/ERP systems.

Some people regard a Data Warehouse as being at the end of the data quality lifecycle, meaning there is no further data quality work necessary as all the logic and rules have already been applied up-, down- or in-stream.

However, when you start your DQ initiative you need to get a view of your data quality across all your systems, including your data warehouse. You achieve this through profiling. Some software vendors offer ‘free’ initial profiling as a conversation starter; this may be a worthwhile first step to get your DQ indicators.
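As an illustration of what basic profiling produces, here is a minimal sketch computing per-column completeness and distinct counts as simple DQ indicators. The rows and column names are hypothetical; commercial profiling tools compute far richer statistics.

```python
# Minimal profiling sketch: per-column completeness and distinct-value
# counts over a system extract (rows and column names are hypothetical).

rows = [
    {"customer_id": "C1", "name": "Andreas", "postcode": "2000"},
    {"customer_id": "C2", "name": "",        "postcode": None},
    {"customer_id": "C3", "name": "Andi",    "postcode": "2000"},
]

def profile(rows):
    indicators = {}
    for col in rows[0]:
        values = [row[col] for row in rows]
        filled = [v for v in values if v not in (None, "")]
        indicators[col] = {
            "completeness_pct": round(100.0 * len(filled) / len(values), 1),
            "distinct_values": len(set(filled)),
        }
    return indicators

for col, indicator in profile(rows).items():
    print(col, indicator)
```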

Data Quality Rules and Logic

Profiling vital systems allows you to extract data quality rules which you can implement centrally, so that you can re-use the same rules and standards enterprise (or domain) wide. The profiling equips you with data quality indicators, showing you how good your data really is on a per-system basis.
Understanding your business processes and looking at the data quality indicators enables you to associate a $-value with your bad data. From this point onwards it’s probably very easy to pick and choose which systems/repositories to focus on (unless your organisation is on a major strategic revamp, in which case you need to consider the target Enterprise Architecture as well).
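In its simplest form, ‘implementing rules centrally’ can mean a single rule registry that every system (or the firewall discussed below) calls, instead of each system duplicating its own checks. The rule names and logic below are hypothetical stand-ins.

```python
# Central data quality rule registry sketch: rules are defined once
# and re-used enterprise- or domain-wide.

RULES = {}

def rule(name):
    """Decorator that registers a validation function under a rule name."""
    def register(fn):
        RULES[name] = fn
        return fn
    return register

@rule("postcode_au")
def postcode_au(value):
    # hypothetical rule: Australian postcodes are exactly four digits
    return isinstance(value, str) and len(value) == 4 and value.isdigit()

@rule("name_present")
def name_present(value):
    return bool(value and value.strip())

def check(record, bindings):
    """Return the names of all rules the record fails."""
    return [name for field, name in bindings
            if not RULES[name](record.get(field))]

print(check({"name": "Andi", "postcode": "20O0"},
            [("name", "name_present"), ("postcode", "postcode_au")]))
# ['postcode_au'] -- the letter O in the postcode fails the rule
```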

Another perennial question is when and where to control the quality of the data. In the early days we began to implement data quality with input field verification, spell checks, confirmation responses and message standards (e.g. EDIFACT). Organisations then found themselves duplicating the same rules in different places: front-end, middleware and backend. Then field length changes came along (à la Year 2000, or through entering global markets, or through mergers and acquisitions) and you had to start all over again.

At the last APAC Gartner conference in Sydney I heard people suggesting that the data quality rules only need to be applied to the warehouse. I personally think this can be dangerous and needs to be evaluated carefully. If there are no other systems that store data apart from the warehouse this might make sense. In any other case it means that you cannot trust the data outside the warehouse.

Zooming In – The Data Quality Firewall

A DQ firewall is a centrally deployed system which offers domain or enterprise wide data quality functionality. The job of this system is to do cleansing, normalisation, standardisation (and possibly a match and merge if part of MDM).

In an Event Driven Architecture (EDA), all messages are simply routed through the data quality rules engine first. This is done by making the DQ firewall the (possibly only) subscriber to all originating messages from the core systems (#1 in the image). Subsequently, all the other interested systems subscribe to the messages emitted from the DQ firewall, which means they receive messages with quality data (#2).

Data Quality Architecture Diagram

The diagram shows the Core Systems as both publishers and subscribers, emitting an event message (e.g. CustomerAddress) which is picked up by the Semantic Hub. The Semantic Hub transforms it into the appropriate message format and routes it to the Data Quality Firewall. The DQF then does its job and re-emits the CustomerAddress message with qualified data, which is then routed to the subscribing systems via the Semantic Hub.
Subscribing systems can be other systems, as well as the system that originally published the data.
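A minimal in-memory sketch of that flow follows, with a plain callback dictionary standing in for the Semantic Hub and a trivial cleansing step standing in for the DQ engine; the topic names and message fields are hypothetical.

```python
# Pub/sub sketch of the DQ firewall as subscriber (#1) and re-publisher (#2).

from collections import defaultdict

subscribers = defaultdict(list)        # topic -> list of callbacks

def subscribe(topic, callback):
    subscribers[topic].append(callback)

def publish(topic, message):
    for callback in subscribers[topic]:
        callback(message)

def dq_firewall(message):
    # apply cleansing/standardisation rules (trivial example: trim whitespace)
    cleansed = dict(message, postcode=message["postcode"].strip(),
                    originator="DQF")
    # re-emit the qualified message for all interested systems
    publish("CustomerAddress.clean", cleansed)

subscribe("CustomerAddress.raw", dq_firewall)                          # step #1
subscribe("CustomerAddress.clean", lambda m: print("SystemB got", m))  # step #2
subscribe("CustomerAddress.clean", lambda m: print("SystemC got", m))

publish("CustomerAddress.raw",
        {"originator": "SystemA", "postcode": " 2000 "})
```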

In an SOA scenario the architecture is similar, using service calls to the appropriate service offered by the Data Quality engine. Care needs to be taken if the DQ service is required to partake in transactions (consider time-outs, system availability, scalability, etc.).

Your New Data Quality Ready Target Architecture?

Benefits of a centrally deployed Data Quality engine are re-usability of rules, ease of maintenance, natural consolidation of rules, quick response to change, pervasiveness of change, assignment of ownership and responsibility to Data Stewards (who owns which rules and the associated data entities).

Things to consider are a feedback mechanism to the originating system in case of bad data, as it might affect current system design, and the introduction of a Data Quality/Issue Tracker portal which allows Data Stewards to intervene in cases where cleansing cannot be done automatically.

Compared with the overhead that distributed approaches like input field validation on multiple systems can incur, a central Data Quality Firewall architecture is far more enterprise ready, delivers more long-term benefits and is cheaper in terms of setup, maintenance and ROI.

The Small Bang Approach

The beauty of the EDA approach is that you can easily change the routing on a system-by-system or message-by-message basis through the Data Quality Firewall. You simply change the routing of the messages in the routing table of the Semantic Hub.

Example of message type ‘CustomerAddress’ emitted through SystemA below:

Systems B and C subscribe to CustomerAddress messages emitted from SystemA.

Message Type       Content Based Routing            Subscribers
CustomerAddress    Xpath(/Originator)='SystemA'     B, C
Account            Xpath(/Originator)='SystemC'     A

To enable the Data Quality Firewall functionality we change the subscriber to DQF. From then on, all CustomerAddress messages from SystemA are delivered to the DQF. After the data quality rules have been applied by the DQF, Systems B and C receive clean and trustworthy data. The Account data remains unchanged in this example.

Message Type       Content Based Routing            Subscribers
CustomerAddress    Xpath(/Originator)='SystemA'     DataQualityFirewall (DQF)
CustomerAddress    Xpath(/Originator)='DQF'         B, C, A
Account            Xpath(/Originator)='SystemC'     A

A possible next step could then be to quality control the Account message data. This approach allows you to consolidate your Data Quality step by step across your entire organisation.
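As a sketch, the routing table can be thought of as plain data, which is why the ‘small bang’ is a configuration change rather than a code change. The predicate below is simplified to an originator equality check rather than real XPath, and all names are illustrative.

```python
# Routing table sketch: enabling the DQF is a data change, not a code change.

routing = [
    # (message type, originator, subscribers)
    ("CustomerAddress", "SystemA", ["B", "C"]),
    ("Account",         "SystemC", ["A"]),
]

def route(table, message_type, originator):
    for mtype, origin, subs in table:
        if mtype == message_type and origin == originator:
            return subs
    return []

# The 'small bang': redirect raw messages to the DQF and fan out its output
routing = [
    ("CustomerAddress", "SystemA", ["DQF"]),
    ("CustomerAddress", "DQF",     ["B", "C", "A"]),
    ("Account",         "SystemC", ["A"]),          # unchanged in this example
]

print(route(routing, "CustomerAddress", "SystemA"))  # ['DQF']
print(route(routing, "CustomerAddress", "DQF"))      # ['B', 'C', 'A']
```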

Cheers,
Andy

Business Oriented Enterprise Integration

After having examined low-level concepts and implementation specifics of scalable Enterprise Integration architectures earlier, this article explores the higher-level concepts and how it all (EDA, SOA, REST, message queues and request/response, that is) comes together.

We shall call it ‘Business Oriented Enterprise Integration’ (BOEI).

Baffled or Just NOT Aligned at all?

Those of us who love to play ‘Buzzword Bingo’ during meetings or conferences know that the phrase ‘Business and IT Alignment’ scores frequently. I am surprised how many organisations still disregard it after having spoken about it for so long (maybe we are still secretly chasing high scores at Buzzword Bingo? 😉 )

How do you know whether IT is aligned with business?
Does the organisation have an Enterprise Architecture (EA) practice – or maybe a vision statement from the CEO, with buy-in from senior & middle-management? Or a strategic roadmap that everyone knows about? What about reference architectures, published standards?

If the answer is no to most of those questions, I doubt you’ve got business and IT alignment, simply because you have no way to execute according to a common set of standards and rules.

TOGAF talks about things like the Architecture Repository, the Enterprise Continuum and references to industry or organisational standards, patterns, guidelines and rules. Establishing those allows organisations to walk in the same direction as one entity, by reviewing initiatives against those standards. And yes, I am optimist enough to assume that people from the business end of town are communicating with the guys with the thick glasses in the dark rooms to discuss those matters together.

Divide & Conquer: Defining your Business Domain

Business-Oriented Enterprise Integration works for organisations of all sizes. Once you reach a certain size or your business is organised naturally in distinct units you might want to consider dividing the whole into more manageable chunks, or Business Domains, as depicted in the picture.

The same is true for M&A scenarios where you either want to sell parts of your business or integrate acquired businesses more easily.

Example of a Domain

A smaller Super Fund or Retailer might just have a single domain, whereas a globally operating bank can have separate domains split up by country, payments, institutional or retail banking.
Even a ‘Customer’ domain providing a single-view of customer to your organisation or a single-view of your organisation to the customer, is possible.

The major point however is: The Business defines a domain, its capabilities, the systems contained within and hence its boundaries. The business also owns the Domain and the systems, not IT (collaboration models between IT and Business, e.g. running IT as a charged service centrally, project based or as part of a business unit is another topic).

The Core: Semantic Hub

One of the major advantages of the BOEI approach is semantics.
Service Orientation forces core business systems to understand the semantics of the world of the target system, or at least of the ESB, which is NOT their job.

For example:
Core System A only deals with Account ID and Last Name; it then needs to understand how that translates into the ESB (canonical) world or, worse (because it is point-to-point), the target system world. If canonical services are employed, each core system service has a canonical representation of the very same service on the ESB, which doubles the number of services (see ‘EDA vs SOA’ for details), and version hell has arrived.

The Semantic Hub decouples Source and Target Systems through well-defined message formats and the 3-forms-2-transforms pattern (see John Schlesinger’s ‘The Five Golden Rules of Integration’ for details).

The 3 message formats (forms) are:

  • Source System Message
  • Canonical Message
  • Target System Message

The 2 message transformations occur between the

  • Source System Message format to the Canonical Message format
  • Canonical Message format to the Target System Message format
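A minimal sketch of the pattern is below. The field names and mappings are hypothetical; the point is that each system only ever maps to and from the canonical form, never to another system directly.

```python
# 3-forms-2-transforms sketch: source form -> canonical form -> target form.

def source_to_canonical(msg):
    # transform 1: source system message -> canonical message
    return {"accountId": msg["acct_id"], "lastName": msg["surname"]}

def canonical_to_target(msg):
    # transform 2: canonical message -> target system message
    return {"ACCOUNT_NO": msg["accountId"], "NAME_LAST": msg["lastName"]}

source_message = {"acct_id": "A-42", "surname": "Spanner"}   # form 1
canonical = source_to_canonical(source_message)              # form 2
target_message = canonical_to_target(canonical)              # form 3
print(target_message)  # {'ACCOUNT_NO': 'A-42', 'NAME_LAST': 'Spanner'}
```

Adding a new system then means writing its two transforms against the canonical form, not one transform per peer system.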

It sounds very simple, but let’s be honest: message modelling is an art that needs to be understood, especially when complex 1-to-many, many-to-many or reverse entity relationships are modelled (e.g. Customer-to-CustomerAddress, or Account-to-Customer and Customer-to-Account).
[Please contact me and share your experience with tooling in this area!]

Message formats are contracts between systems owners (business people that is). Those formats need to be adhered to. The Business Domain owner facilitates the definition of those formats and enforces compliance through the Semantic Hub. The compliance comes naturally because it is the foundation of an operating business.

System Design and System Integration

For all the reasons discussed in ‘Event Driven VS Service Orientated’ the main integration and communication pattern between systems is Event based.

Because businesses are driven by events. Naturally. Purchase orders, invoices, deliveries, payments, etc. are what make businesses tick.

BOEI thinking affects system design, because in an Event Driven Architecture (EDA) systems are expected to do their job, commonly without depending on anyone else.
That means a system’s data model must be able to store all the data necessary. For example, if the Account System needs customer data to verify against when accounts are altered, closed or opened, then that system must have that data available. A getCustomerData service call to a CRM system would defeat the scalability advantage as discussed in ‘EDA VS SOA’.

The downside of this approach is that data might be stored redundantly. This is counteracted by the fact that the business owns the systems and their data, and hence decides what data to store, plus the level of data quality necessary for the system to function accordingly.

The Fringe: Inter-Domain Communication and Presentational Systems

Inter-Domain communication adheres to the same principles as system-to-system message exchange within a domain (intra-domain).

This means event messages from source systems must be routed through the Semantic Hub before the message reaches an inter-domain gateway. Otherwise it would be a point-to-point integration between a system and a domain. And the principle we must adhere to is loose coupling.

Information retrieval for presentational or channel systems is commonly powered by request/response type communication. The picture above shows how we take the load off core systems by feeding presentational applications through Information Services. The data repositories that power Information Services are fed by canonical event messages directly from the Semantic Hub. Those data stores can be optimised for read operations.

REST and JSON further enhance information retrieval through a standardised way to query resources (entities) and reduced response payload.

Functionality (e.g. Internet Banking money transfer) beyond simple information retrieval is implemented by re-using existing message formats from inter-systems integrations as described above.
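As a sketch, assuming canonical events arrive from the Semantic Hub, an Information Service can be as simple as a read-optimised store updated per event and queried by the channels, so core systems are never called for reads. All names below are hypothetical.

```python
# Information Service sketch: a read-optimised store fed by canonical
# event messages and queried by presentational/channel systems.

read_store = {}   # customerId -> latest canonical customer record

def on_customer_event(event):
    # called for every canonical CustomerUpdated event from the hub
    read_store[event["customerId"]] = event

def get_customer(customer_id):
    # cheap channel read; no load is placed on core systems
    return read_store.get(customer_id)

on_customer_event({"customerId": "C1", "name": "Andi", "postcode": "2000"})
print(get_customer("C1"))
```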

How is this (more) Business Oriented?

  • Systems and domains are organised according to business operations
  • Business events are used naturally and not ‘clouded’ through artificial services.
  • Systems and domains do not have to understand the semantics of other interfacing parts of the business with different semantics
  • The business owns systems and domains
  • Business domain owners enforce message compliance naturally
  • Systems do not depend on other systems to do their job
  • Core business functionality is decoupled from Channels

Is BOEI different to what you have got at the moment? Then ask yourself: where is the money going to come from (Vision & Business Architecture phases within TOGAF)?

Look at what your current architecture offers and how it restricts or empowers the business (Baseline Architecture). Then define your Target Architectures according to the Vision & Business Architecture. Finally put priorities on the tasks coming out of your Gap Analysis.

That should give you a good understanding of what’s next on your agenda.

Conclusion

The approach described in this article is not new. It’s simply best practices applied across an enterprise, using proven methodologies.
Unfortunately, in quite a number of organisations it seems as if the incumbent software vendors have a strong say on strategy & architecture, pushing for the latest software stack. And because most vendors have been very busy in recent years making their entire software stack SOA ready, they don’t really want to talk about anything other than SOA.

But there was a reason why Message-Oriented-Middleware and Message Queues were invented in the nineties. And they still are a great team! And that’s why we should favour proven and purpose built technology to equip our organisations in the best possible ways.

The BOEI concept hopefully encourages you to have a closer look at your organisation, its goals, architectures, systems, data and processes, in order to help you choose what will work best for you – instead of blindly following pretty marketing slides.

Selamat Tinggal from GA715 (just about to leave Northern Territory now),
Andy


Groundhog Day with SOA Architects

Sometimes I feel a bit like Bill Murray in Groundhog Day (except he gets paid better), ever since I started to propose what I believe to be a better alternative to pure Service Oriented Architecture as an approach to Enterprise Integration.

Let’s say the meeting starts at 10am and I arrive on time (because I am German and therefore I can ;-)) and then it goes like this:

Other: Hey Andy, I heard you are against SOA, what’s wrong with it?

Me: There’s nothing wrong with SOA; there’s just a better alternative for Enterprise Integration.

Other: Better in what?

Me: Scalability for example, SOA doesn’t scale as well as it should if it’s the backbone of an Enterprise Integration strategy.

Other: What do you mean? We just need to add some more nodes to our ESB cluster if load increases.

Me (Thinking: That’s exactly what I am talking about!): It doesn’t scale well because it mainly relies on request/response, which creates double the amount of network latency; it depends on the service-providing system being up and running; and it uses distributed transactions or compensation flows and hence requires undo-service implementations. And you can’t predict the number of service calls.

Other: That’s why SOA recommends a strong Governance, so that we know who the consumers are.

Me: You are still not able to predict the exact number of calls at any given point in time; plus, once your service endpoint is made public, there is little enforcement of who calls it.

Other: But you can force authentication/authorisation on service calls.

Me: I don’t think authentication & authorisation should be part of a standard Integration scenario inside an Enterprise.

Other: Services don’t have to be Request/Response, a [one-way] queue can be a service endpoint too!

Me (Thinking: I wonder how a getCustomerData service call might get the customer data back then!?): How does a getCustomerData service return the customer data then?

Other: <Silence>. But it doesn’t need the service providing system to be up and running if you put the service request in a queue.

Me: But then you will have to correlate the asynchronous response with the request each time. The underlying issue with SOA is that it depends on other systems to do something, whereas the Event Driven approach doesn’t.

Other: We can just use an Orchestration Engine to support long running tasks.

Me (Thinking: Funny that those Orchestration Engines always show up in an integration discussion!?): Orchestration introduces state, and that’s no good in Enterprise Integration, because it doesn’t scale or fail over easily and hence creates unnecessary overhead.

Other: You don’t want a BPEL Engine!? How do you orchestrate business processes then!?!!!

Me (Thinking: Oh no, not again!): Event Driven Architecture doesn’t need an Orchestration Engine. Whenever a business event happens, all systems that are interested in that event subscribe to it.

Other: But what about this process: <Other describes passionately a business process>…

Me (Walking to the whiteboard, throwing the first 2 markers in the bin because they don’t work): <Whiteboarding>…

Other: But what about that process: <Other or friend of other describes a more complex business process more passionately>…

Me: <Whiteboarding continues>…

Other: Ok, that was an easy example, but what about that one …

Me: <Whiteboarding continues>…

(The above “but what about/whiteboarding” loop usually happens between 2 and 5 times, until “Other” or “Friends of Other” cannot come up with a more difficult process.)

Other: I will have a think about it and come back to you with an example that cannot be solved without a BPEL Engine.

Me (Thinking: No, you won’t): Sure.

We are now already somewhere between 30 and 90 minutes into our meeting time, and we have been kicked out of a meeting room at least once because other people had booked it after us, but now…

…the meeting finally starts!

Have a great day!

Successful SOA Projects

Frankly speaking, I am a bit confused.

Gartner is saying that 75% of SOA projects are failing, whereas others (especially those who work on SOA projects) are claiming that they have delivered successful SOA projects.

What are we trying to do?

From previous articles you might remember that I am a believer in context: Are we using SOA for Application Integration? Are we writing new applications? Or is our goal to Integrate an Enterprise?

However, in this case the different contexts have a lot in common when it comes to quantifying and qualifying the success of SOA.

The following questions should help you to examine whether a SOA rollout is successful or not:

  • How many Services are there in total (provider and canonical form)?
  • Are there enough services to actually call it a SOA based solution?
  • What was the total build time per service?
    Maintaining the current speed, is it a good enough TTM?
  • How many of those services are re-used across Enterprise/Domain?
    If the solutions are just departmental, why are you calling it an ESB then, and where has the benefit of re-usability gone?
  • Do Services exist in an original and canonical form?
    If there is no canonical representation of a service you are tightly coupling domains and their systems.
  • How is the Service lifecycle governed?
    IBM has found that SOA Governance is necessary for a successful project; you might want to stop and (re-)define your governance first.
  • Who owns those services and how is a change governed?
    Who is paying for the services? And what happens if composite services have different SLAs in place? Will you pay for the SLA upgrade?
  • Who owns orchestrated services which use services from different owners?
    (see above)
  • How many versions per service are there?
    Has version hell already begun to manifest itself?
  • How is transactionality handled across services (WS-AT, No Tx)?
    Do scalability tests and find out whether the current approach is scalable enough – the sooner the better!
  • How is compensation handled? How complex is it?
    Often compensation handling is more complex than the services themselves. If that is the case you need to stop and ask whether the current SOA approach will work for you in the future.
  • What do those services do? Execute functionality (Write) or query information (Read-only)?
    Information/Query services are ok for representational gateways, not if they increase system inter-dependencies. If service calls change state, check out the point on scalability and transactions (above).
  • How do you determine what the maximum usage load will be during peak?
    You will need that to determine whether your approach is future proof.
  • How many service consumers are there per service?
    If there are only 1 or 2 consumers per service, ask yourself why: are there too many versions, is it not re-usable, or is it not mature enough yet?

If we get answers to the questions above, then we have a pretty good understanding of where the SOA rollout is at:

  • Are there only 3 read-only services? (Too early to make a call.)
  • Do we have 10 different versions per service? (Version hell is already on its way.)
  • Are there no clear ownership boundaries and contracts in place? (Governance issue.)
  • Etc.

Hope all is well!
Andy

Scalable Enterprise Integration: Events or Services – A Comparison

I am currently finding myself having discussions about the best way to provide globally operating Enterprises with a scalable Integration Architecture.

There is the Event Driven Architecture (EDA) approach, which has been around and proven for quite some time now. I believe it was one of the reasons Message Oriented Middleware was born and why IBM came up with MQ Series in the early 90s.

On the other hand there is the Service Oriented Architecture (SOA) paradigm that is nowadays lived and breathed by many vendors which have ‘SOA Ready’ software stacks to sell.

In order to not get opinionated for the wrong reasons, I had a look at the basic principles and underlying technologies.

And I was quite surprised by the result…

Consistency Model: BASE VS ACID

Verdict: EDA.
According to the CAP theorem, out of partition tolerance, availability and consistency you can only achieve two at the same time. That is why Amazon’s IaaS offering applies BASE (Basically Available, Soft state, Eventually consistent) across their globally distributed infrastructure.
A global enterprise faces similar challenges. Where systems are distributed across countries and continents, BASE is the way to go in order to integrate a global enterprise.

ACID (Atomicity, Consistency, Isolation, Durability) is important for applications and systems which have immediate consistency as a paramount objective.

Scalability: High VS Low

Verdict: EDA.
SOA emphasises service calls, which creates dependencies on all service-provisioning systems being up and running and requires distributed transactions (e.g. WS-AT) or additional compensation flows. It is hard to predict how many service calls a service provider needs to cater for during peak load.

EDA uses local transactions to put event messages on a queue. The routing of messages is done via managed subscriptions, which makes the exact number of subscribers known.
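To make the contrast concrete, here is a small sketch with a local queue standing in for real messaging middleware; the function names are hypothetical. The service call blocks on, and fails with, the provider, while the event publication completes locally regardless of whether any consumer is up.

```python
# Sketch: request/response dependency vs fire-and-forget event publication.

import queue

events = queue.Queue()   # stands in for durable messaging middleware

def call_service(provider_up):
    # SOA style: the caller depends on the provider being up right now
    if not provider_up:
        raise ConnectionError("service provider unavailable")
    return {"status": "ok"}

def publish_event(event):
    # EDA style: local hand-off to the queue, then done (fire & forget);
    # subscribers pick the event up whenever they are ready
    events.put(event)

publish_event({"type": "AccountCreated", "accountId": "A-42"})  # always succeeds
try:
    call_service(provider_up=False)
except ConnectionError as err:
    print("request/response failed:", err)
print("queued events:", events.qsize())
```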

Communication: Asynchronous & Message Oriented VS Synchronous Request-Response

Verdict: EDA.
Albeit it is true that SOA can be operated over messaging infrastructure (e.g. SOAP/MQ or SOAP/JMS), the real problem lies in the request to do something and the associated wait for a response, which means systems depend on other systems to complete a task and respond (see also Scalability).

Another drawback of SOA is that it is impacted twice by network latency, because there is always a service request plus its response, whereas EDA doesn’t need a response if none is required.

Messaging Infrastructure Setup: Complex VS Simple

Verdict: SOA.
On a pure messaging infrastructure level, SOA/HTTP wins over EDA/MOM because of technology like DNS, which helps to automatically resolve service endpoints. To be completely fair, though, we need to note that we are really comparing two pre-existing infrastructure assets (HTTP/DNS) with message queues, which are more a foundational layer of both paradigms. However, if you go down either of those two routes you need to be aware of what you can re-use.

Endpoint Layer: Infrastructure VS Application

Verdict: EDA.
In terms of (OSI-type) layering, a service endpoint requires more layers than a message endpoint.

A Service usually requires network, DNS, platform (Java/.NET), App Server, Web Server.

A message endpoint requires network and a queue infrastructure.

Ownership: System Owner VS Multiple

Verdict: EDA.
Because SOA believes in re-use through orchestration, an orchestrated service can involve multiple services provided by multiple system owners.
This can create conflict in terms of ownership and responsibility to adhere to a specific Service Level Agreement (e.g. Platinum Services).

In EDA, the system owner of an event-message-emitting system is clearly defined as the owner of that message. Any change to that message needs to be agreed with this owner.

State: Stateless VS Stateful & Stateless

Verdict: EDA.
Let’s be clear: in Enterprise Integration, you do not want state. Application architecture might need state, but even there a good application architect will try to avoid it, because it is expensive and does not scale or fail over easily. That is one of the major problems with service orchestration as soon as it requires compensation flows or distributed transactions. And that is why a Service Orchestration Engine might be of great use as part of an application or system, but not in Enterprise Integration.

In Case of Error: Local Transaction Rollback VS Distributed Transaction Rollback or Compensation Flows

Verdict: EDA
EDA relies mainly on local transactions for message delivery and pickup. Even if there are transactional clients involved the local system context remains, because – unlike SOA – there are no calls to distributed services and systems necessary.

SOA compensation flows add complexity through ‘Undo’-services in original and canonical form.

Decoupling: Canonical Message Form & Hub VS Canonical Service Endpoint & Service Bus

Verdict: Draw

Interface Contract: Message VS Service

Verdict: Draw

Provider Uptime Dependency: No VS Yes

Verdict: EDA
Another major drawback of SOA as an Enterprise Integration approach is its dependency on all the service-provisioning systems. The service providers need to be up and running and must have enough spare capacity to service the request.

EDA, on the other hand, notifies other interested systems of an event that has happened. Those systems are free to pick up the event message when they are ready and, more importantly, it does not prevent any other system from completing its task.

Consumer Uptime Dependency: No VS Yes

Verdict: Draw/EDA.

If the service response is delivered asynchronously then SOA and EDA behave in the same way.

In case you are using synchronous service invocation (e.g. over HTTP) the EDA response message can sit in a queue until the receiving system is ready to pick it up, whereas the SOA response might never be received.

Transaction Handling: Local VS Distributed

Verdict: EDA.
See “In Case of Error” above.

Message/Service Peak Load Management: Subscription Based/Predictable VS Random/Unpredictable

Verdict: EDA.
If a system X does something (e.g. Account create) 500 times per minute then we know that we will receive 500 notifications during that minute. The Integration Hub knows how many receivers this event notification has because systems subscribe to it.

If system X now calls a service 500 times then the provider of this very service might or might not have the capacity to deal with it because there is no reliable way of predicting how many service calls that provider will get during the minute from other systems.

Network Traffic & Latency: One Way Message Delivery VS Request/Response

Verdict: EDA.
Event Driven Architecture, in most cases, delivers a message to the underlying messaging infrastructure and is done (fire & forget).

Service Oriented Architecture works in a Request/Response paradigm and is hence affected twice by network latency.

De-Commissioning of Event/Service Provider: Simple VS Potentially Complex

Verdict: EDA.
Because there is no clear understanding of who is calling particular services, turning off services might have unwanted effects across the enterprise.

If an event-emitting system has been turned off, the event simply will no longer be received by other systems. Alternatively, the same event can be emitted from another system if the business deems that event necessary.

Summary

It’s quite a clear pro-Event Driven Architecture result, which is a bit of a surprise given that there is still a lot of hype and products (which one was first – hype or product?) around SOA.

But it does not mean that SOA is inherently wrong or faulty. The context for this comparison is very clearly set to ‘Highly Scalable Enterprise Architectures’ for globally operating Enterprises. There might still be a valid reason to go SOA if the context provides a fit.

Cheers,
Andreas Spanner

Banking, Integration Architectures and Customer Data Ownership

Gartner analyst Alistair Newton raised data ownership as one of the new main concerns for CIOs in the banking industry. His view encompasses handing value-added data back to the customer for re-use on other related services. Banks need to manage data internally in a way that lets the customer access and update the parts that he or she knows best, like name and address for example.
This imposes demands on a bank’s ‘Single View of Customer’. More importantly, it demands (the right!) MDM, Data Quality and Integration Architecture.
In this article we will explore different types of Integration Architectures and come up with a best fit for purpose.

A New Paradigm

“Customers Own Their Data” is quite remarkable considering the MDM and Data Quality discussions in recent years. Not everyone was convinced an MDM project was necessary at all. A lot of focus was put on functional integration (i.e. Services/SOA), less so on true data integration (MDM/DQ).

For us customers, owning our data makes sense. If my name is ‘Andreas’ but I want to be called ‘Andi’, then from a customer’s perspective the bank should comply with that wish. If you then look at the systems landscape inside large enterprises, only few seem ready to satisfy a simple wish like this. And the reasons for this are plentiful:

  • Failing or incomplete SOA projects
  • New data aggregation applications (e.g. web portals) are built, instead of integrating the data backends
  • IT projects overrun in time and budget
  • Business and IT run as disconnected entities

So?

Firstly, all the systems which require Customer data need to use the same data records of the entities, which are owned and maintained by the customer.

Why? Because, if I change my name now from ‘Andreas’ to ‘Andi’ through the Online banking interface, I’d like to see that change on my next account statement, too.

In order to do so, we need either

  1. Shared Data or
  2. Data Update Notifications

It is important to note that a central MDM Master is now becoming the MDM slave for the Customer owned entities (like Name and Address for example). Those MDM entities are now ‘mastered’ by the Internet Banking system.

‘Shared Data’ means that no Customer data, except a global identifier, should be lying around anywhere else. An MDM master can be such a data repository, but making all the legacy systems use the MDM master is a whole different story.

‘Data Update Notification’ means I receive a notification that Customer data has been updated.
Best practice is to send the whole updated record around instead of just the identifier. This saves the unnecessary overhead of subsequent getCustomerData requests from all interested systems.
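A sketch of the two payload styles makes the difference obvious; the field names are hypothetical. With the full record in the notification, every subscriber can update its local copy immediately; with the identifier alone, it has to call back to the master.

```python
# Notification payload sketch: identifier-only vs full updated record.

identifier_only = {"event": "CustomerUpdated", "customerId": "C1"}

full_record = {
    "event": "CustomerUpdated",
    "customerId": "C1",
    "name": "Andi",             # was "Andreas"
    "address": "1 Example St",
}

def on_notification(message, local_store):
    if "name" in message:       # full record delivered: update in place
        local_store[message["customerId"]] = message
    else:                       # identifier only: a getCustomerData call
        raise NotImplementedError("callback to the master system required")

store = {}
on_notification(full_record, store)
print(store["C1"]["name"])      # Andi
```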

This is what it means for systems to use the same Customer data. And it is where an Enterprise Architect should start to think about scalability.

Scalability

In a purist SOA/ESB approach we would create a single service to set & get the latest and greatest customer data. This service would ideally be unique across the whole enterprise. The peak & idle load of those service calls from all enterprise-wide service consumers would be hard to predict, and you could end up with the data repository access as the bottleneck. Sure, caching is an option, but then you need to think about cache location, cache size, cache hit and miss counts and cache-invalidation strategies. And I’d argue that this is not part of the core problems you should spend a large portion of your thinking on when solving your Enterprise Integration challenges.

Remember, the aim is to make the Account Statement System print ‘Andi’ instead of ‘Andreas’. If the system which the Account Statement System uses to get the latest Customer data is down during my update, the service call – and hence my name update – will never make it onto my statement. The best case scenario is that I will get an error message like “Dear Valued Customer, Due to an Internal Error the requested operation cannot be completed. Please try again later.” It should perhaps read “Due to the wrong Enterprise Integration Architecture the request cannot be completed.”

Solution Architecture

How about sending a ‘Customer Details Updated’ notification to all systems that hold/store my customer data? There will be only one notification per Customer update, no unmanageable subsequent service call load, caching becomes unnecessary, and if a system is down the message will be delivered through a queue whenever the system comes back online.

The thinking behind these two Solution Architecture approaches is quite different. The SOA approach is based on the ACID paradigm, whereas the EDA (Event Driven Architecture) approach is based on BASE. In case you want a refresher, Werner Vogels (Amazon CTO) has written a great article on that topic.
The ACID approach is perfectly valid in a non-distributed application architecture; on an Enterprise Integration Architecture level, things change.

Ready to hand data ownership over to the customer?

It seems as if the new way to go is to hand data ownership over to customers. Customers should be happy, and CIOs have to worry less about the Match & Merge step within their MDM and Data Quality projects, at least for the Customer-owned entities. However, as shown above, it is still paramount for CIOs to get the Enterprise Integration right first – and that is nothing new.

What’s your view?

Andreas Spanner

Enterprise Service Bus (ESB) – A Great…Marketing Term

When you talk to CIOs, one of their major concerns is Integration. CIOs wish everything would just magically work together.
Middle to large scale enterprises are more often than not organised in silos. Years of operation have made them so, in order to get things done and to stay manageable, operational and hence in business.

To ease the pain of those poor CIOs the vendor land gang has hired some Genius Marketing Gurus to come up with a really appealing term, something that literally sells itself, and so they did: Enterprise Service Bus.

What a great concept! A single bus connecting the whole enterprise! The CIOs integration problems all solved! Where do I need to sign?
Oh, hold on, did the marketing presentation mention that the binary message formats of existing messaging solutions like TibcoRV, IBM MQ, MSMQ, SolaceSystems are not compatible?
No? Ok, the message formats are not compatible. But why would a CIO care – isn’t that binary message format waaaaay too low-level to be important?

I don’t know. What do you think?

I think it is important, because it means that there is really nothing like an Enterprise Service Bus, at least not from a product perspective, unless you do a rip & replace.
But then the CIO will need to decide which ones to rip and which one to implement, and then he or she will be locked into a single vendor – for life. Which is not pretty either.

What about the architectural concepts of an ESB then?
Let’s say we simply bridge all those different messaging systems together via adapters. Then it sounds really appealing, doesn’t it? And SOA says so, too!
In a perfect world I’d say it might, but it opens a ‘can of worms’ which you would like to think about first.
And by the way, those worms have names! They are called:

  • Governance & Processes
  • Team effort of IT & Business People
  • Life Cycle Management
  • Scalability
  • High Availability
  • etc

But the two biggest challenges are less technical or implementation specific; they are of a more social nature, called:

  • Ownership (most enterprises don’t have an Enterprise Architecture team) and
  • Cultural Change (Think Enterprise! instead of silo)

Hence a good portion of energy, time and money needs to be invested in bringing different domain and silo owners together and they need to agree, too – all of them!

If they don’t, there will be no ‘Enterprise Service Bus’, not even as an architectural concept. And that is, because the whole Enterprise will not be connected through the same Bus and not all systems will be enabled to ‘magically work together’.

What’s your view?

Kindly,
Andreas Spanner