A leading supplier of data integration software for businesses, finding that its developers were spending too much time grappling with data management inside each of its products, moved to a service-oriented architecture (SOA) and built its own data services platform (DSP). Before long, however, problems arose that required a complete rebuild of the architecture. The underlying cause of those problems? Poorly architected data access.
This company's experience is unfortunately all too common. Many organizations are learning the hard way that SOA projects cannot succeed without first cleaning up data quality and definition issues, and that data access must be factored in at the planning phase as a component of the SOA's foundational architecture.
No question about it: SOA is here to stay. Nearly every company with a sizeable IT organization is embracing it as an important enabler of business transformation. SOAs are being implemented for purposes such as delivering messages over Enterprise Service Buses, communicating with outside vendors through web services for supply chain management, providing customer service portals, exposing legacy mainframe assets as web services, and building new applications by deploying business functionality as services.
The problem in many cases is that data system architects are falling back on traditional approaches to solve new kinds of challenges. The data access in their SOAs is layered on top of traditional data access APIs (ODBC, JDBC, ADO.NET, and the like) that all essentially enable applications to connect to a data source, issue queries, and receive data back. These APIs are employed to provide fast, scalable access to relational data and to repurpose business knowledge residing in legacy systems.
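To see that coupling concretely, consider the canonical JDBC pattern. The sketch below is purely illustrative (the driver URL, credentials, and schema are hypothetical): the application binds directly to one relational source, blocks on the query, and must manage connection and cursor state the whole time.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class TraditionalAccess {
    public static void main(String[] args) throws Exception {
        // Tightly coupled: the URL, driver, and SQL dialect are all
        // specific to one relational source (hypothetical here).
        try (Connection con = DriverManager.getConnection(
                "jdbc:vendor://dbhost:1521/orders", "app_user", "secret");
             PreparedStatement ps = con.prepareStatement(
                "SELECT id, status FROM orders WHERE customer_id = ?")) {
            ps.setInt(1, 42);
            // Connection-based and synchronous: the caller blocks on the
            // query and must hold the connection open while reading rows.
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getInt("id") + " " + rs.getString("status"));
                }
            }
        } // Connection and cursor state are torn down here.
    }
}
```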
In a nutshell, the characteristics of traditional data access APIs are at odds with those of a well-designed SOA. Ideally, an SOA is loosely coupled to data sources, stateless, often disconnected (as in web services), and supports both synchronous and asynchronous access. Traditional data access, by contrast, is tightly coupled to data sources, built around a complex state machine, connection-based, and largely synchronous. An SOA accesses information from aggregated views of multiple sources and is agile in that service reuse is encouraged across the enterprise; it allows reuse of existing software assets and gives disparate IT systems an architecture for meeting the goals of abstracted business processes and programming paradigms. The traditional strategy, on the other hand, is to write code against a single data source, deployed in an application-specific data silo.
So how is this challenge addressed? The recommended approach is to implement a data services layer that incorporates the Business Object Model (BOM) concept in common data models: objects that mediate and/or abstract a data service away from the underlying complexity of heterogeneous data sources, programming languages, platforms, and architectures. Data services, in other words, are virtual data entities aggregated from sources across the enterprise. The data services approach offered by some vendors removes the need to construct workflows or write code by hand, enabling the automation of data service creation and maintenance. Consumers of data services can include portals, business processes, other data services, and web applications.
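As a minimal sketch of the concept, with purely hypothetical names: consumers program against a Customer business object, while the service implementation is free to assemble it from a CRM database, a mainframe transaction, and a billing web service behind the scenes.

```java
import java.util.List;

// Hypothetical data service contract: consumers see the business object
// model, never the underlying sources or their query languages.
public interface CustomerDataService {
    Customer findCustomer(String customerId);
    List<Customer> findByRegion(String region);
    void saveCustomer(Customer customer); // read/write, not read-only
}

// The Customer business object is a virtual entity whose fields may
// physically live in different systems across the enterprise.
class Customer {
    String id;
    String name;          // e.g., sourced from a CRM database
    String creditRating;  // e.g., sourced from a mainframe transaction
    double balance;       // e.g., sourced from a billing web service
    // getters and setters omitted for brevity
}
```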
Emerging as the industry standard for data access in an SOA is Service Data Objects (SDO). SDO is a simplified programming model that allows application developers to access and manipulate heterogeneous data in a uniform manner through data services. Originally developed as a joint collaboration between BEA and IBM to simplify the J2EE data programming model and give J2EE developers more time to focus on the business logic of their applications, the specification reached version 2.0 in 2005 under a collaborative standards group comprising many leading ISVs. SDO 3 is now being standardized within the OASIS consortium, as are other web services standards.
With SDO, familiarity with a technology-specific API is not necessary in order to access and use data. Developers need to know only one API, the SDO API, which lets them work with data from multiple data sources, including relational databases, XML documents, web services, the J2EE Connector Architecture, mainframe transactions, packaged applications, EJB components, and others.
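A brief sketch of that uniform API, assuming a hypothetical OrderService that hands back a DataObject; everything else shown is the standard commonj.sdo interface, and the same calls apply whether the order lives in a database, an XML document, or a mainframe system.

```java
import java.util.List;

import commonj.sdo.DataObject;

public class SdoClient {
    // Hypothetical data service; how a DataObject is obtained is
    // platform-specific, but what you do with it afterward is not.
    interface OrderService {
        DataObject getOrder(String orderId);
    }

    public void updateOrder(OrderService orderService) {
        DataObject order = orderService.getOrder("PO-1001");

        // Navigate and read with simple path expressions; no SQL,
        // XQuery, or source-specific API in sight.
        String status = order.getString("status");
        String city = order.getString("customer/address/city");

        // Writes use the same uniform model.
        order.setString("status", "SHIPPED");

        // Multi-valued properties come back as lists of DataObjects.
        List items = order.getList("items");
        DataObject first = (DataObject) items.get(0);
        System.out.println(first.getString("sku") + " " + status + " " + city);
    }
}
```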
SDO provides a framework within which applications can be built, all of them consistent with the SDO model. The framework also incorporates a good number of J2EE patterns and best practices, such as the Data Transfer Object (DTO) pattern, which makes it easy to fold proven architectures and designs into an application. Web applications are a good example: most today are not, and cannot be, connected to backend systems all of the time. For this reason, SDO supports a disconnected programming model, sketched below.
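Here is a rough sketch of that disconnected cycle, assuming a hypothetical data access service (OrderDas) that produces and later applies a DataGraph; the ChangeSummary logging shown is standard SDO.

```java
import commonj.sdo.ChangeSummary;
import commonj.sdo.DataGraph;
import commonj.sdo.DataObject;

public class DisconnectedUpdate {
    // 'OrderDas' is a hypothetical data access service; DataGraph and
    // ChangeSummary are standard commonj.sdo types.
    interface OrderDas {
        DataGraph fetchOrders(String customerId);
        void applyChanges(DataGraph graph);
    }

    public void reviseOrder(OrderDas das) {
        // 1. Fetch a graph of data, then disconnect from the backend.
        DataGraph graph = das.fetchOrders("customer-42");

        // 2. Work offline: changes are logged, not sent to any source.
        ChangeSummary changes = graph.getChangeSummary();
        changes.beginLogging();
        DataObject order = graph.getRootObject().getDataObject("orders[1]");
        order.setString("status", "CANCELLED");
        changes.endLogging();

        // 3. Reconnect and hand the logged changes back; the service
        //    turns them into the appropriate updates (SQL, MQ, etc.).
        das.applyChanges(graph);
    }
}
```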
XML, too, is a key driver of SDO and is both supported by and integrated into the framework. XML Schema (XSD) is frequently used to define the business rules embodied in an application's data format. And XML itself is critical to facilitating interaction in web services, which use XML-based SOAP as the messaging technology.
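The standard SDO helpers make this integration concrete: XSDHelper turns an XML Schema into SDO types, and XMLHelper moves whole data graphs in and out of XML. The file names and namespace below are illustrative.

```java
import java.io.FileInputStream;

import commonj.sdo.DataObject;
import commonj.sdo.helper.XMLHelper;
import commonj.sdo.helper.XSDHelper;

public class XmlIntegration {
    public static void main(String[] args) throws Exception {
        // Define SDO types from an XML Schema (file name is illustrative).
        XSDHelper.INSTANCE.define(
                new FileInputStream("purchaseOrder.xsd"), null);

        // Load an XML instance document into a graph of DataObjects.
        DataObject po = XMLHelper.INSTANCE
                .load(new FileInputStream("po1001.xml"))
                .getRootObject();
        po.setString("shipTo/city", "Boston");

        // Serialize the (possibly modified) graph back to XML, e.g. to
        // place on the wire inside a SOAP message.
        String xml = XMLHelper.INSTANCE.save(
                po, "http://example.com/po", "purchaseOrder");
        System.out.println(xml);
    }
}
```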
Incorporating SDO in the architecture of a data services platform is an excellent approach, and one employed by several vendors currently offering DSP software on the market. Whether an organization's IT division decides to go with a commercially available third-party solution or build its own DSP, several critical factors must be weighed against the organization's particular needs: heterogeneity; read/write data access that embraces all types of applications and supports the common data model; and enterprise-class scalability, performance, security, and manageability.
For organizations of any significant size or reach, heterogeneity is the primary concern. The DSP should be able to support all application interfaces, all data sources, and all technology platforms, including legacy mainframes where they exist. Organizations opting for a third-party DSP need to watch carefully for certain drawbacks here. Many offerings in the Enterprise Information Integration (EII) category, for instance, are mostly read-only and frequently support only a limited breadth of data sources or platforms. The same limitation applies to several pure-play SDO solutions on the market, which frequently are not truly enterprise-ready.
The DSP offerings of the large, well-established platform vendors might make sense for organizations already heavily invested in them, but here, too, wariness is called for: such vendors tend toward lock-in and toward bias in favor of their own particular technology environments.
Keep in mind that with SOA the idea is to allow reuse of existing software assets, provide an architecture for disparate IT systems, and meet the goals of abstracted business processes and programming paradigms. The only bias in play here should be a bias toward avoiding vendor lock-in like the plague.