The adage “every company is a data company” is truer today than ever. The problem is most companies don’t realize how much valuable data they’re actually sitting on, nor how to access and use this untapped data. Yet, the only way to get the insight necessary to continuously improve a business these days is to leverage every shred of information available. Companies must exploit whatever data enters their enterprise in whatever format and from whatever source to gain a comprehensive view of their business.
Part of the problem is that most IT professionals focus all their resources on figuring out how to access structured data sources effectively. Projects associated with data warehousing and traditional business intelligence get all the attention. And, in some cases, they yield valuable insights into the business. But the fact is that structured data sources are just the tip of the iceberg inside most companies. There is so much intelligence that goes unseen and unanalyzed simply because companies don’t know how to get at it.
For that reason, forward-looking CIOs and IT organizations have begun exploring new strategies for tapping into other non-traditional sources of information to get a more complete picture of their business. These strategies attempt to gather and analyze highly unstructured data like websites, tweets and blogs to discover trends that might impact the business.
While this is a step in the right direction, it misses the bigger picture of the big data landscape. The “blind spot” in these data strategies concerns both the unstructured and semi-structured data contained in content like reports, EDI streams, machine data, PDF files, print spools, ticker feeds, message buses, and many other sources.
Understanding the content blind spot
A growing number of IT organizations now see value in the information contained within these content blind spots. The key reason: This information enhances their business leaders’ ability to make smarter decisions because much of it provides a link to past decisions.
Companies also realize that these non-traditional data sources are growing at an exponential rate. They have become the language of business for industries like healthcare, financial services and retail. For example, healthcare organizations are buried in mountains of data that they urgently need to access and report on for financial, clinical and regulatory reasons. And the challenge is getting worse with the growing number of required quality metrics and clinical data exchanges.
So where do you find these untapped sources of information? Easy: they’re everywhere. As companies have rolled out ERP, CRM and other enterprise systems, they have also created thousands of standard reports. Companies are also stockpiling volumes of commerce data from EDI exchanges. Excel spreadsheets are ubiquitous as well. And as PDF files of invoices and bills of lading are exchanged, vital data is being saved. All these sources possess semi-structured data that can reveal valuable business insight.
Another content blind spot is the information accumulating in enterprise content management (ECM) systems. They track corporate history and statutory reporting to offer a historical perspective of business.
But how do you get to these sources, and what do you do with them?
Optimizing information and visual analysis
A variety of new software technologies are emerging that enable businesses to tap into unstructured, semi-structured and structured data simultaneously. The goal is to enable next-generation analytics of any data variety, regardless of structure, at real-time velocity for fast decision making in a visual data discovery environment.