There may be no more commonly used term in today’s IT conversations than “big data.” There also may be no more commonly misused term. Despite its emergence as a megatrend impacting virtually all aspects of the modern IT landscape, big data remains largely misunderstood, with a handful of widely accepted myths exerting significant influence on the way organizations perceive and attempt to address this disruptive trend.
There is a big difference between these myths and reality, and, as with any megatrend, there are so many voices talking all at once that it’s easy to perpetuate the wrong information. Here’s a look at the truth behind the five most common big data myths, including the misguided but almost universally accepted notion that big data applies only to large organizations dealing with great volumes of data.
The Top 5 Big Data Myths
1. Big data means having a huge volume of data.
“Big” is a relative term. Big data is most often mentioned in connection with large companies that have a sea of data, but companies of every size are experiencing data growth in some form today, even if it’s only a move from 5 terabytes to 10 terabytes of data. Twice as much data is still twice as much data. Data volume doesn’t have to reach a certain size to be relevant. Nor is it imperative to analyze all your data at the same time, so the total volume of data in your world is largely beside the point. Often, smaller subsets of your existing data can be used on their own or combined with external data to produce the desired result. The big data trend applies to all companies, regardless of how high their data pile is.
2. Only large organizations are challenged by big data.
Small organizations should be just as data-driven as large enterprises. Regardless of the size of your organization, it’s better to make decisions based on data than to simply rely on intuition or gut feelings. Smaller companies may make (or only be able to make) big data-driven decisions less frequently than their larger counterparts, but they can correct course faster once a decision has been made. So, while the trend is “big” data and some in turn assume “big” company, smaller companies can use best practices to be more data-driven and actually outpace or outmaneuver bigger, slower competitors.
3. Organizations need to write ultra-sophisticated algorithms to take advantage of big data.
It’s true that the initial work being done around big data is based on complicated algorithms written by data scientists, but in reality, taking advantage of big data is about being more data-driven. The mindset and commitment to being data-driven do not require you to be on the bleeding edge of algorithms and sophisticated analysis. You can start with just a desire to better understand the data you have, and to improve the analysis of that information. With the tooling that is now available, you don’t have to have a data scientist. Vendors are producing packaged software that creates the algorithms for you, so you don’t need custom work by a data scientist with expertise in writing complex algorithms. Many of these new solutions are designed specifically for smaller organizations that don’t have those resources on staff, or the budgets to support enormous expenditures. The capability is there for you to start small and relatively simple. Don’t think otherwise.
4. Organizations have to be great at traditional business intelligence to benefit from big data.
The traditional goals of BI were really about reporting on historical data, but you don’t have to know what happened in the past to be able to turn your eye to what’s happening right now, and what you can make happen in the future. BI reporting of the past was more static and structured, but now it’s more about patterns and relationships between data. Looking at the patterns will point you forward, so you don’t necessarily have to know what you did in the past. Practically speaking, you don’t need a good dashboarding and standard reporting framework in place to take on analysis of your data.
The misconception is that you have to get really good at doing something the old way before you can do it the new way. That’s just not true. (Did my three-year-old daughter need to know how to use a landline before using my smartphone? Not at all.) You can be data-driven and look for the patterns in your data without first mastering the traditional ways of BI. You don’t have to have a single massive data warehouse, or standard reporting that you process and look at once a month. This is just a different way of looking at your data, and the key is to think about it from the business and IT perspectives together. Start the way that big data requires you to start, with a business question that you need to answer.
5. Using Hadoop guarantees big data success.
Hadoop is simply a technology framework. It’s powerful and is changing many things in our world today. But it’s not the place to start. To be successful with big data, you have to start with the business objectives. You may or may not need Hadoop to get the answers you need. You may not need new technologies at all.
Standing up Hadoop and loading up 50 terabytes of data doesn’t get you anything by itself because you still need to have a business objective to move forward. There are companies that load their data into Hadoop and then say, “now what?” That’s backwards. Start with the business objectives instead of the technology and you likely will find yourself traveling a different road.
There are many organizations that have shattered all of these myths. They include a Midwestern U.S. university where officials knew that if they could lower dropout rates by just 1%, they could drive $1 million to their bottom line. They looked at 12 different variables, including gender and the classes in which students were enrolled. By finding the patterns and relationships in the data, they were able to determine which students would be most likely to drop out. University staff then counseled those students before they got to the point of leaving school. If this midsize school had not pursued big data analysis because the officials thought it was too small or because it didn’t have a data scientist on staff, it would have missed out on this opportunity to improve its bottom line.
In another example, a major retailer started with the basic business question of how to increase sales in its stores. There were a number of approaches to consider, most of which revolved around fairly expensive ways of getting more people into the stores. But the retailer started with how to get people to buy more once they were already in the store. So, it did some in-store analysis. The retailer mined its security video data to see what customers were doing in the store, and was able to identify a relationship between actual purchases and customers going into a dressing room. The pattern showed that if customers went into the dressing room, they were 50% more likely to make a purchase. So the company’s goal became getting more people into the dressing room. Basically, it had no need for an expensive global marketing plan. It just needed sales associates to encourage customers to try things on.
One other point worth noting is that, like many technology trends, big data tends to start with the IT department, from the bottom up, as people start playing with the technology associated with the trend. This is often the accepted way, but it’s wrong. To achieve success with big data, you need to start at the top of the organization. You need to be data-driven from the top, and trusting what the data tells you must start at the top. In the case of the retailer, for example, traditional thinking would have included a marketing campaign to get more people into the store, or spending money to improve the product. Because management was willing to simply look at patterns and relationships in the data, the company found a much more cost-effective and impactful way to answer its business question.
So take a step back and take a breath—big data is here to stay, but it doesn’t have to be daunting. Organizations of all sizes can take advantage of what the big data trend has to offer. Instead of buying into the myths, start with a simple business question, look at the data you have, and, with the right tools and the right approach, you can turn your data into the basis for informed decisions and improved business performance.
Darin Bartik is executive director of product management for Dell Software’s Information Management solutions.