Knowing how your customers feel about your products is arguably as important as actual sales data but often much harder to determine. Traditionally, companies have used surveys, focus groups, customer visits, and similar active sampling techniques to perform this sort of market research.
Opposition to, or lack of faith in, market research takes a number of forms. Henry Ford once said, "If I had asked people what they wanted, they would have said faster horses," while Steve Jobs said, "People don't know what they want until you show it to them." The real problem with market research is more pragmatic: It's difficult and expensive to find out what people think. Customers don't want to complete surveys, and the ones who do are not representative of the ones who don't, so you are always working with a skewed sample. In addition, it takes time to collate survey information, which leads to an information lag that can be fatal.
The emerging discipline of sentiment analysis may address the deficiencies of traditional market research by leveraging the wealth of data generated by social networks such as Twitter and Facebook, as well as customer comments on ecommerce sites such as Amazon.com.
Strictly speaking, sentiment analysis examines some input data to determine the author's attitude toward a product or concept. The core of sentiment analysis involves natural language processing: making sense of human-generated text. This processing can be as simple as counting key words such as "awesome" and "awful." "Unsupervised" sentiment analysis of this type, which relies on static rules, can provide some value but typically fails to parse ambivalent, sarcastic, or ambiguous inputs. Increasingly sophisticated techniques have been developed to more accurately parse sentiment from more complex text. Some of these solutions are developed using machine learning techniques: Initial machine sentiment guesses are validated by humans so that the algorithm "learns" to make better evaluations. In some cases, this human training is provided by crowdsourcing: farming out evaluations to labor marketplaces such as Amazon's Mechanical Turk.
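The simplest form of the static-rule approach can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the word lists and the function name keyword_sentiment are hypothetical, and a real lexicon would be far larger.

```python
import re

# Hypothetical, deliberately tiny word lists; real lexicons hold thousands of terms.
POSITIVE = {"awesome", "great", "love"}
NEGATIVE = {"awful", "terrible", "hate"}


def keyword_sentiment(text):
    """Label text by counting positive and negative key words (static rules only)."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(keyword_sentiment("This phone is awesome"))
print(keyword_sentiment("Oh great, it crashed again. Awesome."))
print(keyword_sentiment("The camera is awesome but the battery is awful"))
```

The second call shows the weakness described above: a sarcastic complaint is counted as positive because it contains two "positive" words. The third, ambivalent input cancels out to neutral, which hides both opinions.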
Sentiment analysis as an alternative to traditional market research suits some segments particularly well. Movies are a classic case, where a sentiment analysis of movie reviews can create "meta-scores" based on analysis of dozens or hundreds of reviews. Until recently, however, sentiment analysis has been of limited use in other areas.
Sentiment analysis is now becoming a hot topic as companies attempt to mine the increasing volume of information available from sites such as Facebook or Twitter. Twitter receives more than 250 million tweets per day from more than 100 million active users. Chances are, many of your customers will be tweeting, and many of them may be tweeting about your product. And Twitter's 140-character limit makes sentiment analysis relatively simple. Facebook provides an equally rich source of information, with 1 billion users anticipated by the middle of 2012.
To see sentiment analysis in action, take a look at http://twittersentiment.appspot.com. This site allows you to generate sentiment analysis for recent tweets on any keyword. Tweets used in the analysis are listed so you can evaluate the accuracy of the algorithm. As an example, the site rated 91% of tweets on SOPA (the Stop Online Piracy Act) as negative, while it rated 74% of tweets about "kittens" as positive.
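Percentages like these are simply an aggregate over per-tweet labels. As a rough sketch, assuming each tweet has already been labeled by some classifier, the roll-up might look like the following. The function name sentiment_breakdown and the sample labels are made up for illustration and do not reflect how the site itself computes its figures.

```python
from collections import Counter


def sentiment_breakdown(labels):
    """Return the share of each sentiment label as a whole-number percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}  # no tweets matched the keyword
    return {label: round(100 * n / total) for label, n in counts.items()}


# Made-up labels standing in for classifier output on tweets about one keyword.
labels = ["negative", "negative", "negative", "positive"]
print(sentiment_breakdown(labels))
```

Because each percentage is rounded independently, the shares may not always sum to exactly 100.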
Commercialization of sentiment analysis is progressing, with vendors such as WiseWindow.com offering tools that can analyze sentiment across a wide range of web sources, including blogs, reviews, comments, and social networking data.
Sentiment analysis will not eliminate the need for traditional market research techniques, but as a practical application of "big data" analysis, it's definitely a game-changer.
Guy Harrison is vice president of research and development for the Database Management business unit at Quest Software.