Predictive Analytics - sometimes referred to as Predictive Data Mining - is a branch of Business Intelligence that uses historical data to make predictions about future events. At its simplest, predictive analytics relies on statistical techniques, such as correlation and regression, which many of us encountered in college or even high school. Correlation analysis measures how strongly two variables are related and whether that relationship is statistically significant. For instance, among children, height and age are highly correlated, while IQ and height are very weakly correlated. Regression attempts to find an equation relating two or more variables, so that you can predict one from the others.
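To make correlation and regression concrete, here is a minimal Python sketch that computes a Pearson correlation coefficient and fits a straight line by ordinary least squares. The ages and heights are made-up illustrative numbers, not real measurements, and the function names (pearson_r, fit_line) are hypothetical helpers written for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample: ages in years and heights in centimetres for five children.
ages = [2, 4, 6, 8, 10]
heights_cm = [86, 102, 115, 128, 138]

r = pearson_r(ages, heights_cm)               # strength of the relationship
slope, intercept = fit_line(ages, heights_cm)

# Use the fitted equation to predict height for an age not in the sample.
predicted_height = slope * 7 + intercept

print(f"r = {r:.3f}")
print(f"height = {slope:.2f} * age + {intercept:.2f}")
print(f"predicted height at age 7: {predicted_height:.1f} cm")
```

The correlation coefficient says how tightly the points follow a line; the fitted equation is what lets you predict one variable from the other. Neither says anything about whether the relationship will hold outside the range of the data.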
More complex mathematics - including multivariate techniques (analyzing many variables at once), non-linear methods, fuzzy logic and even neural networks - is used to generate more elaborate probability trees and more sophisticated predictions.
These sorts of techniques have been commonplace in the social sciences for a very long time, but only relatively recently have they found practical application in the business world. Increasingly, predictive analytics is being used for critical business purposes, such as:
- Predicting when a customer might be about to move to a competitor
- Predicting whether a customer is a likely upsell opportunity
- Predicting loan default rates and identifying customers in danger of defaulting
- Predicting future stock prices
- Identifying fraudulent activities by flagging transactions that deviate from predictions (for instance, credit card transactions that deviate from past behavior can be flagged and investigated for possible fraud)
- Determining risk of illness either to determine insurance premiums or - more helpfully - to arrange preventative medical programs
- Using your date of birth to determine your personality type (No, sorry, that's astrology!)
These activities all directly affect profitability and revenue generation. Consequently, predictive analytics is experiencing rapid adoption and - if you believe the marketing hype - a new style of predictive enterprise is emerging. In the predictive enterprise, predictive analytics allows companies to become more proactive, agile and responsive to customer needs.
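The fraud-detection item in the list above lends itself to a small illustration. The following Python sketch flags card transactions whose amounts deviate sharply from a customer's past behavior, using a simple z-score rule. The transaction amounts, the threshold of three standard deviations and the function name flag_unusual are all assumptions made for this example; real fraud systems use far richer models and many more signals than amount alone.

```python
from statistics import mean, stdev


def flag_unusual(history, new_amounts, threshold=3.0):
    """Return the new amounts lying more than `threshold` sample standard
    deviations from the mean of the customer's past amounts."""
    mu = mean(history)
    sigma = stdev(history)
    flagged = []
    for amount in new_amounts:
        if sigma == 0:
            # No variation in the history: anything different is unusual.
            is_unusual = amount != mu
        else:
            is_unusual = abs(amount - mu) / sigma > threshold
        if is_unusual:
            flagged.append(amount)
    return flagged


# Made-up purchase history for one customer, plus three incoming transactions.
past_amounts = [42.0, 38.5, 55.0, 47.25, 40.0, 51.75, 44.5, 49.0]
incoming = [46.0, 950.0, 52.0]

suspicious = flag_unusual(past_amounts, incoming)
print("flag for review:", suspicious)
```

Here the "prediction" is simply that new purchases will resemble old ones. A flagged transaction is not proof of fraud, only a candidate for investigation.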
Not surprisingly, the vendors that focus most heavily on statistical analysis, such as SAS and SPSS, have the most advanced predictive analytics techniques. These statistically-oriented companies are often found in partnership with more mainstream Business Intelligence vendors, such as Cognos and Business Objects, who are trying to capitalize on the increasing interest in predictive technologies and integrate their data warehousing assets with predictive methods. However, these alliances may weaken as the BI and ERP vendors develop more native predictive technologies.
Although predictive analytics offers a great deal, relying too much on predictive models can be unwise. A form of predictive analytics was used by financial institutions to predict the risk - and, hence, the value - of the complex derivatives that were implicated in last year's disastrous financial meltdown. Unfortunately, the predictive analysis was based on the assumption that the future would be like the past, with ever-increasing home values and very low mortgage default rates. When these assumptions failed, the predictive analysis became little more than a sophisticated example of the old computer science adage: Garbage In, Garbage Out. Triple-A mortgage-backed securities rapidly transformed into toxic assets, and the world entered the most severe economic downturn of our generation. The lesson: even the most sophisticated model is based on assumptions, and these assumptions need to be understood and validated.