You are not feeling well, so you visit your friendly family doctor. He puts you in a new, electronic scanner and generates 28 trillion measurements of your temperature all over the surface of your body. He then saves all of these big-data measurements and, using advanced statistical algorithms and supercomputers, announces that your temperature is 98.6 degrees Fahrenheit. What a relief! Big data to the rescue.
The Big Data Bandwagon
As the “big data” bandwagon picks up momentum, consultants, professors, conference organizers, authors, magazines, blogs, software firms, pundits, crooks, private equity firms, and computer hardware manufacturers clamor to get aboard. But is big data an accurate picture of the future, or simply a mirage shimmering in the distant desert heat? Is it the pathway to ultimate truth, or is it only a bandwagon of exaggerated promises and illusory dreams?
The truth is that the solution to marketing and business problems, and the identification of strategic opportunities, often lie in the realm of little data, not big data. You don’t have to boil the ocean to determine its salt content. Most times a doctor only needs to take your temperature with a $20 thermometer, not a $10 million scanning machine.
Data You Can Trust
Often, without thinking, we tend to see all data as equal, but rarely is this true. The corporate world is awash in data. It streams in from all directions 24 hours a day, and the data deluge continues to worsen. Most large companies today have at least 100 to 200 times as much data in their collective databases as they had 30 or 40 years ago.
I would argue that most major corporations are making poorer decisions today than 30 to 40 years ago. In fact, the growing flood of data is part of the problem. More data means more confusion. What data can be trusted? Here are various types of data, ranked from most trustworthy to least:
1. Experimental Data. Carefully designed and carefully controlled experiments, conducted by objective third parties who are experts in such experiments, yield the most trustworthy data. Before-after and side-by-side controls are employed, along with sophisticated statistical analyses, to separate the noise from the signal.
2. Survey Research Data. Scientific research studies, conducted by experienced professionals who are objective third parties, yield trustworthy data. Often this data is experimental in nature. Research design, normative data, mathematical modeling, stimulus controls, statistical controls, historical experience, quality-assurance standards, etc., tend to make this data very precise. Noise tends to be minimal.
3. Marketing-Mix Modeling Data. The creation of an analytical database, the cleansing and normalizing of that data, and the use of multivariate statistics and modeling to isolate and neutralize some of the noise tend to make marketing-mix modeling data better than actual sales data. The signal in marketing-mix modeling data is more stable, more reliable, and more measurable.
4. Media-Mix Modeling Data. This is the same concept as marketing-mix modeling, just applied to a different set of variables. The same general rules apply. An analytic database, data cleansing, modeling, and statistics allow the noise in the data to be minimized, so that the effects of various media can be isolated. Again, if combined with controlled experiments, the data and analyses are much more explanatory.
5. Sales Data. Sales data are pretty good, but not perfect, measures of actual sales. But sales are not reliable and valid measures of advertising effectiveness, optimal media spending, product quality, service productivity, competitive activities, etc. Sales data can only be trusted so far. The noise often drowns out the signal.
6. Social-Media Data. Social-media data are very popular in corporate America. The data are comparatively inexpensive, often massive, and real-time (day by day, hour by hour). Many new software tools and systems make analyses of the data relatively easy. Social-media data, however, must always be viewed with suspicion and skepticism, because of the growing role of commercial content in all social media. The consumer’s voice is increasingly lost among the paid-for content.
7. Biometric or Physiological Measurements. Galvanic skin response, eye pupil dilation, eye-tracking, heart rate, EEG (brainwave) measurements, facial emotion recognition, etc., are very interesting and exciting, and they may one day open portals into the human soul, but for the present these measures are largely speculative and unproven.
Corporate decision makers often would be better served if they relied on tried-and-true tools and systems from the world of little data, rather than illusions from big data. Sampling theory teaches that if the sample is random, one can measure the behavior or mood of the whole by talking to very few people.
A sample of 200 to 300 respondents is generally sufficient to predict how much the whole population will like a new product or service. A sample of 200 users can test a new peanut butter in-home for a week, and from this it can be precisely determined if the product is optimal and what its market share will be once introduced.
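To put a rough number on what a sample of that size buys, the short sketch below computes the textbook margin of error for a proportion estimated from a simple random sample. The function name and its defaults are illustrative assumptions, not something from the research practice described here: it assumes a truly random sample, a 95 percent confidence level (z = 1.96), and the worst case of a 50/50 split in opinion.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate half-width of a confidence interval for a proportion.

    Assumptions (illustrative, not from the article):
    - simple random sample of size n from a large population
    - p is the true proportion; 0.5 is the worst case
    - z = 1.96 corresponds to roughly 95 percent confidence
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1.0 - p) / n)


# Compare the sample sizes mentioned in the text with a larger one.
for n in (200, 300, 1000):
    moe = margin_of_error(n)
    print(f"n={n}: +/- {moe * 100:.1f} percentage points")
```

Under these assumptions, 200 respondents give a margin of roughly 7 percentage points and 300 give roughly 6, regardless of how large the total population is. The formula covers sampling error only; it says nothing about question wording, nonresponse, or a sample that is not actually random.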
Survey research is relatively inexpensive, yet very accurate, because professional researchers know the source, stimulus, context, and history -- and have tried-and-true measuring instruments, normative data, quality assurance and controls.
Marketing research can be designed to be forward-looking and predictive, rather than backward-looking. Experienced researchers can create alternative futures and measure the relative appeal of the differing visions of the future. These professional researchers can predict the sales volume of new products within narrow tolerances, based on survey research. They can optimize the formulation of a new product via product testing. They can accurately predict the effectiveness of new commercials long before they air.
They can use qualitative research methods (ethnography, depth interviews, focus groups, and online forums) to discover unmet needs and hidden dreams that can become templates for new product development.
All of this research is based on little data. The data are derived from random sampling, carefully controlled experiments, and/or scientific surveys. The sample and sampling error are known; the stimulus is known; the questions are known; the context is understood; and the meaning of the answers is known. Despite the marketing hoopla touting big data, little data often provides a more accurate basis for sound corporate decision-making.