Pi Attitude Zone: Conformity & Stability

Let’s Hear It For Small Data

It’s a seductive idea:  harness the unimaginably huge streams of data housed in the market research industry’s computers and servers, and you will magically gain insight into how the world (and everyone in it) really works.  Big Data?  Bring it on!  Then all you’ll need to apply is a set of intricate algorithms and massive computing power...  Big Data will do the thinking for you!

Lots of luck with that.  In reality, thinking about so-called Big Data is riddled with fallacies and misperceptions:  for instance, the idea that merged masses of output data from different sources will be sufficiently consistent and homogeneous to yield insight.

Try melding and merging findings from different studies and repositories of knowledge, and the first thing you will discover is that they don’t fit together at all without massive manipulation and filtering.  Different definitions, different variables, different timeframes and periodicities, different sample sizes and types, different methodologies and different data-cell configurations... together they add up to massive inconsistencies in the end-product.  Add in the scariest variables of all -- the selectivity, quality, reliability, inherent biases and error-quotients of different datasets -- and the task of putting the ‘jigsaw’ together and revealing a meaningful picture slips beyond the analyst’s reach.

Unless, that is, your approach to solving a jigsaw puzzle is to get a pair of scissors and snip off the sticky-out bits of every piece -- which is more or less what happens in many massive multi-source data-merging exercises.  Sure, the bits all fit together now, but the picture is so full of holes and misallocated pieces that it no longer means anything.  Perceived correlations within the data can be totally spurious.  In the rare cases where something coherent does emerge, it is often nothing more than a dim and mysterious glimpse into the obvious.

A chain is only as strong as its weakest link.  And a string of data is only as meaningful as its most irrelevant or error-ridden strand.

Still convinced that size matters more than thoughtful analysis?  Okay, wade in if you must.  But why plough headlong into Big Data when what you really wanted all along was... the Right Data?  As a general rule, the bigger the phenomenon under study, the more precisely focused the data needed to explain it.

Pi says: in this, as in most other things, less is more. 

Zone: Conformity & Stability | Country: Multiple Geographies | Product: Business / Professional