We All Need Data Literacy

We don’t usually see data; we see its interpretation. It’s on us to train our brains to look for and examine the data and remove underlying assumptions.

When I was in college, I was genuinely surprised at the number of my fellow students who struggled with basic media literacy. Professors would repeatedly remind students to look at primary and secondary sources and not just turn to AltaVista (or my personal favorite, MetaFilter; yes, I’m that old) for primary sources. My, how times have changed.

In 2006, social media sites like MySpace and the then-new Facebook were about interacting with people online. They were largely innocuous platforms and not focused on sharing news. The rampant distribution of news sources via Facebook and Twitter in the past 3 years has exacerbated the need for robust media literacy training. Unfortunately, we as a population are failing miserably at even basic media literacy. Users distribute material on social media without examining its origin, and then upvote or like that media without even reading the article. Some “publications” are obfuscating their bias through clever branding. And this is true on both the Left and the Right here in the US. For every Media Research Center on the Right there’s a Huffington Post on the Left, each proclaiming its truth through data. A large part of what’s propping up the distribution of these dubious sources is the manipulation of that data, because how data is presented shows the author’s intent. The multiplier on this problem is that likes and upvotes on articles can be bought, thus increasing the relative impact of the shared news item.

In our society we take solace in the fiction of the objectivity and truth of data. Throughout our lives we constantly hear the old maxim, “numbers don’t lie.” When you think about it, we rely on data for nearly everything in our lives. From requesting a raise at work to choosing which TV set to buy, every argument is made stronger through data. If you can prove sales increased 15% under one of your plans, then you deserve that extra 6%, dammit! The TV with the highest refresh rate and PPI will win the shootout. The car with the most horsepower will win the hearts and minds. You will be lured into believing the veracity of a claim if there’s an accompanying chart. Because charts mean data! And data is king! All hail Lord Data!

Seriously, Scott Adams said it best:

The problem here is something that our current state of mind and reliance on visual media won’t let us believe, but I’m going to say the thing that others won’t. Data, at its core, is meaningless. Data requires analysis to gain meaning. Just as algorithms reflect the bias of the humans who develop them, data analysis reflects the bias of the data scientist. The same data set can be used to argue both sides of a given question through omission or careful storytelling. Anyone who has had to make a presentation based on Google Analytics, MixPanel or any other web/app analytics program knows the power of selecting the data to present. As Jennifer L. Aaker explains in Persuasion and The Power of Story:

It’s emotion driving the decision, and we rationalize the decision afterward.

Effective communicators use data to reinforce a story, and there is no lack of resources to help marketers tell their story through data. And of course it’s always possible to tell a story from a single datapoint.

via XKCD

via XKCD

From 2014–2016 I was the head of digital marketing at YouGov, a market research / polling company. There I learned the true power of data, as well as what I would consider the moral way to analyze data and the stories around it. Being surrounded by pollsters, data scientists and other masters of analyzing enormous data sets taught me a skill that I thought I had, but had sorely oversimplified: data literacy. My time at YouGov was spent with analysts who taught me the virtue of looking at data and extracting a story, as opposed to cherry-picking data to support my story.
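To make the cherry-picking point concrete, here is a minimal sketch in Python. The monthly sales numbers are invented purely for illustration (they are not from YouGov or any real data set), and `percent_change` is a hypothetical helper, not part of any library. The same list tells two opposite stories depending on which slice you choose to show.

```python
# Hypothetical monthly sales figures, invented purely for illustration.
monthly_sales = [100, 96, 91, 85, 80, 84, 88]


def percent_change(values):
    """Percent change from the first value in the list to the last."""
    return (values[-1] - values[0]) / values[0] * 100


# The honest version: look at everything you have.
full_story = percent_change(monthly_sales)

# The cherry-picked version: start the chart at the lowest month.
cherry_picked = percent_change(monthly_sales[4:])

print(f"All seven months:       {full_story:+.1f}%")     # -12.0%
print(f"Last three months only: {cherry_picked:+.1f}%")  # +10.0%
```

Neither number is a lie. One is "sales are down 12%" and the other is "sales are up 10%", and the only difference is where the presenter decided the story begins.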

We’re moving towards a future of data. Thanks to developments in materials engineering I can go to Amazon.com and buy a 4TB hard drive for less than $100. My first computer (a hand-me-down IBM 286 XT from my parents’ office) had a whopping 10MB and we all thought that was more drive space than we’d ever need. That’s a 400,000x increase in just 30 years or so.
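In the spirit of checking the numbers yourself, here is the arithmetic behind that 400,000x figure as a quick Python sketch. It assumes decimal units (1MB = 10^6 bytes, 1TB = 10^12 bytes), which is how drive makers typically advertise capacity; the variable names are just for illustration.

```python
# Decimal (SI) units: an assumption, chosen because drives are usually sold this way.
MB = 10**6
TB = 10**12

old_drive_bytes = 10 * MB  # the 10MB drive in my first computer
new_drive_bytes = 4 * TB   # the 4TB drive on Amazon

ratio = new_drive_bytes // old_drive_bytes
print(f"{ratio:,}x")  # 400,000x
```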

Data will be used to help form our opinions and it’s already being used to sway public opinion. As we move towards the future it’s important that we all learn a little bit of data science. Learn how to locate the raw data that’s being presented to you and become critical of the underlying assumptions. Learn how to create a good survey/poll so that you can readily identify the bad ones. Harvard University has a really wonderful one-sheet for their Program on Survey Research that everyone should read.

The unintended result of the 2016 US Presidential election is that more people than ever see the need for strong media and data literacy. But the influx of data and the need for data literacy don’t have the same kind of lead time. Get out there and become a truly informed consumer of news and data.