Social media data can give us great insights into how public opinion is forming and changing, how issues are breaking, how people act and behave, and how consumers are receiving messages.
It’s easy to think that social media offers up an unlimited treasure trove of accurate data. But currently, the raw data from most social media platforms has some serious limitations if taken purely on its own merits, without human intelligence applied to interpret it.
Geotargeting data, for example, can be unreliable and lead to wastage, because people often connect via a proxy or VPN that could show them as being in California when they’re actually in Clapham. (Do you know where your local coffee shop’s internet connection routes through?)
Data volumes are limited, too: some platforms only let you access data going back a limited number of years, which means you’re only ever seeing a sample. Each network also has a demographic (and even a political) skew, so you could be benchmarking against data that has an inherent bias.
Data accessed through social analytics tools can be more accurate if those tools store (and segment) it over longer periods of time, but even then you could be analysing what is effectively a subset of a larger data set. And we know that, at the moment at least, sentiment analysis is notoriously inaccurate.
But the volumes of data you can get from social media still make it a worthwhile source of information. If you know and understand its limitations, you can make allowances in the insights you produce from that data. To do that, you need a human analyst to check for inconsistencies that might show, for example, that information has been faked. Faked information is most likely where the data relates to contentious or politically charged subject matter, where it might not be in someone’s interest to give real information (people’s voting intentions, for example). So in some ways, inaccuracies and misinformation in social media data actually reflect the society we live in.
Ultimately, we have to recognise social media data as a sample, not an end in itself. If we understand its role and context, and use it as part of a bigger data set, drawn from different sources, it can be hugely valuable. What’s clear is that we need human intelligence to interpret it, with all its flaws, and extract meaningful insight.