There are good days when I think people are just unbelievably naive, then there are the bad days when I’m convinced they hate science because facts pose a threat to their schizophrenic way of life.

Have you noticed that the vast majority of web content nowadays is not about facts, it’s about the reaction of social media to facts?

Jonathan Franzen is writing a new book? We don’t talk about the book itself, we talk about people mocking him because he hates social media.

Benedict Cumberbatch is getting married? We don’t talk about the important things: will he be wearing a full morning coat? Will she be wearing a little-known designer who will become her signature designer and consequently a big hit? No, we talk about the reaction of frantic Cumberbitches.

Jeanette Winterson stages a dead rabbit online? We don’t talk about her openly supporting Countryside Alliance politics, no, we talk about thousands of rabbit lovers insulting her on Twitter.

But how do we know that hundreds of people are making fun of Franzen, that thousands of Cumberbitches are in despair, that millions of rabbit lovers are insulting Winterson online? We know it because someone told us on the web, of course.

But is it true? Well of course it must be, uh, the journalists will have checked their sources because this is what journalists do, isn’t it?

Check the sources.

Now, bear with me. The sources.

The web.

Journalists will have read the entire web, processed the data statistically, and presented it to you for your evaluation.

I hope you realize it is not possible to do such a thing, at least not yet, even with state-of-the-art technologies. Right now, big data sentiment analysis is arguably the most challenging task in computational linguistics. Not machine learning, as you may think, but sentiment analysis.

If you really want to know what the web is saying about a specific subject, and you want to know it for real, factually, with sound scientific value, what you can do is buy a big data sentiment analysis engine (or code it yourself HA HA HA GOOD LUCK WITH THAT), ideally a linguistic one, configure it so that it monitors a limited number of sources under controlled circumstances, then process the data statistically, later maybe have the output arranged in a nice infographic, and finally study it to obtain objective data for your strategic purpose.

This is what you can do if you want your “oh the web says this the web says that” to actually mean something, and at that point you will probably want to do something more useful with the precious knowledge you have acquired than giving it away on the web just for the sake of social networking.

All the rest is anecdotal. If it’s presented to you as anecdotal, it can still be a good piece of writing, provided they manage to keep it unbiased. Otherwise, when they’re telling you “this is what people say on the web” they are invariably trying to manipulate you, you gullible social media dweller.

The truth could also be the exact opposite. Most people online could actually be finding Franzen’s opinions on social networks totally irrelevant, most Cumberbitches (whatever they are) could be eager to see Cumberbatch’s groom attire, most people online could be supportive of Winterson’s pro-hunting views. The point is that there is no way to know unless you are prepared to spend some serious money, and even then you will still have to deal with some margin of error.
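
For the curious, here is roughly what that “tolerance” looks like once you have done the homework. This is only a minimal sketch, not how any real sentiment engine works: it assumes you have drawn a random sample of posts from your controlled sources and that each one has already been labelled, by a person or by an engine, as mocking or not. The function name `share_with_margin` and the numbers are made up for illustration, and the formula is the plain normal approximation for a proportion at roughly 95% confidence.

```python
import math


def share_with_margin(hits: int, sample_size: int, z: float = 1.96) -> tuple[float, float]:
    """Estimate a share from a random sample, with its margin of error.

    Uses the normal approximation for a proportion; z = 1.96 corresponds
    to roughly 95% confidence. Only meaningful if the sample is random.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= hits <= sample_size:
        raise ValueError("hits must be between 0 and sample_size")
    p = hits / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, margin


# Hypothetical numbers: 1,000 randomly sampled posts, 230 labelled as mocking.
share, margin = share_with_margin(230, 1000)
print(f"{share:.1%} of sampled posts, give or take {margin:.1%}")
```

Even in this toy version the answer comes with a “give or take”, and it says nothing about posts you never sampled or labels that were wrong in the first place. A headline announcing that “the web is furious” comes with neither a sample nor a margin.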