One of the most frequent demands business executives have of marketing is to provide normalized and predictable results. You do xyz per number of times or per volume and you get abc as a result. My observation is that things are no longer that predictable, if they ever were.
For companies that think mainly in terms of putting it all on a spreadsheet and a PowerPoint presentation, social media has been a major source of consternation. It has more variables and dimensions than a 2D medium, and needs more people who can think on their feet.
While watching this short (less than 6 minutes) and funny TED Talk by Sebastian Wernicke, I found two very interesting observations that the speaker himself made in the comments. Both address statistics and their analysis, and may help you think about your next social media project.
Responding mainly to commenters who pointed out that the statistical correlations shown in the video say nothing about causation, Wernicke writes:
Never trust statistics - ever! However, I think there are at least three parts in the analysis where causation is actually plausible (then again, maybe I have spent too much time with the subject matter and start to see patterns everywhere):
1. The picture shown at 1:11 in the video is an actual correlation mapping between audience ratings. I think it makes sense that the general direction of the topic (rational vs. emotional, actions vs. ideas) should spark specific audience reactions.
2. The picture shown at 1:56 is derived from a semantic analysis (where words are automatically grouped into topics by a software tool). I think it makes sense that there is a tendency to rate those talks as your favorite that you can easily connect with emotionally.
3. Regarding the four word phrases at 3:03, it seems to me that those appearing in the most favorited TED talks are much more audience-centric than those in the least favorited TED talks.
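To make the third point concrete, here is a minimal sketch of how four-word phrases could be counted across a set of talk transcripts. It is not Wernicke's actual method or tooling; the function names and the two sample transcripts are hypothetical and exist only for illustration.

```python
from collections import Counter
import re


def four_word_phrases(text):
    """Return every run of four consecutive words in the text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + 4]) for i in range(len(words) - 3)]


def top_phrases(transcripts, n=3):
    """Count four-word phrases across all transcripts and return the n most common."""
    counts = Counter()
    for transcript in transcripts:
        counts.update(four_word_phrases(transcript))
    return counts.most_common(n)


# Hypothetical sample transcripts, not real TED data.
sample_transcripts = [
    "What I want you to know",
    "I want you to see",
]
print(top_phrases(sample_transcripts, n=1))
```

Running the same count separately on the most favorited and least favorited talks would let you compare the two phrase lists side by side, which is the comparison Wernicke describes.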
Yesterday we talked about how comments, ratings, endorsements, likes and votes are part of collaborative filtering. As you go about creating content, uploading customer success stories, and building communities, you will also provide reports that quantify the results of such activities.
Could you in fact achieve better results by tweaking the language to please the community?
If you want to learn how the data was gathered, Wernicke explains a little further down in the comments:
Most parts of data gathering and analysis were a combination of Linux scripts and a (large) spreadsheet. However, two parts of the text analysis required special linguistic analysis software:
1) The top-10 word list (you need the tool to "normalize" words so that, e.g., different verb forms will be counted as the same word).
2) The most-favorite and least-favorite topics. This is based on a so-called "semantic analysis", where words are automatically grouped into a (manually curated) topic structure.
Text analysis in relation to stock price movements is in fact already being done by several financial institutions, with computers automatically interpreting and trading on news they receive via agency tickers (e.g., see http://en.wikipedia.org/wiki/Algorithmic_trading#Issues_and_developments).
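The two text-analysis steps Wernicke mentions, normalizing words before counting them and grouping words into a manually curated topic structure, can be sketched roughly as follows. This is an illustration under loud assumptions: the crude suffix-stripping `normalize` function stands in for the real linguistic software he used (which he does not name), and the stopword list, the `TOPICS` dictionary, and all function names are made up for this example.

```python
from collections import Counter
import re

# Assumption: a tiny stopword list and a crude suffix stripper stand in
# for real linguistic normalization software.
STOPWORDS = {"the", "a", "and", "of", "to", "is", "in", "that", "it", "we"}
SUFFIXES = ("ing", "ed", "es", "s")

# Assumption: a hand-made ("manually curated") topic structure.
TOPICS = {
    "emotion": {"fear", "laugh", "dream"},
    "technology": {"computer", "data", "robot"},
}


def normalize(word):
    """Strip one common suffix so different forms count as the same word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokens(text):
    """Lowercase the text and split it into normalized words."""
    return [normalize(w) for w in re.findall(r"[a-z]+", text.lower())]


def top_words(texts, n=10):
    """Return the n most frequent normalized words, ignoring stopwords."""
    counts = Counter()
    for text in texts:
        for raw in re.findall(r"[a-z]+", text.lower()):
            if raw not in STOPWORDS:
                counts[normalize(raw)] += 1
    return counts.most_common(n)


def topic_counts(texts):
    """Count how many words in the texts fall under each curated topic."""
    counts = Counter()
    for text in texts:
        for word in tokens(text):
            for topic, vocabulary in TOPICS.items():
                if word in vocabulary:
                    counts[topic] += 1
    return counts


print(top_words(["She talks and he talked", "We are talking"], n=1))
print(topic_counts(["Robots and computers", "Fears and dreams of data"]))
```

A real project would swap `normalize` for a proper stemmer or lemmatizer, since simple suffix stripping mishandles many words (for example, it turns "loves" into "lov").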
He used open source tools to compile the data. Is this something we could start leveraging to run more complex analyses of linguistic data? I'm thinking about sentiment analysis in particular. As an aside, I found it fascinating that a whole conversation ensued in the comments about TED Talk comment ratings. Is anyone capturing that feedback?

Thanks Valeria,
While this totally challenges the thinking behind one of the few things - statistics - that clients venture to hold on to, it clarifies their relative importance. These are tough waters for clients with stretched budgets to negotiate, but such insights are invaluable.
Many thanks,
Simon
Posted by: Simon Mainwaring | May 07, 2010 at 07:27 PM
Valeria,
I've been giving this a lot of thought lately myself. I'm wondering if there is a practical need for statistics in the social media space.
I would use stats to "predict behavior" when I didn't have reliable access to the population being sampled. But with social media, I can quickly understand exactly what my community prefers and get a reasonable impression of how they will react.
I wonder what I lose by not having accurate predictive models that dissect and analyze my audience. Hmm...
Maybe a deep dive into the science of campaign polling may yield some interesting answers.
Posted by: Stanford | May 08, 2010 at 08:17 PM
@Simon - we're too quick to give more importance to one overarching strategy like lead generation, to the detriment of others like customer retention and brand conversations. Why? Because the direct marketing people come to the table with all sorts of statistics and projections. Time to start challenging those certainties. In the same way direct displaced mass advertising, marketing that is intimate and relevant in a personal and humanized way will (hopefully) displace the attitude of entitlement and pushiness of direct/lead gen tactics.
@Stanford - predictive modeling is one of the things integrated marketing groups should look at, not the only one. And yes, marketers need to stop looking at customers from behind a one-way glass window, even online, and start talking with them. Interesting thought about the science of campaign polling. Looking forward to reading what you come up with. I'm familiar with the work of Frank Luntz.
Posted by: Valeria Maltoni | May 09, 2010 at 02:21 PM