Polling Twitter

“With the dramatic rise of text-based social media, millions of people broadcast their thoughts and opinions on a great variety of topics. Can we analyze publicly available data to infer population attitudes in the same manner that public opinion pollsters query a population? If so, then mining public opinion from freely available text content could be a faster and less expensive alternative to traditional polls. (A standard telephone poll of one thousand respondents easily costs tens of thousands of dollars to run.) Such analysis would also permit us to consider a greater variety of polling questions, limited only by the scope of topics and opinions people broadcast. Extracting the public opinion from social media text provides a challenging and rich context to explore computational models of natural language, motivating new research in computational linguistics.”

In a paper to be presented later this month at the International Conference on Weblogs and Social Media of the Association for the Advancement of Artificial Intelligence, Noah Smith and colleagues at Carnegie Mellon University suggest that analyzing the text of large numbers of tweets may prove to be a less expensive, faster and more flexible means of gauging public opinion on some subjects than traditional public opinion polling.

Working from one billion publicly available Twitter messages posted in 2008 and 2009, as many as seven million per day, the researchers used basic text analysis techniques to select the messages relevant to a given subject, then examined the words in those messages to classify them as expressions of positive or negative sentiment.
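For readers curious what such an approach might look like in practice, here is a minimal Python sketch of the two steps described above: keep only the messages that mention the topic, then count how many contain positive or negative words on each day. The word lists, the `tokenize` helper and the `daily_sentiment` function are hypothetical and chosen purely for illustration; they are not the researchers' actual lexicon or code.

```python
from collections import defaultdict

# Hypothetical word lists for illustration only; not the researchers' lexicon.
TOPIC_WORDS = {"economy", "job", "jobs"}
POSITIVE_WORDS = {"good", "great", "hopeful", "improving"}
NEGATIVE_WORDS = {"bad", "worried", "terrible", "losing"}


def tokenize(text):
    """Split on whitespace, strip surrounding punctuation, and lowercase."""
    return [word.strip(".,!?;:\"'()").lower() for word in text.split()]


def daily_sentiment(messages):
    """Count positive and negative on-topic messages per day.

    messages: iterable of (day, text) pairs.
    Returns a dict mapping day -> (positive_count, negative_count).
    A message containing both kinds of words counts toward both totals.
    """
    counts = defaultdict(lambda: [0, 0])
    for day, text in messages:
        words = set(tokenize(text))
        if not words & TOPIC_WORDS:
            continue  # message does not mention the topic; skip it
        if words & POSITIVE_WORDS:
            counts[day][0] += 1
        if words & NEGATIVE_WORDS:
            counts[day][1] += 1
    return {day: tuple(pair) for day, pair in counts.items()}
```

Given a list of `(day, text)` pairs, `daily_sentiment` returns one pair of counts per day. A daily series like that is the kind of figure that could then be set beside poll results; how best to combine or smooth the counts is a separate design choice that this sketch leaves open.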

The results were then compared with established, reputable polls, including the Index of Consumer Sentiment (ICS) from the Reuters/University of Michigan Surveys of Consumers and the Gallup Organization’s Economic Confidence Index.

The authors conclude: “we find that a relatively simple sentiment detector based on Twitter data replicates consumer confidence and presidential job approval polls. While the results do not come without caution, it is encouraging that expensive and time-intensive polling can be supplemented or supplanted with the simple-to-gather text data that is generated from online social networking. The results suggest that more advanced NLP techniques to improve opinion estimation may be very useful.”

For much more detail, see the complete 8-page paper, “From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series.”

Published on May 11, 2010 at 4:01 pm
