Politics, movies, television, baseball and revolutions. As part of its emphasis on participatory culture, USC’s Annenberg Innovation Lab (AIL), based at the USC Annenberg School for Communication and Journalism, has developed a project that offers real-time analysis on topics that thrive via social media.
The Twitter Sentiment Analysis index has been used to mine the positive and negative sentiment of 40 million tweets and has revealed insights about the international conversation on everything from the Arab Spring revolutions to the U.S. presidential election.
With the help of IBM software called InfoSphere Streams, AIL has worked with researchers from the USC Viterbi School of Engineering’s Signal Analysis and Interpretation Laboratory to develop a lexicon and advanced language algorithms that, in effect, teach a computer to understand the true sentiment behind the mini-messages broadcast by Twitter.
The trick has been helping the software “learn” the difference between enthusiasm and sarcasm, Professor Jonathan Taplin, director of AIL, told an audience gathered for “GLIMPSE: A Digital Technology Showcase” on Jan. 29.
“Sarcasm is not something computers understand very well,” said Taplin, whose political sentiment team was led by USC Annenberg Professor François Bar and Shri Narayanan, both research fellows at AIL.
Lab researchers used the tool to analyze Twitter during the entire 16-month presidential election cycle, from the beginning of the primaries to election night. Over that time, they continued to refine the technology to make it more accurate.
“When we first started, Michele Bachmann was the flavor of the week. Someone said, ‘I’m so happy Michele Bachmann threw her tinfoil hat into the ring.’ The computer, of course, thought this was very positive for Bachmann,” Taplin said, drawing laughs from the technology journalists, USC faculty and supporters who gathered for the event.
The lab then brought in more students, friends and observers to meet the challenge of parsing sarcasm. Researchers developed an online tool they could use to annotate tweets individually, to correct the computer and help it learn more about language patterns. They used Amazon Mechanical Turk, a crowdsourcing marketplace, to correct the analysis of thousands of tweets.
“We think we learned an awful lot about sarcasm,” Taplin said. For example, “When someone puts one word in quotes, it probably means just the opposite. Learning this and learning emoticons have helped us refine the work.”
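The quoted-word and emoticon cues Taplin describes can be illustrated with a minimal sketch. This is not the lab's software, which the article does not detail: the tiny lexicon, the emoticon table and the function name score_tweet are all assumptions made for illustration. The sketch adds up word scores, flips the sign of any single word wrapped in quotation marks, and counts emoticons.

```python
import re

# Hypothetical toy lexicon and emoticon table (assumptions for illustration;
# the lab's actual lexicon is not described in the article).
LEXICON = {"happy": 1, "great": 1, "love": 1, "sad": -1, "awful": -1, "hate": -1}
EMOTICONS = {":)": 1, ":-)": 1, ":(": -1, ":-(": -1}

# Matches exactly one word wrapped in straight or curly double quotes.
QUOTED_WORD = re.compile(r'["“]([A-Za-z]+)["”]')


def score_tweet(text):
    """Return a crude sentiment score: positive above zero, negative below."""
    lowered = text.lower()
    score = 0

    # A single quoted word "probably means just the opposite": flip its sign.
    for word in QUOTED_WORD.findall(lowered):
        score -= LEXICON.get(word, 0)

    # Drop the quoted words so they are not counted a second time.
    remainder = QUOTED_WORD.sub(" ", lowered)
    for word in re.findall(r"[a-z']+", remainder):
        score += LEXICON.get(word, 0)

    # Emoticons contribute their own score.
    for emoticon, value in EMOTICONS.items():
        score += value * text.count(emoticon)

    return score


if __name__ == "__main__":
    print(score_tweet('What a "great" debate'))
    print(score_tweet("What a great debate :)"))
```

A rule this simple would still miss the tinfoil-hat tweet, which has no quoted word or emoticon to flag it. That gap is presumably why the lab relied on human annotators to correct the computer.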
The lab is most excited about the real-time power of the Sentiment Analysis tool, Taplin said. During the presidential debates, the tool analyzed 400 tweets per second.
“As soon as ‘binders full of women’ came up, we added it to the key words. You could follow it coming out of nowhere. … That’s the fun thing now, to try and understand more,” Taplin said. “This is like a 10 million-person focus group. Watching these meters go up and down in real time — especially during the debates — was unbelievable.”
The tool has also been used to compare World Series conversation with TV ratings, to measure viewer engagement during the Academy Awards telecast and to predict movie box office draws.
Next up, researchers want to use what they learned while analyzing the Oscars broadcast and apply it to television. Over the next few months, they’ll be watching reality TV and gauging engagement by watching social media. It could offer more insight than Nielsen ratings, which are a snapshot of how many televisions are tuned in but not how many people are watching.
The sentiment tool, in contrast, shows you “the exact moment when people are engaged,” Taplin said.
“If the sentiment turns bad, you could look back at minute 36 and see what went wrong,” he added. “We think this is a tool producers would like.”
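Looking back at a particular minute amounts to grouping scored tweets by the minute in which they arrived. The sketch below shows one way that might be done; it is not the lab's method. It assumes tweets have already been scored and timestamped in seconds from the start of the broadcast, and the names sentiment_by_minute and worst_minute are hypothetical.

```python
from collections import defaultdict


def sentiment_by_minute(scored_tweets):
    """Average sentiment per minute.

    scored_tweets is an iterable of (seconds_since_start, score) pairs,
    an assumed input format chosen for this illustration.
    """
    totals = defaultdict(lambda: [0, 0])  # minute -> [score sum, tweet count]
    for seconds, score in scored_tweets:
        minute = int(seconds // 60)
        totals[minute][0] += score
        totals[minute][1] += 1
    return {minute: total / count for minute, (total, count) in sorted(totals.items())}


def worst_minute(scored_tweets):
    """Return the minute with the lowest average sentiment."""
    averages = sentiment_by_minute(scored_tweets)
    return min(averages, key=averages.get)


if __name__ == "__main__":
    # Made-up sample data: two upbeat tweets early on, a dip in minute 36.
    sample = [(10, 1), (50, 1), (2165, -1), (2170, -1), (2200, 1)]
    print(sentiment_by_minute(sample))
    print(worst_minute(sample))
```

Averaging within each minute keeps a burst of tweets from dominating simply because of its volume. A producer interested in volume as well could track the per-minute counts alongside the averages.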