The language of contagious disease has long infected computer science.
Decades ago, information security pioneer Len Adleman of the USC Viterbi School of Engineering applied the term “virus” to malicious code that could take over computers.
More recently, the word “viral” has been used to describe the way ideas or “memes” circulate through the Internet, with the most contagious ones attaining epidemic proportions. Marketers and opinion makers of all kinds are eager to understand and control this process in order to get words or products into wide circulation.
But what disease does the spread of an idea on Digg or another social network resemble? How contagious are memes compared to, say, HIV? How do idea epidemics happen, how far do they spread and what stops them?
Kristina Lerman, of USC Viterbi’s Department of Computer Science and a project leader in the school’s Information Sciences Institute (ISI), has studied patterns of information spread on Digg, Twitter and other social networking sites. In a paper she co-authored, titled “What Stops Social Epidemics?” and presented at the fifth International AAAI Conference on Weblogs and Social Media, Lerman delves into the similarities and differences between disease and Internet epidemics.
Social epidemics, she noted, are much easier to trace than outbreaks of disease. Working out the mathematics of disease epidemiology involved intense and unrelenting house-by-house work by public health officials and doctors tracing all the contacts of each patient. Checking the spread of “like it” votes on Digg or retweets on Twitter is largely automatic and much easier.
Lerman defined an “infected” user as one who posts or recommends some content to his or her followers. By retweeting the content or voting for it on Digg, the followers themselves become infected and go on to infect others. This definition allowed Lerman and her colleagues to trace infection and information flow.
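The tracing idea described above can be illustrated with a minimal sketch. It assumes a follower graph and a time-ordered list of votes are available; the function name `trace_cascade` and the data layout are hypothetical choices made for illustration, not the method used in the paper.

```python
from collections import defaultdict


def trace_cascade(followers, votes):
    """Count how many prior exposures each user had before becoming "infected".

    followers: dict mapping a user to the set of users who follow them
               (hypothetical layout, assumed for this sketch).
    votes:     list of users in the order they voted for or retweeted a story.

    Returns a dict mapping each voter to the number of people they follow
    who had already voted by the time they voted themselves.
    """
    exposures = defaultdict(int)  # running exposure count per user
    exposures_at_vote = {}

    for user in votes:
        if user in exposures_at_vote:
            continue  # a user can only be infected once
        exposures_at_vote[user] = exposures[user]
        # The new voter now exposes every one of their followers.
        for follower in followers.get(user, ()):
            exposures[follower] += 1

    return exposures_at_vote


# Toy example: "b" and "c" follow "a", and "c" also follows "b".
followers = {"a": {"b", "c"}, "b": {"c"}}
votes = ["a", "b", "c", "d"]
print(trace_cascade(followers, votes))
```

In this toy run, "a" and "d" vote with no prior exposure, "b" votes after one exposure, and "c" votes after two. The cascade size is simply the number of entries in the returned dict.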
“We found that social epidemics look and spread very differently from diseases,” Lerman said. “Contrary to the expectations raised by the disease analogy, the vast majority of information cascades failed to reach ‘epidemic’ proportions. Rather than propagating to thousands or tens of thousands of people, as would be expected of a viral outbreak, most of the information cascades on Digg reached just hundreds of people.”
Why? The answer, according to Lerman and her team, appears to lie in the fundamental difference between exposure to disease and exposure to, say, jokes.
In disease, repeated exposure to multiple carriers increases the likelihood of infection, often drastically. In social networking, this isn’t the case. Social networkers aren’t more likely to repeat a tweet or vote for a story on Digg if eight or 10 of their contacts like it than if only one or two do. In fact, the reverse is true.
“The fundamental difference between the spread of information and disease is, despite multiple opportunities for infection within a social group, people are less likely to become spreaders of information with repeated exposure,” Lerman wrote.
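To make the contrast concrete, here is a small sketch comparing two toy exposure models. Both functional forms, the per-contact probability `p`, and the `decay` parameter are assumptions chosen for illustration; they are not taken from the paper. The second model only captures a flattening response to repeated exposure, not the outright reversal the researchers report.

```python
def independent_exposure(p, k):
    """Disease-like model: each of k exposures independently infects
    with probability p, so the chance of infection grows with k."""
    return 1.0 - (1.0 - p) ** k


def diminishing_exposure(p, k, decay=0.5):
    """Hypothetical social model: each additional exposure is only
    `decay` times as persuasive as the one before it, so the chance
    of spreading levels off after the first few exposures."""
    escape = 1.0  # probability of not having adopted so far
    for i in range(k):
        escape *= 1.0 - p * decay ** i
    return 1.0 - escape


# Compare the two models for an assumed per-contact probability of 0.1.
for k in (1, 2, 10):
    disease = independent_exposure(0.1, k)
    social = diminishing_exposure(0.1, k)
    print(f"k={k:2d}  disease-like={disease:.3f}  diminishing={social:.3f}")
```

Under these assumptions the two models agree for a single exposure, but the disease-like probability keeps climbing with more infected contacts while the diminishing model barely moves after the first two.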
“In the end, there may be a good reason for this difference,” Lerman said. “Imagine if ideas and information did indeed spread like viruses. We would all be drowning in information.”
Greg Ver Steeg, postdoctoral researcher at ISI, and USC Viterbi computer science graduate student Rumi Ghosh collaborated with Lerman on the paper.
Lerman has posted a nontechnical exposition of the research at bit.ly/LermanResearch.
The work was funded by the National Science Foundation and the U.S. Air Force.