#iranelection Censored? Evaluating Twitter's Trending Topics
Written in collaboration with Devin Gaffney.
Following the anniversary of the Iran election protests on June 12th, Iranian reform protesters began accusing Twitter of censoring #iranelection when the hashtag did not trend on the site that day. Some protesters went so far as to create a Twitition calling for Twitter to "Get Rid of Censorship for #Iranelection", which had 110 signatories as of June 18th, 2010.
To support the censorship claim, the Twitition asserted that #iranelection was generating more tweets than the top Trending Topics yet still not appearing on the list:
"That day the top Trending Topics included FIFA World Cup and #worldcup as well as the oil crisis and some media stars. However, as we researched further we found that #iranelection WAS PRODUCING UP TO TEN TIMES THE AMOUNT OF TWEETS PER MINUTE THAN ANY OF THE OTHER SO-CALLED TOP Trending Topics!!"
#iranelection vs. #WorldCup vs. Bieber
To investigate the issue, we ran an informal test comparing the number of tweets for #iranelection, #worldcup and Bieber (as a contemporary high-volume Twitter constant).
We collected data from the streaming API for the three terms over two hours in the early afternoon of Friday, June 18th. To be fair, this was a full six days after the anniversary of the Iran election, so the data may differ significantly from that day's. What is clear, however, is that the World Cup is getting much more attention on Twitter than the Iran election. Additionally, whereas many people were talking about the World Cup, only a few people took part in the #iranelection tweets, and did so very vocally. As a general overview, the data suggest that the claims in the Twitition are likely an exaggeration and that the World Cup deservedly received Trending Topic status. Here's a summary of our results:
- #WorldCup: 33,166 Tweets spread across 23,396 Users
- Bieber: 10,730 Tweets spread across 7,101 Users
- #iranelection: 897 Tweets spread across 133 Users
Essentially, this suggests a small number of people using #iranelection many times over, and a much larger, organically generated #WorldCup population producing far more tweets in total than even Bieber fans. A more detailed review suggests that the claims made in the Twitition are largely groundless, based on metrics such as account creation dates and time-zone distributions, as well as raw data volume and the nature of the tweets themselves. For instance, there was a surprisingly low occurrence of "conversational" tweets (either "re-tweets" or "direct mention" tweets) in the #iranelection data set.
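To make the imbalance concrete, here is a quick back-of-the-envelope calculation of average tweets per user for each term, using only the counts from our summary. The function name is our own for illustration, and an average hides how activity is distributed across individual accounts, so treat it as a rough indicator rather than a full analysis.

```python
# Tweet and user counts taken from the two-hour collection summarized above.
results = {
    "#WorldCup": (33166, 23396),
    "Bieber": (10730, 7101),
    "#iranelection": (897, 133),
}


def tweets_per_user(tweets, users):
    """Average number of tweets posted per participating account."""
    return tweets / users


ratios = {term: tweets_per_user(t, u) for term, (t, u) in results.items()}

for term, ratio in ratios.items():
    print(f"{term}: {ratio:.2f} tweets per user")
# #WorldCup: 1.42 tweets per user
# Bieber: 1.51 tweets per user
# #iranelection: 6.74 tweets per user
```

By this measure, the average #iranelection participant posted roughly four to five times as many tweets as the average #WorldCup or Bieber participant during the collection window.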
How Do Trending Topics Get Decided?
Trending Topics, for those unfamiliar with the term, are the top 10 words (or phrases) being discussed on Twitter at any point. Twitter did not actually include Trending Topics until April 30th, 2009, a few months after their purchase of the analytics company Summize, Inc. Trending Topics are a result of Summize's initial algorithm and the subsequent improvements. The meanings behind those topics are drawn from WhatTheTrend, which crowd-sources definitions for Trending Topics from its user base. Common words like "the" and "and" are excluded, as their volume is likely the highest. Until recently, the Trending Topics favored raw volume for a topic over the immediacy of that volume. If, for example, the term "J-Biebs" had a constant rate of 10,000 tweets per minute, it would appear higher in the list of candidates for Trending Topics than something like "Earl of Sandwich", which for one reason or another jumped from 1 tweet per hour to 5,000 tweets per minute over the course of five minutes.
As of May 14th, the Trending Topics are no longer just the simple volume of traffic - although the algorithm itself is not disclosed, it is now more dependent on the immediacy or dynamic change of data over time. In Twitter's words, "Twitter is about what is happening right now," and "the hottest emerging trends and topics of discussion on Twitter are the most interesting."
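Since Twitter does not disclose the algorithm, the following is only a toy sketch of the difference between ranking by raw volume and ranking by change over time. It reuses the hypothetical "J-Biebs" and "Earl of Sandwich" rates from the example above; the function names, the stop-word list, the rate for "the", and the `min_rate` floor are all our own assumptions, not anything Twitter has published.

```python
# Words assumed to be excluded outright, as with "the" and "and".
STOP_WORDS = {"the", "and"}


def rank_by_volume(current):
    """Order candidate terms by current tweets per minute, highest first."""
    terms = [t for t in current if t.lower() not in STOP_WORDS]
    return sorted(terms, key=lambda t: current[t], reverse=True)


def rank_by_growth(previous, current, min_rate=0.01):
    """Order candidate terms by how much their rate grew, highest first.

    min_rate is an arbitrary floor that avoids dividing by zero for
    terms that were previously silent.
    """
    def growth(term):
        return current[term] / max(previous.get(term, 0.0), min_rate)

    terms = [t for t in current if t.lower() not in STOP_WORDS]
    return sorted(terms, key=growth, reverse=True)


# Tweets per minute before and after a five-minute window (hypothetical).
previous = {"J-Biebs": 10000.0, "Earl of Sandwich": 1 / 60, "the": 50000.0}
current = {"J-Biebs": 10000.0, "Earl of Sandwich": 5000.0, "the": 50000.0}

print(rank_by_volume(current))            # ['J-Biebs', 'Earl of Sandwich']
print(rank_by_growth(previous, current))  # ['Earl of Sandwich', 'J-Biebs']
```

Under the volume ranking the steady high-volume term stays on top indefinitely, while under the growth ranking the sudden spike wins, which is consistent with the shift described above.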
As a product of unfortunate timing, the gigantic and all-powerful Bieber fan population, noticing the change in the algorithm, accused Twitter of censoring Bieber-based Tweets. Twitter argued that the shift in the algorithm was to blame, and the episode demonstrated potentially new ways to think of Trending Topics. Since then, Twitter users have been crying foul when the Trending Topic list does not include their topic, as with #iranelection, or with #Flotilla, which Twitter was accused of censoring when it began to trend on May 30th, 2010.
However, these concerns are not entirely unfounded. Twitter does intervene to remove some topics, such as the racist Trending Topic #thingsdarkiessay in November 2009, a move which reportedly angered some South African users.
So Then What Topics Get Trended?
Given the data analyzed above, we can posit some other factors that may contribute to Trending Topic status: volume alone likely does not produce a Trending Topic; rather, organic volume from a broad user base does. If, for example, a few accounts suddenly pushed numerous Tweets into the system as fast as possible, all containing a similar term, then even if the Tweets looked completely organic, they would probably be omitted or at least weighted down by some factor of the number of users talking about that subject. In the case of the #WorldCup conversation, the relatively even ratio of Tweets to users is more suggestive of a dynamic conversation. The more imbalanced ratio in the #iranelection conversation, then, is more suggestive of a conversation guided by a few users with a clear agenda to push more traffic for the given topic.
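One way such a weighting could work, and this is purely our speculation rather than anything Twitter has confirmed, is to cap how many tweets any single account can contribute to a topic's score. The sketch below uses made-up accounts and topics, and the cap of two tweets per user is an arbitrary choice for illustration.

```python
from collections import Counter


def raw_volume(tweets):
    """Count every tweet for each term, regardless of who posted it."""
    return Counter(term for _user, term in tweets)


def capped_volume(tweets, cap=2):
    """Count at most `cap` tweets per user for each term (hypothetical rule)."""
    per_user = Counter(tweets)  # maps (user, term) to number of tweets
    totals = Counter()
    for (_user, term), count in per_user.items():
        totals[term] += min(count, cap)
    return totals


# Made-up data: 3 accounts posting 30 tweets each on one topic,
# and 40 accounts posting a single tweet each on another.
tweets = [(f"a{i}", "#topicA") for i in range(3) for _ in range(30)]
tweets += [(f"b{i}", "#topicB") for i in range(40)]

print(raw_volume(tweets))     # Counter({'#topicA': 90, '#topicB': 40})
print(capped_volume(tweets))  # Counter({'#topicB': 40, '#topicA': 6})
```

With raw counts the topic pushed by three prolific accounts comes out ahead, but once each account's contribution is capped, the topic with the broader user base ranks higher.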
Arguably, more concerning than potential political filtering by Twitter is the increased commercialization of a social network that has prided itself on being "a platform for you to influence what’s being talked about around the world". Recently, Twitter began considering letting advertisers insert their own topics into the Trending Topics, essentially selling Trending Topic space so that advertisers' terms could emerge as popular topics on users' homepages and sidebars. This, in essence, would give the highest bidder the power to determine what trends on Twitter, suggesting that Twitter is open to tampering with the Trending Topics algorithm under certain circumstances. Twitter already offers advertisers Promoted Tweets, where relevant advertisers' tweets are pushed to the top of the list when a user does a search on Twitter.