
Twitter to Release All Tweets to Scientists: A Trove of Billions of Tweets Will Be a Research Boon and an Ethical Dilemma

Five hundred million tweets are broadcast worldwide every day on Twitter. With so many details about personal lives, the social media site is a data trove for scientists looking to find patterns in human behaviors, tease out risk factors for health conditions and track the spread of infectious diseases. By analyzing emotional cues found in the tweets of pregnant women, for instance, Microsoft researchers developed an algorithm that predicts those at risk for postpartum depression. And the U.S. Geological Survey uses Twitter to track the location of earthquakes as people tweet about tremors.

Until now, most interested scientists have been working with a limited number of tweets. Although a majority of tweets are public, if scientists want to freely search the lot, they do it through Twitter's application programming interface, which currently scours only 1 percent of the archive. But that is about to change: in February the company announced that it will make all its tweets, dating back to 2006, freely available to researchers. Now that everything is up for grabs, the use of Twitter as a research tool is likely to skyrocket. With more data points to mine, scientists can ask more complex and specific questions.

The announcement is exciting, but it also raises some thorny questions. Will Twitter retain any legal rights to scientific findings? Is the use of Twitter as a research tool ethical, given that its users do not intend to contribute to research?



To address these concerns, Caitlin Rivers and Bryan Lewis, computational epidemiologists at Virginia Tech, published guidelines for the ethical use of Twitter data in February. Among other things, they suggest that scientists never reveal screen names and make research objectives publicly available. For example, although it is considered ethical to collect information from public spaces—and Twitter is a public space—it would be unethical to share identifying details about a single user without his or her consent. Rivers and Lewis argue that it is crucial for scientists to consider and protect users' privacy as Twitter-based research projects multiply. With great data comes great responsibility.

Melinda Wenner Moyer, a contributing editor at Scientific American, is author of How to Raise Kids Who Aren’t Assholes: Science-Based Strategies for Better Parenting—from Tots to Teens (G. P. Putnam’s Sons, 2021). She wrote about the reasons that autoimmune diseases overwhelmingly affect women in the September 2021 issue.

This article was originally published with the title “Twitter Opens Its Cage” in Scientific American Magazine Vol. 310 No. 6, p. 16
doi:10.1038/scientificamerican0614-16