Feb 16, 2017 7:00 AM

What News-Writing Bots Mean for the Future of Journalism

The Washington Post's Heliograf software can autowrite tons of basic stories in no time, which could free up reporters to do more important work.

This story is part of our special coverage, The News in Crisis.

When Republican Steve King beat back Democratic challenger Kim Weaver in the race for Iowa’s 4th congressional district seat in November, The Washington Post snapped into action, covering both the win and the wider electoral trend. “Republicans retained control of the House and lost only a handful of seats from their commanding majority,” the article read, “a stunning reversal of fortune after many GOP leaders feared double-digit losses.” The dispatch came with the clarity and verve for which Post reporters are known, with one key difference: It was generated by Heliograf, a bot that made its debut on the Post’s website last year and marked the most sophisticated use of artificial intelligence in journalism to date.

When Jeff Bezos bought the Post back in 2013, AI-powered journalism was in its infancy. A handful of companies with automated content-generating systems, like Narrative Science and Automated Insights, were capable of producing the bare-bones, data-heavy news items familiar to sports fans and stock analysts. But strategists at the Post saw the potential for an AI system that could generate explanatory, insightful articles. What’s more, they wanted a system that could foster “a seamless interaction” between human and machine, says Jeremy Gilbert, who joined the Post as director of strategic initiatives in 2014. “What we were interested in doing is looking at whether we can evolve stories over time,” he says.

After a few months of development, Heliograf debuted last year. An early version autopublished stories on the Rio Olympics; a more advanced version, with a stronger editorial voice, was soon introduced to cover the election. It works like this: Editors create narrative templates for the stories, including key phrases that account for a variety of potential outcomes (from “Republicans retained control of the House” to “Democrats regained control of the House”), and then they hook Heliograf up to any source of structured data—in the case of the election, the data clearinghouse VoteSmart.org. The Heliograf software identifies the relevant data, matches it with the corresponding phrases in the template, merges them, and then publishes different versions across different platforms. The system can also alert reporters via Slack of any anomalies it finds in the data—for instance, wider margins than predicted—so they can investigate. “It’s just one more way to get a tip” on a potential scoop, Gilbert says.
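The template-and-data workflow described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the Post's actual code: the two House phrases are quoted from the article, but the class, the function names, the sentence template, the 5-point tolerance, and the sample figures are all invented, and the Slack alert is represented by a plain print.

```python
from dataclasses import dataclass
from typing import Optional

# The two phrases below are quoted in the article; the "R"/"D" keys and
# everything else in this sketch are hypothetical.
HOUSE_PHRASES = {
    "R": "Republicans retained control of the House",
    "D": "Democrats regained control of the House",
}

# Invented narrative template with slots for structured data.
STORY_TEMPLATE = (
    "{winner} ({winner_party}) beat {loser} ({loser_party}) in {district} "
    "by {margin:.1f} percentage points. {house_phrase}."
)


@dataclass
class RaceResult:
    """One row of structured election data (hypothetical schema)."""
    district: str
    winner: str
    winner_party: str
    loser: str
    loser_party: str
    winner_pct: float
    loser_pct: float
    predicted_margin: float

    @property
    def margin(self) -> float:
        return self.winner_pct - self.loser_pct


def render_story(result: RaceResult, house_winner: str) -> str:
    """Pick the phrase matching the outcome and merge it with the data."""
    return STORY_TEMPLATE.format(
        winner=result.winner,
        winner_party=result.winner_party,
        loser=result.loser,
        loser_party=result.loser_party,
        district=result.district,
        margin=result.margin,
        house_phrase=HOUSE_PHRASES[house_winner],
    )


def find_anomaly(result: RaceResult, tolerance: float = 5.0) -> Optional[str]:
    """Return a tip if the margin is wider than predicted by more than `tolerance`."""
    if result.margin - result.predicted_margin > tolerance:
        return (
            f"{result.district}: margin of {result.margin:.1f} points "
            f"exceeds the forecast of {result.predicted_margin:.1f}"
        )
    return None


# Made-up sample data, not real election returns.
result = RaceResult(
    district="Example District 1",
    winner="Candidate A", winner_party="R",
    loser="Candidate B", loser_party="D",
    winner_pct=61.0, loser_pct=39.0,
    predicted_margin=10.0,
)

print(render_story(result, house_winner="R"))

alert = find_anomaly(result)
if alert:
    print("ALERT:", alert)  # stand-in for a message to reporters
```

Keeping the phrases in a lookup table separate from the data means editors can adjust wording without touching the code that reads the feed, which is roughly the division of labor the article describes.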

The Post’s main goal with the project at this point is twofold. First: Grow its audience. Instead of targeting a big audience with a small number of labor-intensive human-written stories, Heliograf can target many small audiences with a huge number of automated stories about niche or local topics. There may not be a wide audience for stories about the race for the Iowa 4th, but there is some audience, and, with local news outlets floundering, the Post can tap it. “It’s the Bezos concept of the Everything Store,” says Shailesh Prakash, CIO and VP of digital product development at the Post. “But growing is where you need a machine to help you, because we can’t have that many humans. We’d go bankrupt.”

Rise of the Newsbots

Three more AI-powered tools for journalists. —Greg Barber

Wibbitz

USA Today has used this AI-driven production software to create short videos. It can condense news articles into a script, string together a selection of images or video footage, and even add narration with a synthesized newscaster voice.

News Tracer

Reuters’ algorithmic prediction tool helps journalists gauge the integrity of a tweet. The tech scores each emerging story on the basis of “credibility” and “newsworthiness” by evaluating who’s tweeting about it, how it’s spreading across the network, and whether nearby users have taken to Twitter to confirm or deny breaking developments.

BuzzBot

Originally designed to crowdsource reporting from the Republican and Democratic National Conventions, BuzzFeed’s software collects information from on-the-ground sources at news events. BuzzBot has since been open-sourced, portending a wave of bot-aided reporting tools.

Prakash and Gilbert take pains to stress that the system is not here to usher reporters into obsolescence. And that brings them to the second objective of Heliograf: Make the newsroom more efficient. By removing tasks like incessant poll coverage and real-time election results from reporters’ plates, Heliograf frees them up to focus on the stories that actually require human thought. “If we took someone like Dan Balz, who’s been covering politics for the Post for more than 30 years, and had him write a story that a template could write, that’s a crime,” Gilbert says. “It’s a huge waste of his time.”

So far, response from the Post newsroom has been positive. “We’re naturally wary about any technology that could replace human beings,” says Fredrick Kunkle, a Post reporter and cochair of the Washington-Baltimore News Guild, which represents the Post’s newsroom. “But this technology seems to have taken over only some of the grunt work.” Consider the election returns: In November 2012, it took four employees 25 hours to compile and post just a fraction of the election results manually. In November 2016, Heliograf created more than 500 articles, with little human intervention, that drew more than 500,000 clicks. (A drop in the bucket for the Post’s 1.1 billion pageviews that month, but it’s early days.)

Gilbert says the next step is to use Heliograf to keep the data in both machine- and human-written stories up-to-date. For instance, if someone shares a Tuesday story on Thursday, and the facts change in the meantime, Heliograf will automatically update the story with the most recent facts. Gilbert sees Heliograf developing the potential to function like a rewrite desk, in which “the reporters who gather information write more discrete chunks—here’s some facts, here’s some analysis—and let the system assemble them.”
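One way to picture that rewrite-desk idea is a story stored as reporter-written chunks with named slots, re-rendered whenever the underlying facts change. The sketch below is an assumption about how such a system might work, not a description of Heliograf itself; the chunk wording, the field names, and the Tuesday and Thursday figures are all invented.

```python
from string import Template

# Hypothetical reporter-written chunks with named slots for facts.
CHUNKS = [
    "With $pct_reporting percent of precincts reporting, $leader leads in $district.",
    "The margin stands at $margin points.",
]


def assemble(chunks, facts):
    """Fill every chunk with the current facts and join them into one story."""
    return " ".join(Template(chunk).substitute(facts) for chunk in chunks)


# Made-up snapshots of the same data feed on two different days.
tuesday_facts = {
    "pct_reporting": 40, "leader": "Candidate A",
    "district": "Example District 1", "margin": 3,
}
thursday_facts = {
    "pct_reporting": 100, "leader": "Candidate A",
    "district": "Example District 1", "margin": 8,
}

tuesday_story = assemble(CHUNKS, tuesday_facts)
thursday_story = assemble(CHUNKS, thursday_facts)

print(tuesday_story)
print(thursday_story)
```

Because the text is regenerated from the chunks each time, a reader who opens Tuesday's link on Thursday would see Thursday's numbers without anyone editing the story by hand.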

With the rapid advances in AI technology driven by cheap computing power, Prakash sees Heliograf moving beyond mere grunt work. In time, he believes, it could do things like search the web to see what people are talking about, check the Post to see if that story is being covered, and, if not, alert editors or just write the piece itself. Of course, that’s where things could get sticky—when Facebook fired the human editors of its Trending module last year and let an algorithm curate the news, the world soon learned (falsely) that Megyn Kelly had been fired from Fox News. “Will there be controversy when the bot thinks this is important, and humans say this is important, and they’re the exact opposite thing?” Prakash asks. “It’s going to get interesting.”

The Post, like every other major news organization, is looking to tap new revenue streams, and it’s reportedly in talks to license out its CMS to clients like Tronc, a consortium that includes the Chicago Tribune, the Los Angeles Times, and dozens of other regional papers. As those newsrooms struggle with dwindling resources, it’s not hard to imagine a future in which AI plays a larger and larger role in creating journalism. Whether that’s good news for journalists and readers is another story.

Joe Keohane is a (human) writer living in New York City.

This article appears in the March issue.