Case Study 1: The Sharing Project

Three years ago, Jinghui Zhang, chair of the department of computational biology at St. Jude Children’s Research Hospital in Memphis, tried to download a set of 1,547 samples from a government database so she could run an experiment on it. The download took a year and a half. Every time Zhang’s team thought they had the complete set of data, they performed a statistical check that was supposed to catch any transmission errors. Seven times, the check failed. Seven times, the team started working only to realize that part of the set was missing, and they’d have to start all over. “Data downloading is a very painful process,” Zhang says. “It can potentially drive researchers away from science. Period.”


Zhang had already started to wonder how much faster cancer research could move if data sets were freely and immediately available to computational biologists who wanted to work on them, so in April she partnered with Microsoft to release St. Jude Cloud, a web-based data processor that offers access—within 48 hours—to the largest publicly available set of childhood cancer stats in the world, the whole genomes of more than 5,000 St. Jude patients. The sharing is so generous that it defies academic logic. “People think that there are strings attached to this,” says Zhang. “They don’t believe that we would do such a thing.”

Case Study 2: Machine Learning for Doctors

St. Jude is not the only organization that has seized on sharing and collaboration as a solution to stalled cancer research. In January 2018, Microsoft also partnered with the charity Stand Up to Cancer on an $11 million research program called Convergence 2.0 to solve a problem that is essentially the reverse of the one Zhang faced: researchers who have data sets or ideas worth studying, but little or no computer programming expertise.

“There was hardly a cancer center in the world that had the same people that Facebook, Microsoft, Amazon, and Google were getting to do deep-learning algorithms. These are two non-overlapping groups of people,” says Arnie Levine, professor emeritus at the Institute for Advanced Study and co-vice chairman of Stand Up to Cancer’s scientific advisory committee. “So what we did was we made the marriage.” Convergence 2.0 funded seven teams of researchers and medical doctors, matching each with machine-learning experts from places like Microsoft and MIT.

Case Study 3: Sharable Tumor Samples

Meanwhile, for clinicians, a new company called Paige.AI made an agreement with Memorial Sloan Kettering Cancer Center in New York to train machine-learning algorithms on its collection of 25 million digitized tumor slides. The software will eventually be able to help doctors all over the country diagnose cancer from biopsies the same way a top MSK pathologist would, only with even more accuracy, objectivity, and information about treatment options and potential survival.


This appears in the September 2018 issue of Popular Mechanics.

Jacqueline Detwiler-George
Jacqueline Detwiler-George has a master's degree in neuroscience and has contributed to Wired, Esquire, Fast Company, and Best American Science and Nature Writing. She's the former articles editor at Popular Mechanics and former host of the Most Useful Podcast Ever.