Stephen Friend is transforming the culture and practice of biomedical research to align with and support health outcomes.
In the life sciences, the existing approach to conducting research is constrained by the traditional reward structures of academia and industry—publishing in the former and patenting in the latter. These proprietary practices slow advances in our understanding of disease because they reinforce data hoarding rather than data sharing.
Stephen is introducing a new way of working for the research community based on collaboration among genomic and biomedical scientists in various settings in order to speed research, treatments, and cures. At the core of the idea is a “precompetitive commons,” a space where researchers can convene, interact, give and take basic research, and build upon one another’s insights in an environment governed by neither academia nor industry. It’s a novel combination of two existing concepts. The first is the Commons—the idea, most famously expressed by Creative Commons, of a community where information is shared in order to achieve greater social benefits. The second is a precompetitive space—the idea that there is basic research that informs all efforts to improve the health of people and families.
Begun in 2009, the Commons is an online, open-access repository of data sets and models contributed by scientists in academia and industry—Stephen hopes that it will become for the life sciences what Wikipedia is for encyclopedic knowledge. But this platform is only one piece of the Commons, which is, importantly, a set of norms and practices—a new culture of working for the health of patients (and we will all be patients someday).
It has been twelve years since the completion of the Human Genome Project, the sequencing of three billion human genetic units that some experts hoped would yield a wealth of insights of immediate benefit to human health. Yet the medical promise of the project, and of the explosion of biological data that has occurred in the past several decades, remains largely unrealized. While data is increasing exponentially, the number of new drug therapies and novel diagnostic tools is not. In fact, though last year major pharmaceutical companies spent nearly twice as much on research and development as they had in the year the Human Genome Project was made public, the rate of drug approval has remained constant.
We seem to have stalled—why? To some degree, it’s because common diseases are more complicated than scientists had predicted. Unlike rare diseases, which tend to map to a variant of a single gene, common diseases seem to involve an interaction of many different biological factors and many rare genetic variants. But complex biology is only half of the reason. The other half is that the existing drug development model has fundamental flaws. With the cost of bringing a single drug to market at $800 million, many in the pharma industry are beginning to see that this model may not work for long. The social cost of the inefficiency is enormous: delayed cures and health for people and families.
At the root of the inefficiency in drug development is the fact that the current research paradigms in the two major biomedical communities—academia and industry—see basic biological information not as precompetitive but as a proprietary asset. The reward structures in these environments are designed to foster academic and commercial gains, not advances for health. For academic scientists, publishing articles in peer-reviewed journals is the currency for gaining promotion and, ultimately, tenure. This “publish or perish” environment hardly facilitates data sharing. Instead, it promotes the stockpiling of data until the point of publication. Highly valuable information about research processes—for instance, mathematical models or lab techniques a researcher may have tried that ultimately failed—often gets lost.
In the world of biotech and pharmaceutical companies, the patent takes the place of the publication, and research is aggressively proprietary. Each company works in its own silo, for obvious reasons. This means that major companies often perform duplicative research and separately devote resources to “inventing” the exact same tools, databases, models, and techniques.
Given the nature of research today, no single team—whether in academia or industry—has the resources to make a significant discovery on its own. Getting to faster cures and treatments will require collaborative approaches to gathering and organizing basic research in order to minimize duplicative research and maximize the output per public and corporate dollar spent.
An open source culture has significantly shaped other realms of information sharing—Wikipedia is perhaps the most familiar example. Other branches of science have experienced the shift toward a collaborative community: When physics, for instance, reached the point of needing shared resources—capital-intensive equipment such as particle accelerators and larger-than-life telescopes—the physics community adjusted, not because physicists are more sharing by temperament but because they needed one another to do better science. Given the time and money required for the work that must be done in the field of biology today, biomedical scientists can no longer afford to continue working in isolation.
Stephen is harnessing a critical moment of transition in genomic and biomedical science—when we move from working with small data sets to ones too large for any one lab to build—to transition the field to practices of collaboration and non-duplicative efforts, and, most importantly, a stronger and more explicit alignment with human health and patient outcomes. He is doing this through Sage Bionetworks, a Seattle-based organization that he started in 2009, now with a team of about twenty.
Through Sage, Stephen is building an online “commons,” a repository that houses and makes publicly available rich data sets—called “globally coherent data sets”—to researchers in academia and industry. These data sets contain three levels of information: genome-wide DNA variation, a so-called “intermediate trait” such as gene expression, and observable characteristics (what biologists call “phenotype”). The Commons currently houses five complete sets; six more are in transition, and dozens more have been promised to Sage. Although some of these existing data sets are already publicly available—whether on the National Institutes of Health’s PubMed forum or on individual lab sites—there is neither a single hub nor even a single format for them. In fact, the data is often presented so haphazardly that it is not even reusable. Sage gathers the data sets and actively curates them, making the information not only shareable but usable. Thus, there is an incentive to share data with Sage beyond the promise of better research and faster cures: Sage adds value by curating the data, rendering a service for which the “fee” is the act of giving up proprietary rights to the data.
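The three-layer structure of a “globally coherent data set” can be sketched as a toy data model. This is an illustrative assumption only—the class and field names below are hypothetical, not Sage Bionetworks’ actual schema or format:

```python
from dataclasses import dataclass, field

@dataclass
class CoherentDataSet:
    """Toy sketch of the three information layers described above."""
    study: str
    # Layer 1: genome-wide DNA variation (e.g., SNP id -> genotype call)
    dna_variation: dict = field(default_factory=dict)
    # Layer 2: an "intermediate trait" such as gene expression
    # (gene id -> normalized expression level)
    gene_expression: dict = field(default_factory=dict)
    # Layer 3: observable characteristics, i.e., phenotype
    # (trait name -> measured value)
    phenotype: dict = field(default_factory=dict)

    def is_complete(self) -> bool:
        """True only when all three layers contain data—what makes a
        set 'globally coherent' rather than a partial release."""
        return all([self.dna_variation, self.gene_expression, self.phenotype])

# A minimal toy record with made-up values:
ds = CoherentDataSet(
    study="toy-diabetes-cohort",
    dna_variation={"rs0000001": "A/G"},
    gene_expression={"GENE_X": 2.4},
    phenotype={"fasting_glucose_mg_dl": 104},
)
print(ds.is_complete())
```

The point of the sketch is that curation means enforcing one shared shape: a lab’s data set missing any of the three layers would fail the completeness check and need further curation before joining the Commons.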
The Commons has been jump-started through contributions from Merck: When Stephen left Merck a year and a half ago, he convinced the company to donate clinical/genomic data that had cost it $100 million to develop, all the needed “know-how,” and the massive 5,000-node high-performance computing cluster required for the mathematical modeling.
The Commons effort is both an enabler and an outcome of the larger shift Stephen is working on: the cultural shift that will lead to better alignment among actors, and better health. These actors are academic researchers, pharma and biotech companies, publishers, and funders of research.
To engage partners in academia, Stephen formed a group of five top laboratories that have agreed to work together not because they are collaborative in spirit, but because together they can do better science. This group, which calls itself the “Federation,” has been meeting in earnest for the past six months and is working on disease models for aging and type II diabetes, among others. Stephen says that the design and function of the Federation is modeled on ARPANET, the collaboration that gave rise to the Internet. These collective efforts—beyond the reach of any one lab—will set important examples for traditionalist researchers in the known currency of their field: peer-reviewed publication. Members of the Federation are embedded at key institutions in the U.S.—Columbia, Stanford, UCSF, and UCSD. (Stephen also works extensively in Europe, and now China, which—as of 2010—is the second largest national source of scientific papers.)
Using his personal networks, Stephen initially engaged several pharma companies to contribute some “precompetitive data.” The agreements are verbal at this point and have not yet put data on the site, but they signal an openness to pursuing a collaborative approach when there is a trusted and neutral entity and space. This is an important step toward a culture in which pharma companies work together on the basic science that each now does separately, and agree to share some data.
Success will come more slowly with publishers, but Stephen’s aim is to incentivize real-time sharing that also builds credibility for researchers. He wants to shrink the “unit” of publication or citation from the full paper—which takes too long to produce, encouraging data hoarding and costing patients missed opportunities—to something more fundamental, such as a model or a method. Stephen is realistic that this area will not advance quickly; a more important near-term win will be working with the Federation to achieve success in publication, in order to change the system gradually from within.
And lastly, because funders are key influencers of behavior, Stephen’s team is working with major private foundations in various parts of the world to structure grants so that researchers sign on to the principles and practices of the larger effort. The goal is to get the community comfortable with sharing data—failures as well as successes—in a way that informs the whole, allows iterative advancement, and delivers faster, less costly outcomes for health.
Nearly two years in, Stephen’s effort is well funded by a mix of supporters, including the NIH, the NCI, the State of Washington, Pfizer, and Merck. He expects future funding to come from government, foundations, and industry.
Stephen’s parents, both Juilliard professors, showed him that work and passion could be joined, and encouraged him to pursue a path that brought him meaning. He found this in a career in the health sciences, starting with summer research at Johns Hopkins as a teenager. In college, he focused on the sciences broadly—chemistry and biology, but also sociology and cultural anthropology, as he has always been interested in social norms and how groups of people organize themselves. Following college, Stephen went to medical school and completed doctoral studies in biochemistry.
Early in his career, he treated disease symptoms in hospital settings and taught medicine. Prompted by the case of a father and son who exhibited a rare eye tumor, he became interested in genomics—in getting to the root of disease. At the Fred Hutchinson Cancer Research Center, he co-developed an approach to identifying patterns in genes that has become widely used, leading him to found his own bioinformatics company, the pioneering Seattle-based Rosetta Inpharmatics. He sold Rosetta to Merck and joined Merck’s Seattle team, heading basic cancer research for six years. His time there was productive—he helped develop seven new drug therapies—and allowed him to build contacts throughout the industry.
However, Stephen grew increasingly convinced that the drug development process, and the academic-industry research enterprise that makes it possible, was not set up to serve patients’ interests. In fact, many of the incentives worked against the best outcomes for patients. Stephen started Sage to address the system as a whole, drawing on his experience in each of the various silos that, in his view, need to align for the best outcomes for people.