The Invisible Threads of Scientific Progress: How a Woman’s Legacy Guided Google to a Nobel Prize — Exploring the Parallels Between Helen Berman and Rosalind Franklin

The 2024 Nobel Prize in Chemistry, half of which went to Demis Hassabis and John Jumper of Google DeepMind for AlphaFold (the other half went to David Baker for computational protein design), is a landmark moment in the fusion of artificial intelligence and biology. AlphaFold’s ability to predict protein structures with unprecedented accuracy is a triumph of computational power applied to one of biology’s most intricate challenges. Yet, as we reflect on this achievement, it is vital to acknowledge the layers of scientific infrastructure and historical contributions that underpinned AlphaFold’s success. Among these contributions, the legacy of Helen Berman, co-founder of the Protein Data Bank (PDB), emerges as crucial yet largely uncelebrated in the public discourse.
Helen Berman helped establish the Protein Data Bank in 1971, creating the essential foundation upon which AlphaFold was trained. The PDB is an open-access repository that now holds more than 200,000 experimentally determined structures of proteins and other biological macromolecules, a resource built through decades of meticulous curation and collaboration. The dataset, which is indispensable for protein structure prediction, became a cornerstone for AI models like AlphaFold. Yet, despite this vital role, Berman’s work, along with the labor of those who contributed to the creation and maintenance of the PDB, remains in the background, overshadowed by the more public-facing narrative of AI-driven innovation.
Berman’s story draws inevitable comparisons to Rosalind Franklin, the pioneering crystallographer whose X-ray diffraction images were pivotal to the discovery of the DNA double helix by Watson and Crick. While Franklin’s work was critical, she did not share in the Nobel recognition that came from it: she died in 1958, four years before the 1962 prize was awarded to Watson, Crick, and Maurice Wilkins, and Nobel Prizes are not awarded posthumously. Her contribution nonetheless went underacknowledged for years, an omission that still resonates today. Both Franklin and Berman played indispensable roles in their respective scientific breakthroughs, but the tendency to focus on the individuals who synthesize the data, rather than those who generate or curate it, is a familiar theme in the history of scientific discovery.
The Nobel Prize, as one of the highest honors in science, is designed to celebrate transformative discoveries. However, it often reinforces the narrative of the “lone genius,” placing the final stage of discovery on a pedestal while overlooking the deep, collaborative foundations on which such breakthroughs are built. In the case of AlphaFold, it is tempting to spotlight the remarkable AI algorithms that predict protein structures. But without the painstaking assembly of experimentally determined structures in the PDB, there would be no dataset robust enough to train such a model. AlphaFold is, in many ways, a product of Helen Berman’s vision and the collaborative culture she fostered in structural biology.
The implications of this oversight extend beyond mere recognition. AlphaFold’s success was not just an algorithmic achievement but the culmination of decades of collective effort in biology, chemistry, and computational science. The Protein Data Bank is not a passive archive; it is an active, evolving resource, developed through the shared contributions of scientists around the world, often working without the prospect of widespread acclaim. As science moves forward into an era of even greater reliance on large-scale datasets and machine learning, it is critical to reassess how we value the different forms of scientific labor that contribute to discovery.
The 2024 Nobel Prize in Chemistry opens a necessary dialogue about the structures of scientific recognition. As we celebrate the immense promise of AI models like AlphaFold, we must also ensure that the invisible labor—the data curation, the meticulous experimental work, and the collaborative frameworks—receives its due. The contributions of women like Helen Berman remind us that science is a collective enterprise, where progress is made not only by the individuals who build the final model, but also by those who lay the groundwork.
In reconsidering how we honor scientific achievement, we may also question whether the criteria for accolades like the Nobel Prize should evolve. Should they better reflect the collaborative and infrastructural nature of modern science, particularly in fields like computational biology, where innovation is so deeply intertwined with data? The success of AlphaFold would not have been possible without the decades-long effort to build the PDB, just as the Watson and Crick model of DNA relied on the foundational images produced by Rosalind Franklin.
Ultimately, the 2024 Nobel Prize in Chemistry is an invitation to rethink how we define and recognize scientific contributions in an increasingly data-driven world. It is a moment to ensure that figures like Helen Berman—and the many other contributors who remain behind the scenes—are not overlooked in the celebration of breakthrough achievements. As science becomes more collaborative, so too should the stories we tell about its progress.