NVIDIA And Arc Institute Unveil Evo 2: The Largest AI Model for Biology
21 Feb 2025, 0:43 pm GMT
NVIDIA, in collaboration with the Arc Institute, unveils Evo 2, the most extensive AI model developed for biological research. This foundation model, now available through NVIDIA BioNeMo, is designed to predict and design the genetic code—DNA, RNA, and proteins—across all domains of life.
NVIDIA and the Arc Institute describe the launch of Evo 2 as a significant milestone in generative genomics.
The model is now accessible to researchers worldwide through the NVIDIA BioNeMo platform and as an NVIDIA NIM microservice, offering unprecedented opportunities in biomolecular research.
A new era in genomic research with Evo 2
Developed through collaboration between the Arc Institute, Stanford University, and UC Berkeley, Evo 2 is the largest publicly available AI model trained on genomic data. Built on the NVIDIA DGX Cloud platform, Evo 2 has been trained on nearly 9.3 trillion nucleotides—the fundamental units of DNA and RNA—spanning the genetic sequences of over 128,000 whole genomes from diverse organisms, including plants, animals, and bacteria.
The model is designed to enhance various scientific applications, including predicting the structure and function of proteins, identifying new molecules for healthcare and industrial applications, and evaluating the effects of genetic mutations.
“Evo 2 represents a major milestone for generative genomics,” said Patrick Hsu, co-founder and core investigator at the Arc Institute and assistant professor of bioengineering at UC Berkeley.
“By advancing our understanding of these fundamental building blocks of life, we can pursue solutions in healthcare and environmental science that are unimaginable today.”
Technological advancements and capabilities
Evo 2 incorporates a novel StripedHyena 2 architecture, developed with insights from OpenAI’s Greg Brockman. This architecture enables the model to process lengthy genetic sequences, up to 1 million tokens, providing a more comprehensive analysis of the connections between distant parts of an organism’s genome.
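To make the effect of context length concrete, here is a minimal Python sketch. It assumes one token per nucleotide, which the article does not state, and every function name and genomic position in it is a hypothetical illustration rather than part of Evo 2 or BioNeMo. It shows how many windows a sequence needs at a given context length, and whether two positions in a genome can be seen together in a single window.

```python
# Illustrative sketch only: assumes one token per nucleotide (an assumption,
# not confirmed by the article). Names here are hypothetical, not Evo 2 APIs.

CONTEXT_1M = 1_000_000  # context length cited in the article, in tokens


def windows_needed(sequence_length: int, context_length: int = CONTEXT_1M) -> int:
    """Return how many non-overlapping windows are needed to cover a sequence."""
    if sequence_length < 0 or context_length <= 0:
        raise ValueError("sequence_length must be >= 0 and context_length > 0")
    # Ceiling division without floating point.
    return -(-sequence_length // context_length)


def fits_in_one_window(pos_a: int, pos_b: int, context_length: int = CONTEXT_1M) -> bool:
    """Return True if two 0-based positions can share a single window."""
    return abs(pos_a - pos_b) < context_length


if __name__ == "__main__":
    # Hypothetical positions of a gene and a distant regulatory element.
    gene_start = 10_000
    regulatory_element = 800_000

    print(fits_in_one_window(gene_start, regulatory_element))         # 1M-token context
    print(fits_in_one_window(gene_start, regulatory_element, 8_192))  # shorter context
    print(windows_needed(2_500_000))
```

With a 1 million token context, the two hypothetical positions fall inside one window. With the shorter 8,192-token context used here for comparison, they do not, so a model limited to that window could not relate them directly.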
“Designing new biology has traditionally been a laborious, unpredictable and artisanal process,” said Brian Hie, assistant professor of chemical engineering at Stanford University and Arc Institute innovation investigator.
“With Evo 2, we make biological design of complex systems more accessible to researchers, enabling the creation of new and beneficial advances in a fraction of the time it would previously have taken.”
The model was trained over several months using 2,000 NVIDIA H100 GPUs via NVIDIA DGX Cloud on AWS, an advanced AI infrastructure that allows scientists to train and optimise large-scale models efficiently. NVIDIA’s expertise in AI scaling and optimisation played a crucial role in enhancing Evo 2’s computational capabilities.
Applications in healthcare, agriculture, and industrial science
Evo 2’s capabilities extend across various scientific fields. By analysing vast genetic datasets, the model offers new insights into healthcare, biotechnology, and materials science. Researchers have already demonstrated the model’s effectiveness in identifying disease-causing mutations and developing novel genetic sequences.
In the medical field, Evo 2 could assist in understanding gene variants associated with diseases and designing targeted treatments. In a study involving BRCA1, a gene linked to breast cancer, Evo 2 predicted the impact of previously unrecognised mutations with 90% accuracy, a result that could help accelerate research in precision medicine.
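An accuracy figure of this kind is the share of variants whose predicted label matches the known label. The Python sketch below shows that calculation on made-up labels. The label names, the ten example variants, and the function name are all hypothetical and are not data from the BRCA1 study.

```python
# Illustrative sketch only: the labels below are invented for the example
# and do not come from the BRCA1 study mentioned in the article.

def accuracy(predicted: list[str], actual: list[str]) -> float:
    """Return the fraction of predictions that match the known labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one labelled variant is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


if __name__ == "__main__":
    # Ten hypothetical variants with known labels.
    actual = ["pathogenic"] * 5 + ["benign"] * 5
    # Hypothetical predictions with exactly one mistake (the last variant).
    predicted = list(actual)
    predicted[9] = "pathogenic"

    print(f"accuracy = {accuracy(predicted, actual):.0%}")
```

Accuracy alone does not show which kind of error was made. In this toy case the single mistake is a benign variant flagged as pathogenic.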
In agriculture, the model could aid in addressing global food shortages by offering insights into plant genetics. This could help develop climate-resilient and nutrient-rich crops, thereby improving food security worldwide.
Beyond healthcare and agriculture, Evo 2 has applications in industrial science, such as developing biofuels and engineering proteins to break down pollutants like oil and plastic.
“Deploying a model like Evo 2 is like sending a powerful new telescope out to the farthest reaches of the universe,” said Dave Burke, chief technology officer at the Arc Institute. “We know there’s immense opportunity for exploration, but we don’t yet know what we’re going to discover.”
Supporting open science and responsible AI development
The Arc Institute, established in 2021 with $650 million in funding, provides researchers with long-term scientific support, allowing them to focus on innovative discoveries rather than short-term grants. The institute’s core investigators hold renewable eight-year research terms alongside faculty appointments at partner universities, including Stanford University, UC Berkeley, and UC San Francisco.
To encourage collaborative scientific advancement, Evo 2 has been released as an open-source AI model. Researchers can access the model’s training data, code, and weights via Arc’s GitHub, and a user-friendly interface called Evo Designer has been developed to simplify biological sequence design.
Arc Institute has also partnered with Goodfire, an AI research lab, to develop an interpretability visualizer. This tool allows scientists to examine the key biological features the model recognises within genomic sequences, enhancing transparency in AI-driven research.
Ethical considerations and future developments
Recognising the potential risks associated with AI-powered biological research, Evo 2’s developers have implemented safeguards. The model’s training dataset excludes pathogens affecting humans and other complex organisms, and queries related to harmful pathogens will not yield productive results.
Tina Hernandez-Boussard and her team at Stanford University played a crucial role in ensuring the ethical and responsible development of Evo 2, incorporating guidelines to prevent misuse while maximising its potential for beneficial applications.
Looking ahead, the research team envisions Evo 2 serving as a foundation for specialised AI models in genomic research.
“In a loose way, you can think of the model almost like an operating system kernel—you can have all of these different applications that are built on top of it,” said Dave Burke.
“From predicting how single DNA mutations affect a protein’s function to designing genetic elements that behave differently in different cell types, as we continue to refine the model and researchers begin using it in creative ways, we expect to see beneficial uses for Evo 2 we haven’t even imagined yet.”
The role of Arc Institute and NVIDIA in AI-powered biology
The Arc Institute focuses on supporting long-term scientific research. It provides multiyear funding and advanced laboratory facilities, allowing researchers to pursue innovative projects without the constraints of short-term grants.
This structure enables Arc’s core investigators to conduct research in areas such as cancer, immune dysfunction, and neurodegenerative diseases.
NVIDIA’s contributions to Evo 2 include AI scaling expertise, high-performance computing resources, and infrastructure support through the NVIDIA DGX Cloud. The NVIDIA BioNeMo framework ensures that researchers can easily integrate Evo 2 into their biomedical and biotechnological projects.
Anthony Costa, director of digital biology at NVIDIA, highlighted the significance of the model:
“Evo 2 has fundamentally advanced our understanding of biological systems. By overcoming previous limitations in the scale of biological foundation models with a unique architecture and the largest integrated dataset of its kind, Evo 2 generalises across more known biology than any other model to date — and by releasing these capabilities broadly, the Arc Institute has given scientists around the world a new partner in solving humanity’s most pressing health and disease challenges.”
Pallavi Singal
Editor