Hi everyone!
Sharing Evo 2, a new foundation model for biomolecular sciences, now available on NVIDIA BioNeMo. This is a collaboration between the Arc Institute, Stanford, UC Berkeley, UCSF, and NVIDIA. It's significant because it's trained on a massive dataset – nearly 9 trillion nucleotides of DNA, RNA, and protein sequences from across the tree of life.
Key aspects:
🧬 Genomic Scale: Trained on nearly 9 trillion nucleotides from diverse species across the tree of life.
🔬 Multimodal: Understands DNA, RNA, and protein sequences.
🧠 Long Context: Can process sequences up to 1 million nucleotides at once.
🚀 Powerful Architecture: Uses a "StripedHyena 2" architecture for efficiency.
✅ Open Components: Key parts, including fine-tuning, are available via the open-source NVIDIA BioNeMo Framework.
🔓 Available as an NVIDIA NIM microservice.
The team has already shown it can predict the effects of gene mutations with high accuracy, and even design functional CRISPR-Cas systems. It's a powerful tool for anyone working with biological sequence data.
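For anyone curious what "predicting the effect of a mutation" can look like with a sequence model, here is a rough sketch of one common idea: compare how likely the model finds the mutated sequence versus the reference. This is only an illustration and does not use Evo 2 or any BioNeMo/NIM API. The tiny k-mer Markov model and the names `train_markov`, `log_likelihood`, and `variant_score` are made-up stand-ins for a real trained model.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def train_markov(seqs, k=2):
    """Count k-mer -> next-base transitions (toy stand-in for a trained model)."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1
    return counts

def log_likelihood(seq, counts, k=2):
    """Sum of log P(base | previous k bases), with add-one smoothing."""
    total = 0.0
    for i in range(k, len(seq)):
        ctx = counts.get(seq[i - k:i], {})
        n = sum(ctx.values())
        total += math.log((ctx.get(seq[i], 0) + 1) / (n + len(ALPHABET)))
    return total

def variant_score(ref, pos, alt, counts, k=2):
    """Log-likelihood of the mutated sequence minus that of the reference.

    Negative values mean the toy model finds the variant less plausible.
    """
    var = ref[:pos] + alt + ref[pos + 1:]
    return log_likelihood(var, counts, k) - log_likelihood(ref, counts, k)

# Hypothetical toy data: a repetitive "genome" and one point mutation.
model = train_markov(["ACGT" * 9])
reference = "ACGTACGTACGT"
score = variant_score(reference, 5, "A", model)  # C -> A at position 5
print(f"variant score: {score:.3f}")
```

A real workflow would swap the toy model for an actual genomic language model and would need careful validation before reading anything biological into the scores.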
So, while AlphaFold primarily predicted existing structures, Evo 2 opens the door to designing entirely new biological sequences for things like drug discovery, agriculture, and materials science. What new possibilities does this unlock?
@zaczuo The future of drug discovery, genetics, and synthetic biology just got a powerful upgrade! 🚀🔬
@zaczuo Awesome blend of Biotech & AI.
Shram
This is massive. A model that not only interprets but designs new biological sequences could push biotech into entirely new territory. The ability to process million-nucleotide sequences means we’re looking at real potential for breakthroughs in genetic engineering, synthetic biology, and even personalized medicine. Excited to see how researchers put Evo 2 to work!
Congrats on the launch!
Best wishes and sending lots of wins to the team behind it :)
Awesome project Zac!