| BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology | Feb 28, 2025 | Multiple-choice, scientific discovery | Code Available | 2 |
| Protein Large Language Models: A Comprehensive Survey | Feb 21, 2025 | Articles, Protein Structure Prediction | Code Available | 2 |
| DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra | Feb 13, 2025 | Decoder, De novo molecule generation from MS/MS spectrum (bonus chemical formulae) | Code Available | 2 |
| From Generalist to Specialist: A Survey of Large Language Models for Chemistry | Dec 28, 2024 | scientific discovery, Survey | Code Available | 2 |
| Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System | Oct 12, 2024 | Experimental Design, scientific discovery | Code Available | 2 |
| MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses | Oct 9, 2024 | scientific discovery, valid | Code Available | 2 |
| ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery | Oct 7, 2024 | scientific discovery | Code Available | 2 |
| SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding | Aug 28, 2024 | Instruction Following, scientific discovery | Code Available | 2 |
| OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI | Jun 18, 2024 | Benchmarking, scientific discovery | Code Available | 2 |
| Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples | Jun 9, 2024 | ARC, Diversity | Code Available | 2 |