| BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments | May 27, 2024 | AI Agent, Bayesian Optimization | Code Available | 2 |
| LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery | May 16, 2024 | Bilevel Optimization, scientific discovery | Code Available | 2 |
| Active Learning with Fully Bayesian Neural Networks for Discontinuous and Nonstationary Data | May 16, 2024 | Active Learning, scientific discovery | Code Available | 2 |
| SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models | Jan 15, 2024 | Math, Mathematical Reasoning | Code Available | 2 |
| Multi-Fidelity Active Learning with GFlowNets | Jun 20, 2023 | Active Learning, Bayesian Optimization | Code Available | 2 |
| Ten Quick Tips for Harnessing the Power of ChatGPT/GPT-4 in Computational Biology | Mar 29, 2023 | Chatbot, Prompt Engineering | Code Available | 2 |
| Accelerating Material Design with the Generative Toolkit for Scientific Discovery | Jul 8, 2022 | Drug Discovery, Materials Screening | Code Available | 2 |
| LMR-BENCH: Evaluating LLM Agent's Ability on Reproducing Language Modeling Research | Jun 19, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| ClimateChat: Designing Data and Methods for Instruction Tuning LLMs to Answer Climate Change Queries | Jun 12, 2025 | scientific discovery | Code Available | 1 |
| HSG-12M: A Large-Scale Spatial Multigraph Dataset | Jun 10, 2025 | Graph Learning, scientific discovery | Code Available | 1 |