| SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents | May 29, 2025 | Adversarial Attack, Large Language Model | Code Available | 1 |
| MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research | May 26, 2025 | scientific discovery | Code Available | 1 |
| PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration | May 21, 2025 | Large Language Model, scientific discovery | Code Available | 1 |
| Benchmarking AI scientists in omics data-driven biological research | May 13, 2025 | Benchmarking, Multiple-choice | Code Available | 1 |
| IRIS: Interactive Research Ideation System for Accelerating Scientific Discovery | Apr 23, 2025 | scientific discovery | Code Available | 1 |
| The AI Cosmologist I: An Agentic System for Automated Data Analysis | Apr 4, 2025 | scientific discovery | Code Available | 1 |
| Offline Model-Based Optimization: Comprehensive Review | Mar 21, 2025 | model, Neural Architecture Search | Code Available | 1 |
| MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research | Mar 17, 2025 | Articles, Benchmarking | Code Available | 1 |
| Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation | Feb 26, 2025 | Ingenuity, scientific discovery | Code Available | 1 |
| InductionBench: LLMs Fail in the Simplest Complexity Class | Feb 20, 2025 | scientific discovery | Code Available | 1 |
| K-Paths: Reasoning over Graph Paths for Drug Repurposing and Drug Interaction Prediction | Feb 18, 2025 | Drug Discovery, Knowledge Graphs | Code Available | 1 |
| Transforming Science with Large Language Models: A Survey on AI-assisted Scientific Discovery, Experimentation, Content Generation, and Evaluation | Feb 7, 2025 | scientific discovery, Survey | Code Available | 1 |
| AIGS: Generating Science from AI-Powered Automated Falsification | Nov 17, 2024 | scientific discovery | Code Available | 1 |
| Geometric Representation Condition Improves Equivariant Molecule Generation | Oct 4, 2024 | Drug Design, scientific discovery | Code Available | 1 |
| BLADE: Benchmarking Language Model Agents for Data-Driven Science | Aug 19, 2024 | Benchmarking, Decision Making | Code Available | 1 |
| Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation | Jul 12, 2024 | Few-Shot Learning, scientific discovery | Code Available | 1 |
| LLM-SR: Scientific Equation Discovery via Programming with Large Language Models | Apr 29, 2024 | Equation Discovery, Interpretable Machine Learning | Code Available | 1 |
| GraphGPT: Graph Learning with Generative Pre-trained Transformers | Dec 31, 2023 | Decoder, Graph Learning | Code Available | 1 |
| Deep Generative Symbolic Regression | Dec 30, 2023 | Form, Heuristic Search | Code Available | 1 |
| A Transformer Model for Symbolic Regression towards Scientific Discovery | Dec 7, 2023 | regression, scientific discovery | Code Available | 1 |
| Machine-Guided Discovery of a Real-World Rogue Wave Model | Nov 21, 2023 | Model Selection, regression | Code Available | 1 |
| Large Language Models are Zero Shot Hypothesis Proposers | Nov 10, 2023 | scientific discovery | Code Available | 1 |
| Modelling Cellular Perturbations with the Sparse Additive Mechanism Shift Variational Autoencoder | Nov 5, 2023 | Disentanglement, Drug Discovery | Code Available | 1 |
| Large Language Models for Scientific Synthesis, Inference and Explanation | Oct 12, 2023 | Code Generation, Language Modeling | Code Available | 1 |
| Evolving Scientific Discovery by Unifying Data and Background Knowledge with AI Hilbert | Aug 18, 2023 | Equation Discovery, Logical Reasoning | Code Available | 1 |