| Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation | Jul 12, 2024 | Few-Shot Learning, scientific discovery | Code Available | 1 | 5 |
| MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research | May 26, 2025 | scientific discovery | Code Available | 1 | 5 |
| Large Language Models are Zero Shot Hypothesis Proposers | Nov 10, 2023 | scientific discovery | Code Available | 1 | 5 |
| Large Language Models for Scientific Synthesis, Inference and Explanation | Oct 12, 2023 | Code Generation, Language Modeling | Code Available | 1 | 5 |
| ClimateChat: Designing Data and Methods for Instruction Tuning LLMs to Answer Climate Change Queries | Jun 12, 2025 | scientific discovery | Code Available | 1 | 5 |
| Evolving Scientific Discovery by Unifying Data and Background Knowledge with AI Hilbert | Aug 18, 2023 | Equation Discovery, Logical Reasoning | Code Available | 1 | 5 |
| AI Descartes: Combining Data and Theory for Derivable Scientific Discovery | Sep 3, 2021 | Automated Theorem Proving, BIG-bench Machine Learning | Code Available | 1 | 5 |
| AIGS: Generating Science from AI-Powered Automated Falsification | Nov 17, 2024 | scientific discovery | Code Available | 1 | 5 |
| Constructing Custom Thermodynamics Using Deep Learning | Aug 8, 2023 | Deep Learning, Physical Intuition | Code Available | 1 | 5 |
| IRIS: Interactive Research Ideation System for Accelerating Scientific Discovery | Apr 23, 2025 | scientific discovery | Code Available | 1 | 5 |