| SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents | May 29, 2025 | Adversarial Attack, Large Language Model | Code Available | 1 |
| OmniEarth-Bench: Towards Holistic Evaluation of Earth's Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data | May 29, 2025 | scientific discovery | Unverified | 0 |
| BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model | May 29, 2025 | Large Language Model, scientific discovery | Code Available | 3 |
| LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms | May 27, 2025 | Bayesian Optimization, Benchmarking | Code Available | 2 |
| MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research | May 26, 2025 | scientific discovery | Code Available | 1 |
| ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows | May 26, 2025 | Astronomy, scientific discovery | Unverified | 0 |
| AI-Researcher: Autonomous Scientific Innovation | May 24, 2025 | scientific discovery | Code Available | 7 |
| BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases | May 23, 2025 | Causal Inference, scientific discovery | Code Available | 0 |
| MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback | May 23, 2025 | scientific discovery | Code Available | 0 |
| Improving Chemical Understanding of LLMs via SMILES Parsing | May 22, 2025 | Graph Matching, scientific discovery | Unverified | 0 |
| PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration | May 21, 2025 | Large Language Model, scientific discovery | Code Available | 1 |
| MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem | May 20, 2025 | Mathematical Reasoning, scientific discovery | Code Available | 3 |
| Toward Reliable Biomedical Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models | May 20, 2025 | Hallucination, scientific discovery | Code Available | 0 |
| From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery | May 19, 2025 | Navigate, scientific discovery | Code Available | 3 |
| Robin: A multi-agent system for automating scientific discovery | May 19, 2025 | scientific discovery | Code Available | 0 |
| InterFeat: An Automated Pipeline for Finding Interesting Hypotheses in Structured Biomedical Data | May 18, 2025 | Knowledge Graphs, scientific discovery | Code Available | 0 |
| When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research | May 17, 2025 | Misconceptions, scientific discovery | Code Available | 0 |
| AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research | May 17, 2025 | scientific discovery | Code Available | 2 |
| On the definition and importance of interpretability in scientific machine learning | May 16, 2025 | Equation Discovery, Interpretable Machine Learning | Unverified | 0 |
| Deep Symbolic Optimization: Reinforcement Learning for Symbolic Mathematics | May 16, 2025 | Equation Discovery, reinforcement-learning | Unverified | 0 |
| Benchmarking AI scientists in omics data-driven biological research | May 13, 2025 | Benchmarking, Multiple-choice | Code Available | 1 |
| Contributions of the Petabyte Scale Sequence Search Codeathon toward efforts to scale sequence-based searches on SRA | May 9, 2025 | Benchmarking, scientific discovery | Unverified | 0 |
| Symbol-based entity marker highlighting for enhanced text mining in materials science with generative AI | May 9, 2025 | NER, scientific discovery | Unverified | 0 |
| Generative Discovery of Partial Differential Equations by Learning from Math Handbooks | May 9, 2025 | Computational Efficiency, Math | Unverified | 0 |
| Soft causal learning for generalized molecule property prediction: An environment perspective | May 7, 2025 | Graph Learning, Property Prediction | Unverified | 0 |