| ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows | May 26, 2025 | Astronomy, scientific discovery | Unverified | 0 |
| AI-Researcher: Autonomous Scientific Innovation | May 24, 2025 | scientific discovery | Code Available | 7 |
| BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases | May 23, 2025 | Causal Inference, scientific discovery | Code Available | 0 |
| MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback | May 23, 2025 | scientific discovery | Code Available | 0 |
| Improving Chemical Understanding of LLMs via SMILES Parsing | May 22, 2025 | Graph Matching, scientific discovery | Unverified | 0 |
| PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration | May 21, 2025 | Large Language Model, scientific discovery | Code Available | 1 |
| MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem | May 20, 2025 | Mathematical Reasoning, scientific discovery | Code Available | 3 |
| Toward Reliable Biomedical Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models | May 20, 2025 | Hallucination, scientific discovery | Code Available | 0 |
| From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery | May 19, 2025 | Navigate, scientific discovery | Code Available | 3 |
| Robin: A multi-agent system for automating scientific discovery | May 19, 2025 | scientific discovery | Code Available | 0 |