| Unsupervised Machine Learning for Scientific Discovery: Workflow and Best Practices | Jun 5, 2025 | Astronomy, scientific discovery | Code Available | 0 |
| Matter-of-Fact: A Benchmark for Verifying the Feasibility of Literature-Supported Claims in Materials Science | Jun 4, 2025 | Articles, Code Generation | Code Available | 0 |
| Multi-Exit Kolmogorov-Arnold Networks: enhancing accuracy and parsimony | Jun 3, 2025 | Kolmogorov-Arnold Networks, scientific discovery | Unverified | 0 |
| A Dynamic Framework for Semantic Grouping of Common Data Elements (CDE) Using Embeddings and Clustering | Jun 2, 2025 | Clustering, scientific discovery | Unverified | 0 |
| From Street Views to Urban Science: Discovering Road Safety Factors with Multimodal Large Language Models | Jun 2, 2025 | Large Language Model, Multimodal Large Language Model | Unverified | 0 |
| OmniEarth-Bench: Towards Holistic Evaluation of Earth's Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data | May 29, 2025 | scientific discovery | Unverified | 0 |
| SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents | May 29, 2025 | Adversarial Attack, Large Language Model | Code Available | 1 |
| BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model | May 29, 2025 | Large Language Model, scientific discovery | Code Available | 3 |
| LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms | May 27, 2025 | Bayesian Optimization, Benchmarking | Code Available | 2 |
| MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research | May 26, 2025 | scientific discovery | Code Available | 1 |