| Open Source Planning & Control System with Language Agents for Autonomous Scientific Discovery | Jul 9, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Topic Modeling and Link-Prediction for Material Property Discovery | Jul 8, 2025 | Knowledge Graphs, Link Prediction | Unverified | 0 |
| STRUCTSENSE: A Task-Agnostic Agentic Framework for Structured Information Extraction with Human-In-The-Loop Evaluation and Benchmarking | Jul 4, 2025 | Benchmarking, Navigate | Code Available | 0 |
| Distributed Cross-Channel Hierarchical Aggregation for Foundation Models | Jun 26, 2025 | Computational Efficiency, scientific discovery | Unverified | 0 |
| Active Inference AI Systems for Scientific Discovery | Jun 26, 2025 | counterfactual, Counterfactual Reasoning | Unverified | 0 |
| A Survey of AI for Materials Science: Foundation Models, LLM Agents, Datasets, and Tools | Jun 25, 2025 | Continual Learning, Domain Generalization | Unverified | 0 |
| AI Assistants to Enhance and Exploit the PETSc Knowledge Base | Jun 25, 2025 | RAG, Reranking | Unverified | 0 |
| From Reproduction to Replication: Evaluating Research Agents with Progressive Code Masking | Jun 24, 2025 | Code Generation, scientific discovery | Code Available | 0 |
| AutomataGPT: Forecasting and Ruleset Inference for Two-Dimensional Cellular Automata | Jun 19, 2025 | scientific discovery | Unverified | 0 |
| LMR-BENCH: Evaluating LLM Agent's Ability on Reproducing Language Modeling Research | Jun 19, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Graphics4Science: Computer Graphics for Scientific Impacts | Jun 18, 2025 | scientific discovery | Unverified | 0 |
| An ELIXIR scoping review on domain-specific evaluation metrics for synthetic data in life sciences | Jun 17, 2025 | scientific discovery, Synthetic Data Evaluation | Unverified | 0 |
| Evolvable Conditional Diffusion | Jun 16, 2025 | Denoising, Descriptive | Unverified | 0 |
| Scientifically-Interpretable Reasoning Network (ScIReN): Uncovering the Black-Box of Nature | Jun 16, 2025 | scientific discovery | Unverified | 0 |
| Interpretable representation learning of quantum data enabled by probabilistic variational autoencoders | Jun 13, 2025 | Interpretable Machine Learning, Representation Learning | Unverified | 0 |
| ClimateChat: Designing Data and Methods for Instruction Tuning LLMs to Answer Climate Change Queries | Jun 12, 2025 | scientific discovery | Code Available | 1 |
| HSG-12M: A Large-Scale Spatial Multigraph Dataset | Jun 10, 2025 | Graph Learning, scientific discovery | Code Available | 1 |
| AutoSDT: Scaling Data-Driven Discovery Tasks Toward Open Co-Scientists | Jun 9, 2025 | scientific discovery, valid | Unverified | 0 |
| ALINE: Joint Amortization for Bayesian Inference and Active Data Acquisition | Jun 8, 2025 | Active Learning, Bayesian Inference | Code Available | 0 |
| Can Theoretical Physics Research Benefit from Language Agents? | Jun 6, 2025 | Code Generation, Mathematical Reasoning | Unverified | 0 |
| Unsupervised Machine Learning for Scientific Discovery: Workflow and Best Practices | Jun 5, 2025 | Astronomy, scientific discovery | Code Available | 0 |
| Matter-of-Fact: A Benchmark for Verifying the Feasibility of Literature-Supported Claims in Materials Science | Jun 4, 2025 | Articles, Code Generation | Code Available | 0 |
| Multi-Exit Kolmogorov-Arnold Networks: enhancing accuracy and parsimony | Jun 3, 2025 | Kolmogorov-Arnold Networks, scientific discovery | Unverified | 0 |
| A Dynamic Framework for Semantic Grouping of Common Data Elements (CDE) Using Embeddings and Clustering | Jun 2, 2025 | Clustering, scientific discovery | Unverified | 0 |
| From Street Views to Urban Science: Discovering Road Safety Factors with Multimodal Large Language Models | Jun 2, 2025 | Large Language Model, Multimodal Large Language Model | Unverified | 0 |
| SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents | May 29, 2025 | Adversarial Attack, Large Language Model | Code Available | 1 |
| OmniEarth-Bench: Towards Holistic Evaluation of Earth's Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data | May 29, 2025 | scientific discovery | Unverified | 0 |
| BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model | May 29, 2025 | Large Language Model, scientific discovery | Code Available | 3 |
| LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms | May 27, 2025 | Bayesian Optimization, Benchmarking | Code Available | 2 |
| MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research | May 26, 2025 | scientific discovery | Code Available | 1 |
| ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows | May 26, 2025 | Astronomy, scientific discovery | Unverified | 0 |
| AI-Researcher: Autonomous Scientific Innovation | May 24, 2025 | scientific discovery | Code Available | 7 |
| BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases | May 23, 2025 | Causal Inference, scientific discovery | Code Available | 0 |
| MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback | May 23, 2025 | scientific discovery | Code Available | 0 |
| Improving Chemical Understanding of LLMs via SMILES Parsing | May 22, 2025 | Graph Matching, scientific discovery | Unverified | 0 |
| PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration | May 21, 2025 | Large Language Model, scientific discovery | Code Available | 1 |
| MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem | May 20, 2025 | Mathematical Reasoning, scientific discovery | Code Available | 3 |
| Toward Reliable Biomedical Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models | May 20, 2025 | Hallucination, scientific discovery | Code Available | 0 |
| From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery | May 19, 2025 | Navigate, scientific discovery | Code Available | 3 |
| Robin: A multi-agent system for automating scientific discovery | May 19, 2025 | scientific discovery | Code Available | 0 |
| InterFeat: An Automated Pipeline for Finding Interesting Hypotheses in Structured Biomedical Data | May 18, 2025 | Knowledge Graphs, scientific discovery | Code Available | 0 |
| When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research | May 17, 2025 | Misconceptions, scientific discovery | Code Available | 0 |
| AI-Driven Automation Can Become the Foundation of Next-Era Science of Science Research | May 17, 2025 | scientific discovery | Code Available | 2 |
| On the definition and importance of interpretability in scientific machine learning | May 16, 2025 | Equation Discovery, Interpretable Machine Learning | Unverified | 0 |
| Deep Symbolic Optimization: Reinforcement Learning for Symbolic Mathematics | May 16, 2025 | Equation Discovery, reinforcement-learning | Unverified | 0 |
| Benchmarking AI scientists in omics data-driven biological research | May 13, 2025 | Benchmarking, Multiple-choice | Code Available | 1 |
| Contributions of the Petabyte Scale Sequence Search Codeathon toward efforts to scale sequence-based searches on SRA | May 9, 2025 | Benchmarking, scientific discovery | Unverified | 0 |
| Symbol-based entity marker highlighting for enhanced text mining in materials science with generative AI | May 9, 2025 | NER, scientific discovery | Unverified | 0 |
| Generative Discovery of Partial Differential Equations by Learning from Math Handbooks | May 9, 2025 | Computational Efficiency, Math | Unverified | 0 |
| Soft causal learning for generalized molecule property prediction: An environment perspective | May 7, 2025 | Graph Learning, Property Prediction | Unverified | 0 |