| Better than classical? The subtle art of benchmarking quantum machine learning models | Mar 11, 2024 | Benchmarking, Binary Classification | Code Available | 7 |
| Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine | Nov 28, 2023 | Electrical Engineering, Experimental Design | Code Available | 5 |
| Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents | Oct 17, 2024 | Experimental Design | Code Available | 4 |
| NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals | Jul 18, 2024 | Experimental Design, GPU | Code Available | 4 |
| Predicting from Strings: Language Model Embeddings for Bayesian Optimization | Oct 14, 2024 | Bayesian Optimization, Experimental Design | Code Available | 3 |
| Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers | Sep 6, 2024 | Experimental Design, scientific discovery | Code Available | 3 |
| OmniPred: Language Models as Universal Regressors | Feb 22, 2024 | Experimental Design, regression | Code Available | 3 |
| Attention is not not Explanation | Aug 13, 2019 | Decision Making, Diagnostic | Code Available | 3 |
| Reviving The Classics: Active Reward Modeling in Large Language Model Alignment | Feb 4, 2025 | Computational Efficiency, Experimental Design | Code Available | 2 |
| Honegumi: An Interface for Accelerating the Adoption of Bayesian Optimization in the Experimental Sciences | Feb 4, 2025 | Bayesian Optimization, Experimental Design | Code Available | 2 |
| Probing the limitations of multimodal language models for chemistry and materials research | Nov 25, 2024 | Experimental Design, Spatial Reasoning | Code Available | 2 |
| Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System | Oct 12, 2024 | Experimental Design, scientific discovery | Code Available | 2 |
| OpenBox: A Python Toolkit for Generalized Black-box Optimization | Apr 26, 2023 | Experimental Design | Code Available | 2 |
| hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices | Mar 9, 2021 | BIG-bench Machine Learning, Diagnostic | Code Available | 2 |
| BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization | Oct 14, 2019 | Bayesian Optimisation, Bayesian Optimization | Code Available | 2 |
| A friendly introduction to triangular transport | Mar 27, 2025 | Bayesian Inference, Decision Making | Code Available | 1 |
| Gemstones: A Model Suite for Multi-Faceted Scaling Laws | Feb 7, 2025 | Experimental Design, Language Modeling | Code Available | 1 |
| Active Task Disambiguation with LLMs | Feb 6, 2025 | Experimental Design, Question Selection | Code Available | 1 |
| Autonomous Microscopy Experiments through Large Language Model Agents | Dec 18, 2024 | Benchmarking, Experimental Design | Code Available | 1 |
| Confident Teacher, Confident Student? A Novel User Study Design for Investigating the Didactic Potential of Explanations and their Impact on Uncertainty | Sep 10, 2024 | Experimental Design, Explainable artificial intelligence | Code Available | 1 |
| Evaluating Multiview Object Consistency in Humans and Image Models | Sep 9, 2024 | Experimental Design | Code Available | 1 |
| Toward Automated Simulation Research Workflow through LLM Prompt Engineering Design | Aug 28, 2024 | Experimental Design, Prompt Engineering | Code Available | 1 |
| GitHub is an effective platform for collaborative and reproducible laboratory research | Aug 18, 2024 | Experimental Design, Transfer Learning | Code Available | 1 |
| SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It) | Jun 25, 2024 | Benchmarking, Experimental Design | Code Available | 1 |
| Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics | Mar 21, 2024 | DeepFake Detection, Experimental Design | Code Available | 1 |
| LegalLens: Leveraging LLMs for Legal Violation Identification in Unstructured Text | Feb 6, 2024 | Experimental Design | Code Available | 1 |
| Variance Alignment Score: A Simple But Tough-to-Beat Data Selection Method for Multimodal Contrastive Learning | Feb 3, 2024 | Contrastive Learning, Experimental Design | Code Available | 1 |
| ExPT: Synthetic Pretraining for Few-Shot Experimental Design | Oct 30, 2023 | Experimental Design, In-Context Learning | Code Available | 1 |
| Sustainable Concrete via Bayesian Optimization | Oct 27, 2023 | Bayesian Optimization, Experimental Design | Code Available | 1 |
| A Practical Recipe for Federated Learning Under Statistical Heterogeneity Experimental Design | Jul 28, 2023 | Experimental Design, Federated Learning | Code Available | 1 |
| CeBed: A Benchmark for Deep Data-Driven OFDM Channel Estimation | Jun 23, 2023 | Experimental Design | Code Available | 1 |
| The Machine Psychology of Cooperation: Can GPT models operationalise prompts for altruism, cooperation, competitiveness and selfishness in economic games? | May 13, 2023 | Experimental Design, Language Modelling | Code Available | 1 |
| CO-BED: Information-Theoretic Contextual Optimization via Bayesian Experimental Design | Feb 27, 2023 | Experimental Design | Code Available | 1 |
| Comparing Well and Geophysical Data for Temperature Monitoring Within a Bayesian Experimental Design Framework | Oct 19, 2022 | Experimental Design, Time Series Regression | Code Available | 1 |
| New Paradigms for Exploiting Parallel Experiments in Bayesian Optimization | Oct 3, 2022 | Bayesian Optimization, Experimental Design | Code Available | 1 |
| Active Learning for Optimal Intervention Design in Causal Models | Sep 10, 2022 | Active Learning, Experimental Design | Code Available | 1 |
| Initial recommendations for performing, benchmarking, and reporting single-cell proteomics experiments | Jul 19, 2022 | Benchmarking, Experimental Design | Code Available | 1 |
| Derivative-Informed Neural Operator: An Efficient Framework for High-Dimensional Parametric Derivative Learning | Jun 21, 2022 | Dimensionality Reduction, Experimental Design | Code Available | 1 |
| Marginal Post Processing of Bayesian Inference Products with Normalizing Flows and Kernel Density Estimators | May 25, 2022 | Bayesian Inference, Experimental Design | Code Available | 1 |
| VICE: Variational Interpretable Concept Embeddings | May 2, 2022 | Experimental Design, Object | Code Available | 1 |
| Interventions, Where and How? Experimental Design for Causal Models at Scale | Mar 3, 2022 | Causal Discovery, Experimental Design | Code Available | 1 |
| Optimizing Sequential Experimental Design with Deep Reinforcement Learning | Feb 2, 2022 | Deep Reinforcement Learning, Experimental Design | Code Available | 1 |
| Learning High-Dimensional Parametric Maps via Reduced Basis Adaptive Residual Networks | Dec 14, 2021 | Experimental Design, Vocal Bursts Intensity Prediction | Code Available | 1 |
| An Experimental Design Perspective on Model-Based Reinforcement Learning | Dec 9, 2021 | continuous-control, Continuous Control | Code Available | 1 |
| Implicit Deep Adaptive Design: Policy-Based Experimental Design without Likelihoods | Nov 3, 2021 | Experimental Design | Code Available | 1 |
| Emulation of physical processes with Emukit | Oct 25, 2021 | Bayesian Optimization, Decision Making | Code Available | 1 |
| GeneDisco: A Benchmark for Experimental Design in Drug Discovery | Oct 22, 2021 | Active Learning, Drug Discovery | Code Available | 1 |
| What Does TERRA-REF's High Resolution, Multi Sensor Plant Sensing Public Domain Data Offer the Computer Vision Community? | Jul 29, 2021 | Experimental Design, Management | Code Available | 1 |
| Deeper Learning By Doing: Integrating Hands-On Research Projects Into a Machine Learning Course | Jul 28, 2021 | BIG-bench Machine Learning, Experimental Design | Code Available | 1 |
| Edge Proposal Sets for Link Prediction | Jun 30, 2021 | Experimental Design, Link Prediction | Code Available | 1 |