SOTAVerified

Experimental Design

Papers

Showing 1-25 of 688 papers

Title | Status | Hype
Better than classical? The subtle art of benchmarking quantum machine learning models | Code | 7
Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine | Code | 5
Chain of Ideas: Revolutionizing Research Via Novel Idea Development with LLM Agents | Code | 4
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals | Code | 4
Predicting from Strings: Language Model Embeddings for Bayesian Optimization | Code | 3
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers | Code | 3
OmniPred: Language Models as Universal Regressors | Code | 3
Attention is not not Explanation | Code | 3
Reviving The Classics: Active Reward Modeling in Large Language Model Alignment | Code | 2
Honegumi: An Interface for Accelerating the Adoption of Bayesian Optimization in the Experimental Sciences | Code | 2
Probing the limitations of multimodal language models for chemistry and materials research | Code | 2
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System | Code | 2
OpenBox: A Python Toolkit for Generalized Black-box Optimization | Code | 2
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices | Code | 2
BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization | Code | 2
A friendly introduction to triangular transport | Code | 1
Gemstones: A Model Suite for Multi-Faceted Scaling Laws | Code | 1
Active Task Disambiguation with LLMs | Code | 1
Autonomous Microscopy Experiments through Large Language Model Agents | Code | 1
Confident Teacher, Confident Student? A Novel User Study Design for Investigating the Didactic Potential of Explanations and their Impact on Uncertainty | Code | 1
Evaluating Multiview Object Consistency in Humans and Image Models | Code | 1
Toward Automated Simulation Research Workflow through LLM Prompt Engineering Design | Code | 1
GitHub is an effective platform for collaborative and reproducible laboratory research | Code | 1
SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It) | Code | 1
Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics | Code | 1

No leaderboard results yet.