SOTAVerified

valid

Papers

Showing 51-100 of 3589 papers

Title | Status | Hype
What Has Been Lost with Synthetic Evaluation? | - | 0
Automatic Transmission for LLM Tiers: Optimizing Cost and Accuracy in Large Language Models | Code | 0
STACI: Spatio-Temporal Aleatoric Conformal Inference | - | 0
PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects | - | 0
Collision- and Reachability-Aware Multi-Robot Control with Grounded LLM Planners | - | 0
On the Robustness of RSMA to Adversarial BD-RIS-Induced Interference | - | 0
Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach | - | 0
HomeBench: Evaluating LLMs in Smart Homes with Valid and Invalid Instructions Across Single and Multiple Devices | Code | 0
We Need to Measure Data Diversity in NLP -- Better and Broader | - | 0
PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation | - | 0
Optimal Conformal Prediction under Epistemic Uncertainty | Code | 0
NTIRE 2025 Challenge on Video Quality Enhancement for Video Conferencing: Datasets, Methods and Results | Code | 0
Efficient Long CoT Reasoning in Small Language Models | - | 0
MedScore: Factuality Evaluation of Free-Form Medical Answers | Code | 0
Graph Style Transfer for Counterfactual Explainability | Code | 0
Flexible MOF Generation with Torsion-Aware Flow Matching | - | 0
Anytime-valid, Bayes-assisted, Prediction-Powered Inference | - | 0
Efficient Adaptive Experimentation with Non-Compliance | Code | 0
Applications of Modular Co-Design for De Novo 3D Molecule Generation | - | 0
Effects of auditory distance cues and reverberation on spatial perception and listening strategies | Code | 0
Statistical Inference for Online Algorithms | Code | 0
MuseRAG: Idea Originality Scoring At Scale | Code | 0
A collaborative constrained graph diffusion model for the generation of realistic synthetic molecules | Code | 0
Statistical Test for Saliency Maps of Graph Neural Networks via Selective Inference | - | 0
Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack | - | 0
Are Vision-Language Models Safe in the Wild? A Meme-Based Benchmark Study | - | 0
Loss-Guided Auxiliary Agents for Overcoming Mode Collapse in GFlowNets | - | 0
ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges | Code | 1
Projection-Based Correction for Enhancing Deep Inverse Networks | - | 0
Temporal Alignment of Time Sensitive Facts with Activation Engineering | - | 0
Valid Post-Contextual Bandit Inference | - | 0
Learning to Insert for Constructive Neural Vehicle Routing Solver | - | 0
A Comprehensive Benchmarking Platform for Deep Generative Models in Molecular Design | - | 0
NTIRE 2025 Challenge on Efficient Burst HDR and Restoration: Datasets, Methods, and Results | - | 0
Coherent Language Reconstruction from Brain Recordings with Flexible Multi-Modal Input Stimuli | - | 0
Better Understanding Triple Differences Estimators | - | 0
A spherical amplitude-phase formulation for 3-D adaptive line-of-sight (ALOS) guidance with USGES stability guarantees | - | 0
Feature Fitted Online Conformal Prediction for Deep Time Series Forecasting Model | Code | 0
Sharp Gaussian approximations for Decentralized Federated Learning | - | 0
Measuring General Intelligence with Generated Games | Code | 1
Transfer Learning Across Fixed-Income Product Classes | - | 0
Generalization Bounds and Stopping Rules for Learning with Self-Selected Data | - | 0
Chronocept: Instilling a Sense of Time in Machines | Code | 1
LLM-Augmented Chemical Synthesis and Design Decision Programs | - | 0
Tell Me Who Your Students Are: GPT Can Generate Valid Multiple-Choice Questions When Students' (Mis)Understanding Is Hinted | - | 0
Evolutionary thoughts: integration of large language models and evolutionary algorithms | Code | 0
Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs | - | 0
Fair Uncertainty Quantification for Depression Prediction | - | 0
PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes | - | 0
LLM Code Customization with Visual Results: A Benchmark on TikZ | - | 0
Page 2 of 72

No leaderboard results yet.