valid

Papers


Showing 51–100 of 3589 papers

Title	Date	Tasks	Status	Hype
What Has Been Lost with Synthetic Evaluation?	May 28, 2025	Negation, Reading Comprehension	Unverified	0
Automatic Transmission for LLM Tiers: Optimizing Cost and Accuracy in Large Language Models	May 27, 2025	valid	Code Available	0
STACI: Spatio-Temporal Aleatoric Conformal Inference	May 27, 2025	Gaussian Processes, GPU	Unverified	0
PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects	May 27, 2025	Privacy Preserving, Uncertainty Quantification	Unverified	0
Collision- and Reachability-Aware Multi-Robot Control with Grounded LLM Planners	May 26, 2025	MuJoCo, valid	Unverified	0
On the Robustness of RSMA to Adversarial BD-RIS-Induced Interference	May 26, 2025	valid	Unverified	0
Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach	May 26, 2025	TAR, valid	Unverified	0
HomeBench: Evaluating LLMs in Smart Homes with Valid and Invalid Instructions Across Single and Multiple Devices	May 26, 2025	In-Context Learning, Retrieval-augmented Generation	Code Available	0
We Need to Measure Data Diversity in NLP -- Better and Broader	May 26, 2025	Diversity, valid	Unverified	0
PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation	May 26, 2025	valid	Unverified	0
Optimal Conformal Prediction under Epistemic Uncertainty	May 25, 2025	Conformal Prediction, Prediction	Code Available	0
NTIRE 2025 Challenge on Video Quality Enhancement for Video Conferencing: Datasets, Methods and Results	May 25, 2025	valid, Video Quality Assessment	Code Available	0
Efficient Long CoT Reasoning in Small Language Models	May 24, 2025	Mathematical Reasoning, valid	Unverified	0
MedScore: Factuality Evaluation of Free-Form Medical Answers	May 24, 2025	Form, Hallucination	Code Available	0
Graph Style Transfer for Counterfactual Explainability	May 23, 2025	counterfactual, Counterfactual Explanation	Code Available	0
Flexible MOF Generation with Torsion-Aware Flow Matching	May 23, 2025	valid	Unverified	0
Anytime-valid, Bayes-assisted, Prediction-Powered Inference	May 23, 2025	Prediction, valid	Unverified	0
Efficient Adaptive Experimentation with Non-Compliance	May 23, 2025	valid	Code Available	0
Applications of Modular Co-Design for De Novo 3D Molecule Generation	May 23, 2025	3D Molecule Generation, Denoising	Unverified	0
Effects of auditory distance cues and reverberation on spatial perception and listening strategies	May 23, 2025	valid	Code Available	0
Statistical Inference for Online Algorithms	May 22, 2025	valid	Code Available	0
MuseRAG: Idea Originality Scoring At Scale	May 22, 2025	RAG, Retrieval-augmented Generation	Code Available	0
A collaborative constrained graph diffusion model for the generation of realistic synthetic molecules	May 22, 2025	valid	Code Available	0
Statistical Test for Saliency Maps of Graph Neural Networks via Selective Inference	May 22, 2025	valid	Unverified	0
Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack	May 21, 2025	Multiple-choice, Multiple Choice Question Answering (MCQA)	Unverified	0
Are Vision-Language Models Safe in the Wild? A Meme-Based Benchmark Study	May 21, 2025	valid	Unverified	0
Loss-Guided Auxiliary Agents for Overcoming Mode Collapse in GFlowNets	May 21, 2025	Diversity, valid	Unverified	0
ModelingAgent: Bridging LLMs and Mathematical Modeling for Real-World Challenges	May 21, 2025	Math, valid	Code Available	1
Projection-Based Correction for Enhancing Deep Inverse Networks	May 21, 2025	valid	Unverified	0
Temporal Alignment of Time Sensitive Facts with Activation Engineering	May 20, 2025	valid	Unverified	0
Valid Post-Contextual Bandit Inference	May 20, 2025	Translation, valid	Unverified	0
Learning to Insert for Constructive Neural Vehicle Routing Solver	May 20, 2025	Model Optimization, Position	Unverified	0
A Comprehensive Benchmarking Platform for Deep Generative Models in Molecular Design	May 19, 2025	Benchmarking, Drug Discovery	Unverified	0
NTIRE 2025 Challenge on Efficient Burst HDR and Restoration: Datasets, Methods, and Results	May 17, 2025	valid	Unverified	0
Coherent Language Reconstruction from Brain Recordings with Flexible Multi-Modal Input Stimuli	May 15, 2025	valid	Unverified	0
Better Understanding Triple Differences Estimators	May 15, 2025	valid	Unverified	0
A spherical amplitude-phase formulation for 3-D adaptive line-of-sight (ALOS) guidance with USGES stability guarantees	May 13, 2025	valid	Unverified	0
Feature Fitted Online Conformal Prediction for Deep Time Series Forecasting Model	May 13, 2025	Conformal Prediction, Prediction	Code Available	0
Sharp Gaussian approximations for Decentralized Federated Learning	May 12, 2025	Federated Learning, valid	Unverified	0
Measuring General Intelligence with Generated Games	May 12, 2025	In-Context Learning, Large Language Model	Code Available	1
Transfer Learning Across Fixed-Income Product Classes	May 12, 2025	Transfer Learning, valid	Unverified	0
Generalization Bounds and Stopping Rules for Learning with Self-Selected Data	May 12, 2025	Active Learning, Generalization Bounds	Unverified	0
Chronocept: Instilling a Sense of Time in Machines	May 12, 2025	Fact Checking, RAG	Code Available	1
LLM-Augmented Chemical Synthesis and Design Decision Programs	May 11, 2025	Decision Making, Multi-step retrosynthesis	Unverified	0
Tell Me Who Your Students Are: GPT Can Generate Valid Multiple-Choice Questions When Students' (Mis)Understanding Is Hinted	May 9, 2025	Language Modeling, Language Modelling	Unverified	0
Evolutionary thoughts: integration of large language models and evolutionary algorithms	May 9, 2025	Evolutionary Algorithms, Hallucination	Code Available	0
Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs	May 8, 2025	reinforcement-learning, Reinforcement Learning	Unverified	0
Fair Uncertainty Quantification for Depression Prediction	May 8, 2025	Conformal Prediction, Fairness	Unverified	0
PlaceIt3D: Language-Guided Object Placement in Real 3D Scenes	May 8, 2025	valid	Unverified	0
LLM Code Customization with Visual Results: A Benchmark on TikZ	May 7, 2025	Code Generation, valid	Unverified	0

