valid

Papers

Showing 201–225 of 3589 papers

Title	Date	Tasks	Status	Hype
Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation	Feb 27, 2025	Data Augmentation, Logical Reasoning	Unverified	0
Overcoming Dependent Censoring in the Evaluation of Survival Models	Feb 26, 2025	Survival Analysis, valid	Code Available	0
Universality of conformal prediction under the assumption of randomness	Feb 26, 2025	Conformal Prediction, Prediction	Unverified	0
Talking to the brain: Using Large Language Models as Proxies to Model Brain Semantic Representation	Feb 26, 2025	Question Answering, valid	Unverified	0
Shh, don't say that! Domain Certification in LLMs	Feb 26, 2025	valid	Unverified	0
Uncertainty Quantification for LLM-Based Survey Simulations	Feb 25, 2025	Survey, Uncertainty Quantification	Unverified	0
Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization	Feb 25, 2025	Language Modeling, Language Modelling	Code Available	0
Data-Driven Input-Output Control Barrier Functions	Feb 24, 2025	State Estimation, valid	Unverified	0
Quantifying Logical Consistency in Transformers via Query-Key Alignment	Feb 24, 2025	Logical Reasoning, valid	Unverified	0
REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction	Feb 24, 2025	Event Argument Extraction, valid	Unverified	0
Your Assumed DAG is Wrong and Here's How To Deal With It	Feb 24, 2025	Causal Discovery, valid	Code Available	0
Auto-Bench: An Automated Benchmark for Scientific Discovery in LLMs	Feb 21, 2025	scientific discovery, valid	Unverified	0
Pricing Valid Cuts for Price-Match Equilibria	Feb 21, 2025	valid	Unverified	0
EquivaMap: Leveraging LLMs for Automatic Equivalence Checking of Optimization Formulations	Feb 20, 2025	Combinatorial Optimization, valid	Code Available	0
Towards a Perspectivist Turn in Argument Quality Assessment	Feb 20, 2025	valid	Code Available	0
Explainable Distributed Constraint Optimization Problems	Feb 19, 2025	valid	Unverified	0
Conformal Prediction under Levy-Prokhorov Distribution Shifts: Robustness to Local and Global Perturbations	Feb 19, 2025	Conformal Prediction, Prediction	Code Available	0
Generalization error bound for denoising score matching under relaxed manifold assumption	Feb 19, 2025	Denoising, valid	Unverified	0
What are Models Thinking about? Understanding Large Language Model Hallucinations "Psychology" through Model Inner State Analysis	Feb 19, 2025	Hallucination, Language Modeling	Unverified	0
Likelihood-Ratio Regularized Quantile Regression: Adapting Conformal Prediction to High-Dimensional Covariate Shifts	Feb 18, 2025	Conformal Prediction, image-classification	Unverified	0
GiFT: Gibbs Fine-Tuning for Code Generation	Feb 17, 2025	Code Generation, valid	Code Available	0
Deep Incomplete Multi-view Learning via Cyclic Permutation of VAEs	Feb 16, 2025	MULTI-VIEW LEARNING, Representation Learning	Unverified	0
The Relationship between No-Regret Learning and Online Conformal Prediction	Feb 16, 2025	Conformal Prediction, valid	Unverified	0
A new and flexible class of sharp asymptotic time-uniform confidence sequences	Feb 14, 2025	valid	Unverified	0
Self-Normalized Inference in (Quantile, Expected Shortfall) Regressions for Time Series	Feb 14, 2025	quantile regression, regression	Unverified	0
