valid

Papers

Showing 451–475 of 3589 papers

Title	Date	Tasks	Status	Hype
Statistically Valid Information Bottleneck via Multiple Hypothesis Testing	Sep 11, 2024	valid	Unverified	0
Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement	Sep 10, 2024	Multiple-choice, Sentence	Unverified	0
Improving Conditional Level Generation using Automated Validation in Match-3 Games	Sep 10, 2024	valid	Unverified	0
Alignist: CAD-Informed Orientation Distribution Estimation by Fusing Shape and Correspondences	Sep 10, 2024	Contrastive Learning, valid	Code Available	0
NSP: A Neuro-Symbolic Natural Language Navigational Planner	Sep 10, 2024	valid	Unverified	0
The Surprising Robustness of Partial Least Squares	Sep 9, 2024	Dimensionality Reduction, valid	Unverified	0
Inference for Large Scale Regression Models with Dependent Errors	Sep 8, 2024	Gaussian Processes, parameter estimation	Unverified	0
Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models	Sep 7, 2024	MMLU, TruthfulQA	Unverified	0
Leveraging Machine Learning for Official Statistics: A Statistical Manifesto	Sep 6, 2024	Survey, valid	Unverified	0
FuzzCoder: Byte-level Fuzzing Test via Large Language Model	Sep 3, 2024	Language Modeling, Language Modelling	Code Available	1
Federated Prediction-Powered Inference from Decentralized Data	Sep 3, 2024	Federated Learning, Prediction	Code Available	0
An essay on the history of DSGE models	Sep 1, 2024	valid	Unverified	0
Stochastic Monotonicity and Random Utility Models: The Good and The Ugly	Sep 1, 2024	valid	Unverified	0
"Is This It?": Towards Ecologically Valid Benchmarks for Situated Collaboration	Aug 30, 2024	Embodied Question Answering, Question Answering	Unverified	0
The creative psychometric item generator: a framework for item generation and validation using large language models	Aug 30, 2024	valid	Unverified	0
Self-supervised learning for crystal property prediction via denoising	Aug 30, 2024	Denoising, Prediction	Unverified	0
Continual learning with the neural tangent ensemble	Aug 30, 2024	Continual Learning, valid	Unverified	0
Can Unconfident LLM Annotations Be Used for Confident Conclusions?	Aug 27, 2024	valid	Code Available	1
Double/Debiased CoCoLASSO of Treatment Effects with Mismeasured High-Dimensional Control Variables	Aug 26, 2024	Econometrics, valid	Unverified	0
EVINCE: Optimizing Multi-LLM Dialogues Using Conditional Statistics and Information Theory	Aug 26, 2024	Decision Making, Diversity	Unverified	0
Investigating the effect of Mental Models in User Interaction with an Adaptive Dialog Agent	Aug 26, 2024	valid	Unverified	0
RoCP-GNN: Robust Conformal Prediction for Graph Neural Networks in Node-Classification	Aug 25, 2024	Conformal Prediction, Link Prediction	Unverified	0
Learning Valid Dual Bounds in Constraint Programming: Boosted Lagrangian Decomposition with Self-Supervised Learning	Aug 22, 2024	Self-Supervised Learning, valid	Unverified	0
Can You Trust Your Metric? Automatic Concatenation-Based Tests for Metric Validity	Aug 22, 2024	Language Modeling, Language Modelling	Unverified	0
AIM 2024 Challenge on Compressed Video Quality Assessment: Methods and Results	Aug 21, 2024	Image Manipulation, valid	Code Available	1

No leaderboard results yet.