SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9751-9775 of 474278 papers

Title | Status | Hype
Benchmarking Uncertainty Disentanglement: Specialized Uncertainties for Specialized Tasks | Code | 2
ViewFusion: Towards Multi-View Consistency via Interpolated Denoising | Code | 2
Training Generative Image Super-Resolution Models by Wavelet-Domain Losses Enables Better Control of Artifacts | Code | 2
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers | Code | 2
CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition | Code | 2
Learning Commonality, Divergence and Variety for Unsupervised Visible-Infrared Person Re-identification | Code | 2
Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation | Code | 2
Pre-training Differentially Private Models with Limited Public Data | Code | 2
Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation | Code | 2
Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation | Code | 2
Boosting Neural Representations for Videos with a Conditional Decoder | Code | 2
ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training | Code | 2
The First Place Solution of WSDM Cup 2024: Leveraging Large Language Models for Conversational Multi-Doc QA | Code | 2
Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction | Code | 2
SparseLLM: Towards Global Pruning for Pre-trained Language Models | Code | 2
Evaluating Quantized Large Language Models | Code | 2
Misalignment-Robust Frequency Distribution Loss for Image Transformation | Code | 2
Trends, Applications, and Challenges in Human Attention Modelling | Code | 2
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning | Code | 2
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards | Code | 2
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations | Code | 2
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings | Code | 2
Sinkhorn Distance Minimization for Knowledge Distillation | Code | 2
Retrieval is Accurate Generation | Code | 2
BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning | Code | 2
Page 391 of 18972