SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 20951 to 21000 of 474278 papers

Title | Status | Hype
AdapTrack: Adaptive Thresholding-Based Matching For Multi-object Tracking | Code | 1
CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting | Code | 1
RepairBench: Leaderboard of Frontier Models for Program Repair | Code | 1
FlashMix: Fast Map-Free LiDAR Localization via Feature Mixing and Contrastive-Constrained Accelerated Training | Code | 1
Prompt-Driven Temporal Domain Adaptation for Nighttime UAV Tracking | Code | 1
CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models | Code | 1
AL-GTD: Deep Active Learning for Gaze Target Detection | Code | 1
URIEL+: Enhancing Linguistic Inclusion and Usability in a Typological and Multilingual Knowledge Base | Code | 1
HR-Extreme: A High-Resolution Dataset for Extreme Weather Forecasting | Code | 1
From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding | Code | 1
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models | Code | 1
Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs | Code | 1
Dual Cone Gradient Descent for Training Physics-Informed Neural Networks | Code | 1
From Vision to Audio and Beyond: A Unified Model for Audio-Visual Representation and Generation | Code | 1
Underwater Image Enhancement with Physical-based Denoising Diffusion Implicit Models | Code | 1
ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning | Code | 1
A comprehensive review and new taxonomy on superpixel segmentation | Code | 1
Generative AI for fast and accurate statistical computation of fluids | Code | 1
LML-DAP: Language Model Learning a Dataset for Data-Augmented Prediction | Code | 1
Improving Visual Object Tracking through Visual Prompting | Code | 1
Cottention: Linear Transformers With Cosine Attention | Code | 1
DualAD: Dual-Layer Planning for Reasoning in Autonomous Driving | Code | 1
IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning | Code | 1
HydraViT: Stacking Heads for a Scalable ViT | Code | 1
Task-recency bias strikes back: Adapting covariances in Exemplar-Free Class Incremental Learning | Code | 1
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation | Code | 1
MIO: A Foundation Model on Multimodal Tokens | Code | 1
InterNet: Unsupervised Cross-modal Homography Estimation Based on Interleaved Modality Transfer and Self-supervised Homography Prediction | Code | 1
Realistic Evaluation of Model Merging for Compositional Generalization | Code | 1
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE | Code | 1
MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning | Code | 1
DarkSAM: Fooling Segment Anything Model to Segment Nothing | Code | 1
Revisiting Deep Ensemble Uncertainty for Enhanced Medical Anomaly Detection | Code | 1
CNCA: Toward Customizable and Natural Generation of Adversarial Camouflage for Vehicle Detectors | Code | 1
Leveraging Anthropometric Measurements to Improve Human Mesh Estimation and Ensure Consistent Body Shapes | Code | 1
An Adversarial Perspective on Machine Unlearning for AI Safety | Code | 1
RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking | Code | 1
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search | Code | 1
A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Code | 1
LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field | Code | 1
Wavelet-Driven Generalizable Framework for Deepfake Face Forgery Detection | Code | 1
Self-Distilled Depth Refinement with Noisy Poisson Fusion | Code | 1
Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation | Code | 1
Autonomous Network Defence using Reinforcement Learning | Code | 1
Trustworthy Text-to-Image Diffusion Models: A Timely and Focused Survey | Code | 1
A Framework for Standardizing Similarity Measures in a Rapidly Evolving Field | Code | 1
MALPOLON: A Framework for Deep Species Distribution Modeling | Code | 1
Taming Diffusion Prior for Image Super-Resolution with Domain Shift SDEs | Code | 1
GrEmLIn: A Repository of Green Baseline Embeddings for 87 Low-Resource Languages Injected with Multilingual Graph Knowledge | Code | 1
DMC-VB: A Benchmark for Representation Learning for Control with Visual Distractors | Code | 1
Page 420 of 9486