SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9326–9350 of 474,278 papers

| Title | Status | Hype |
| --- | --- | --- |
| RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization | Code | 0 |
| STORI: A Benchmark and Taxonomy for Stochastic Environments | Code | 0 |
| SoT: Structured-of-Thought Prompting Guides Multilingual Reasoning in Large Language Models | Code | 0 |
| ELMF4EggQ: Ensemble Learning with Multimodal Feature Fusion for Non-Destructive Egg Quality Assessment | Code | 0 |
| Real-Time Threaded Houbara Detection and Segmentation for Wildlife Conservation using Mobile Platforms | Code | 0 |
| AdaRD-key: Adaptive Relevance-Diversity Keyframe Sampling for Long-form Video understanding | Code | 0 |
| Flip Distribution Alignment VAE for Multi-Phase MRI Synthesis | Code | 0 |
| Wave-GMS: Lightweight Multi-Scale Generative Model for Medical Image Segmentation | Code | 0 |
| Contextualized Representation Learning for Effective Human-Object Interaction Detection | Code | 0 |
| MoGIC: Boosting Motion Generation via Intention Understanding and Visual Context | Code | 0 |
| A Granular Study of Safety Pretraining under Model Abliteration | Code | 0 |
| MaskCD: Mitigating LVLM Hallucinations by Image Head Masked Contrastive Decoding | Code | 0 |
| MonSTeR: a Unified Model for Motion, Scene, Text Retrieval | Code | 0 |
| Conditional Pseudo-Supervised Contrast for Data-Free Knowledge Distillation | Code | 0 |
| GuruAgents: Emulating Wise Investors with Prompt-Guided LLM Agents | Code | 0 |
| From Supervision to Exploration: What Does Protein Language Model Learn During Reinforcement Learning? | Code | 0 |
| Rethinking Reward Models for Multi-Domain Test-Time Scaling | Code | 0 |
| Consistent Assistant Domains Transformer for Source-free Domain Adaptation | Code | 0 |
| More Thought, Less Accuracy? On the Dual Nature of Reasoning in Vision-Language Models |  | 0 |
| MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance |  | 0 |
| Think Right: Learning to Mitigate Under-Over Thinking via Adaptive, Attentive Compression |  | 0 |
| MedQ-Bench: Evaluating and Exploring Medical Image Quality Assessment Abilities in MLLMs |  | 0 |
| StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold | Code | 0 |
| Self-Forcing++: Towards Minute-Scale High-Quality Video Generation |  | 0 |
| Drawing Conclusions from Draws: Rethinking Preference Semantics in Arena-Style LLM Evaluation |  | 0 |