SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8326–8350 of 474278 papers

Title | Status | Hype
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making | Code | 2
On Softmax Direct Preference Optimization for Recommendation | Code | 2
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs | Code | 2
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs | Code | 2
Explore the Limits of Omni-modal Pretraining at Scale | Code | 2
CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models | Code | 2
STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery | Code | 2
LRM-Zero: Training Large Reconstruction Models with Synthesized Data | Code | 2
S^3 -- Semantic Signal Separation | Code | 2
BTS: Building Timeseries Dataset: Empowering Large-Scale Building Analytics | Code | 2
Interpreting the Weight Space of Customized Diffusion Models | Code | 2
Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs | Code | 2
StreamBench: Towards Benchmarking Continuous Improvement of Language Agents | Code | 2
An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval | Code | 2
Towards Vision-Language Geo-Foundation Model: A Survey | Code | 2
Enhancing Diagnostic Accuracy in Rare and Common Fundus Diseases with a Knowledge-Rich Vision-Language Model | Code | 2
Dynamic Asset Allocation with Asset-Specific Regime Forecasts | Code | 2
Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions | Code | 2
LVBench: An Extreme Long Video Understanding Benchmark | Code | 2
Large Language Models Must Be Taught to Know What They Don't Know | Code | 2
Attentive Merging of Hidden Embeddings from Pre-trained Speech Model for Anti-spoofing Detection | Code | 2
DafnyBench: A Benchmark for Formal Software Verification | Code | 2
DehazeDCT: Towards Effective Non-Homogeneous Dehazing via Deformable Convolutional Transformer | Code | 2
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech | Code | 2
Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models | Code | 2
Page 334 of 18972