SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3401-3450 of 177340 papers

Title | Status | Hype
Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion | Code | 3
Delay-penalized CTC implemented based on Finite State Transducer | Code | 3
BlackMamba: Mixture of Experts for State-Space Models | Code | 3
SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection | Code | 3
Reinforcement Learning for Reasoning in Large Language Models with One Training Example | Code | 3
OneChart: Purify the Chart Structural Extraction via One Auxiliary Token | Code | 3
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos | Code | 3
What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models | Code | 3
StyleShot: A Snapshot on Any Style | Code | 3
Theia: Distilling Diverse Vision Foundation Models for Robot Learning | Code | 3
BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec | Code | 3
Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning | Code | 3
Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Code | 3
DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection | Code | 3
VAD: Vectorized Scene Representation for Efficient Autonomous Driving | Code | 3
Scaling Diffusion Transformers to 16 Billion Parameters | Code | 3
Ola: Pushing the Frontiers of Omni-Modal Language Model | Code | 3
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery | Code | 3
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning | Code | 3
Matcha-TTS: A fast TTS architecture with conditional flow matching | Code | 3
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models | Code | 3
Decoding-based Regression | Code | 3
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis | Code | 3
Demystifying Long Chain-of-Thought Reasoning in LLMs | Code | 3
MAXIM: Multi-Axis MLP for Image Processing | Code | 3
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark | Code | 3
Towards Automatic Power Battery Detection: New Challenge Benchmark Dataset and Baseline | Code | 3
TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding | Code | 3
IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages | Code | 3
Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | Code | 3
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation | Code | 3
A Survey on Mixture of Experts | Code | 3
InfoChartQA: A Benchmark for Multimodal Question Answering on Infographic Charts | Code | 3
Descriptive Image Quality Assessment in the Wild | Code | 3
Vision Transformer Adapter for Dense Predictions | Code | 3
UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization | Code | 3
DoWhy: Addressing Challenges in Expressing and Validating Causal Assumptions | Code | 3
GarmentCodeData: A Dataset of 3D Made-to-Measure Garments With Sewing Patterns | Code | 3
Designing a Better Asymmetric VQGAN for StableDiffusion | Code | 3
FinanceBench: A New Benchmark for Financial Question Answering | Code | 3
Deep Learning for Free-Hand Sketch: A Survey | Code | 3
SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation | Code | 3
OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain | Code | 3
pyannote.audio: neural building blocks for speaker diarization | Code | 3
PyGDA: A Python Library for Graph Domain Adaptation | Code | 3
o1-Coder: an o1 Replication for Coding | Code | 3
Molecular Fingerprints Are Strong Models for Peptide Function Prediction | Code | 3
In-situ graph reasoning and knowledge expansion using Graph-PReFLexOR | Code | 3
All You May Need for VQA are Image Captions | Code | 3
YourBench: Easy Custom Evaluation Sets for Everyone | Code | 3
Page 69 of 3547