SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2951 to 3000 of 659,983 papers

Title | Status | Hype
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Code | 3
FaceXFormer: A Unified Transformer for Facial Analysis | Code | 3
Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset | Code | 3
Vaporetto: Efficient Japanese Tokenization Based on Improved Pointwise Linear Classification | Code | 3
HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors | Code | 3
A Note on the Prediction-Powered Bootstrap | Code | 3
S-Graphs 2.0 -- A Hierarchical-Semantic Optimization and Loop Closure for SLAM | Code | 3
AudioBench: A Universal Benchmark for Audio Large Language Models | Code | 3
Alias-Free Generative Adversarial Networks | Code | 3
HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting | Code | 3
Embodied CoT Distillation From LLM To Off-the-shelf Agents | Code | 3
MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields | Code | 3
GLiREL -- Generalist Model for Zero-Shot Relation Extraction | Code | 3
ZIM: Zero-Shot Image Matting for Anything | Code | 3
ivis Dimensionality Reduction Framework for Biomacromolecular Simulations | Code | 3
Vulnerability Detection with Code Language Models: How Far Are We? | Code | 3
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models | Code | 3
Vision-LSTM: xLSTM as Generic Vision Backbone | Code | 3
A Survey on Evaluation of Large Language Models | Code | 3
Movie Gen: A Cast of Media Foundation Models | Code | 3
Recent Advances on Machine Learning for Computational Fluid Dynamics: A Survey | Code | 3
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Code | 3
Point Transformer V3: Simpler, Faster, Stronger | Code | 3
OmniSQL: Synthesizing High-quality Text-to-SQL Data at Scale | Code | 3
OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models | Code | 3
Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion | Code | 3
Delay-penalized CTC implemented based on Finite State Transducer | Code | 3
BlackMamba: Mixture of Experts for State-Space Models | Code | 3
SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection | Code | 3
Reinforcement Learning for Reasoning in Large Language Models with One Training Example | Code | 3
OneChart: Purify the Chart Structural Extraction via One Auxiliary Token | Code | 3
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos | Code | 3
What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models | Code | 3
StyleShot: A Snapshot on Any Style | Code | 3
Theia: Distilling Diverse Vision Foundation Models for Robot Learning | Code | 3
BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec | Code | 3
Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning | Code | 3
Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Code | 3
DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection | Code | 3
VAD: Vectorized Scene Representation for Efficient Autonomous Driving | Code | 3
Scaling Diffusion Transformers to 16 Billion Parameters | Code | 3
Ola: Pushing the Frontiers of Omni-Modal Language Model | Code | 3
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery | Code | 3
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning | Code | 3
Matcha-TTS: A fast TTS architecture with conditional flow matching | Code | 3
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models | Code | 3
Decoding-based Regression | Code | 3
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis | Code | 3
Demystifying Long Chain-of-Thought Reasoning in LLMs | Code | 3
MAXIM: Multi-Axis MLP for Image Processing | Code | 3