SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2051–2100 of 177339 papers

Title | Status | Hype
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Code | 4
TotalSegmentator: robust segmentation of 104 anatomical structures in CT images | Code | 4
Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks | Code | 4
Benchmarking Neural Network Training Algorithms | Code | 4
RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination | Code | 4
XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation | Code | 4
Deepchecks: A Library for Testing and Validating Machine Learning Models and Data | Code | 4
Effective Whole-body Pose Estimation with Two-stages Distillation | Code | 4
CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets | Code | 4
The Importance of Directional Feedback for LLM-based Optimizers | Code | 4
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation | Code | 4
Theseus: A Library for Differentiable Nonlinear Optimization | Code | 4
SnAG: Scalable and Accurate Video Grounding | Code | 4
From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion | Code | 4
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models | Code | 4
FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance | Code | 4
Old Optimizer, New Norm: An Anthology | Code | 4
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models | Code | 4
The Llama 3 Herd of Models | Code | 4
ControlVAE: Tuning, Analytical Properties, and Performance Analysis | Code | 4
UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height | Code | 4
Diffusion Policy Policy Optimization | Code | 4
Scaling Granite Code Models to 128K Context | Code | 4
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society | Code | 4
Expressive Whole-Body 3D Gaussian Avatar | Code | 4
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition | Code | 4
SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Code | 4
RewardBench 2: Advancing Reward Model Evaluation | Code | 4
VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Code | 4
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation | Code | 4
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Code | 4
SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models | Code | 4
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL | Code | 4
UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation | Code | 4
Unified Reward Model for Multimodal Understanding and Generation | Code | 4
TorchRL: A data-driven decision-making library for PyTorch | Code | 4
What Makes Good In-Context Examples for GPT-3? | Code | 4
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models | Code | 4
AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones | Code | 4
TOFU: A Task of Fictitious Unlearning for LLMs | Code | 4
Sundial: A Family of Highly Capable Time Series Foundation Models | Code | 4
FP8 Formats for Deep Learning | Code | 4
Gaussian Splatting SLAM | Code | 4
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Code | 4
Fairness Implications of Encoding Protected Categorical Attributes | Code | 4
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models | Code | 4
LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model | Code | 4
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | Code | 4
X^2-Gaussian: 4D Radiative Gaussian Splatting for Continuous-time Tomographic Reconstruction | Code | 4
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing | Code | 4
Page 42 of 3547