SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 2101-2125 of 661,570 papers

Title | Status | Hype
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society | Code | 4
Expressive Whole-Body 3D Gaussian Avatar | Code | 4
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition | Code | 4
SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Code | 4
RewardBench 2: Advancing Reward Model Evaluation | Code | 4
VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Code | 4
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation | Code | 4
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Code | 4
SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models | Code | 4
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL | Code | 4
UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation | Code | 4
Unified Reward Model for Multimodal Understanding and Generation | Code | 4
TorchRL: A data-driven decision-making library for PyTorch | Code | 4
What Makes Good In-Context Examples for GPT-3? | Code | 4
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models | Code | 4
AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones | Code | 4
TOFU: A Task of Fictitious Unlearning for LLMs | Code | 4
Sundial: A Family of Highly Capable Time Series Foundation Models | Code | 4
FP8 Formats for Deep Learning | Code | 4
Gaussian Splatting SLAM | Code | 4
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers | Code | 4
Fairness Implications of Encoding Protected Categorical Attributes | Code | 4
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models | Code | 4
LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model | Code | 4
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | Code | 4