SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2151-2200 of 659,983 papers

Title | Status | Hype
OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving | Code | 4
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge | Code | 4
GLIGEN: Open-Set Grounded Text-to-Image Generation | Code | 4
Simulation-free Schrödinger bridges via score and flow matching | Code | 4
Constitutional AI: Harmlessness from AI Feedback | Code | 4
Revisiting Self-Attentive Sequential Recommendation | Code | 4
Aria Everyday Activities Dataset | Code | 4
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning | Code | 4
Distilling Tiny and Ultra-fast Deep Neural Networks for Autonomous Navigation on Nano-UAVs | Code | 4
A-MEM: Agentic Memory for LLM Agents | Code | 4
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations | Code | 4
FILM: Frame Interpolation for Large Motion | Code | 4
WorldVLA: Towards Autoregressive Action World Model | Code | 4
SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering | Code | 4
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction | Code | 4
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling | Code | 4
Open Problems in Applied Deep Learning | Code | 4
ReAct: Synergizing Reasoning and Acting in Language Models | Code | 4
A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions | Code | 4
Diffusion Models for Medical Image Analysis: A Comprehensive Survey | Code | 4
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning | Code | 4
Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies | Code | 4
ChatGPT for Robotics: Design Principles and Model Abilities | Code | 4
An Entropy-based Text Watermarking Detection Method | Code | 4
RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation | Code | 4
MINIMA: Modality Invariant Image Matching | Code | 4
SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation | Code | 4
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks | Code | 4
TrustLLM: Trustworthiness in Large Language Models | Code | 4
Null-text Inversion for Editing Real Images using Guided Diffusion Models | Code | 4
GriTS: Grid table similarity metric for table structure recognition | Code | 4
3D Scene Generation: A Survey | Code | 4
LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover | Code | 4
Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance | Code | 4
AgentBench: Evaluating LLMs as Agents | Code | 4
Semantic-SAM: Segment and Recognize Anything at Any Granularity | Code | 4
4D Gaussian Splatting for Real-Time Dynamic Scene Rendering | Code | 4
InstanceDiffusion: Instance-level Control for Image Generation | Code | 4
Depth Any Video with Scalable Synthetic Data | Code | 4
TabularARGN: A Flexible and Efficient Auto-Regressive Framework for Generating High-Fidelity Synthetic Data | Code | 4
Quality-aware Masked Diffusion Transformer for Enhanced Music Generation | Code | 4
LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection | Code | 4
Simple and Effective Masked Diffusion Language Models | Code | 4
Sample-Efficient Alignment for LLMs | Code | 4
PVUW 2024 Challenge on Complex Video Understanding: Methods and Results | Code | 4
SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution | Code | 4
Sparse Tensor-based Point Cloud Attribute Compression | Code | 4
WavCraft: Audio Editing and Generation with Large Language Models | Code | 4
Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound | Code | 4
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration | Code | 4
Page 44 of 13,200