SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1651-1700 of 177,339 papers

Title | Status | Hype
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection | Code | 4
OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels | Code | 4
Video Understanding with Large Language Models: A Survey | Code | 4
Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages | Code | 4
InstructIR: High-Quality Image Restoration Following Human Instructions | Code | 4
Lightweight Pixel Difference Networks for Efficient Visual Representation Learning | Code | 4
AlphaFold Meets Flow Matching for Generating Protein Ensembles | Code | 4
ScreenAgent: A Vision Language Model-driven Computer Control Agent | Code | 4
2D Matryoshka Sentence Embeddings | Code | 4
The largest EEG-based BCI reproducibility study for open science: the MOABB benchmark | Code | 4
3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors | Code | 4
SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting | Code | 4
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models | Code | 4
Long-CLIP: Unlocking the Long-Text Capability of CLIP | Code | 4
RaDe-GS: Rasterizing Depth in Gaussian Splatting | Code | 4
One-Step Effective Diffusion Network for Real-World Image Super-Resolution | Code | 4
On Scaling Up 3D Gaussian Splatting Training | Code | 4
DiffusionDet: Diffusion Model for Object Detection | Code | 4
Evaluating Deep Regression Models for WSI-Based Gene-Expression Prediction | Code | 4
LAMBDA: A Large Model Based Data Agent | Code | 4
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities | Code | 4
StoryMaker: Towards Holistic Consistent Characters in Text-to-image Generation | Code | 4
Parameter Efficient Instruction Tuning: An Empirical Study | Code | 4
Identity-Preserving Text-to-Video Generation by Frequency Decomposition | Code | 4
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale | Code | 4
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces | Code | 4
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction | Code | 4
Steel-LLM:From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM | Code | 4
Highly Accurate Dichotomous Image Segmentation | Code | 4
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator | Code | 4
MIMIC-IT: Multi-Modal In-Context Instruction Tuning | Code | 4
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation | Code | 4
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models | Code | 4
OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model | Code | 4
LSKNet: A Foundation Lightweight Backbone for Remote Sensing | Code | 4
Reflexion: Language Agents with Verbal Reinforcement Learning | Code | 4
EmbodiedSAM: Online Segment Any 3D Thing in Real Time | Code | 4
Ming-Omni: A Unified Multimodal Model for Perception and Generation | Code | 4
Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models | Code | 4
Enhance-A-Video: Better Generated Video for Free | Code | 4
OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit | Code | 4
Token Merging for Fast Stable Diffusion | Code | 4
Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion | Code | 4
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT | Code | 4
PufferLib: Making Reinforcement Learning Libraries and Environments Play Nice | Code | 4
Latent Swap Joint Diffusion for 2D Long-Form Latent Generation | Code | 4
Elucidating the Design Space of Diffusion-Based Generative Models | Code | 4
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models | Code | 4
BitNet a4.8: 4-bit Activations for 1-bit LLMs | Code | 4
A Survey on Vision-Language-Action Models for Embodied AI | Code | 4
Page 34 of 3547