SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6601–6650 of 661,570 papers

Title | Status | Hype
OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation | Code | 2
Auto-Regressive Moving Diffusion Models for Time Series Forecasting | Code | 2
Elevating Flow-Guided Video Inpainting with Reference Generation | Code | 2
Phi-4 Technical Report | Code | 2
MaskTerial: A Foundation Model for Automated 2D Material Flake Detection | Code | 2
MAC-Ego3D: Multi-Agent Gaussian Consensus for Real-Time Collaborative Ego-Motion and Photorealistic 3D Reconstruction | Code | 2
Owl-1: Omni World Model for Consistent Long Video Generation | Code | 2
Doe-1: Closed-Loop Autonomous Driving with Large World Model | Code | 2
Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning | Code | 2
Diffusion-Enhanced Test-time Adaptation with Text and Image Augmentation | Code | 2
MPAX: Mathematical Programming in JAX | Code | 2
Foundational Large Language Models for Materials Research | Code | 2
DrivingRecon: Large 4D Gaussian Reconstruction Model For Autonomous Driving | Code | 2
Diffusion Predictive Control with Constraints | Code | 2
Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine | Code | 2
GPD-1: Generative Pre-training for Driving | Code | 2
Generate Any Scene: Evaluating and Improving Text-to-Vision Generation with Scene Graph Programming | Code | 2
Predicting Human Brain States with Transformer | Code | 2
ConDSeg: A General Medical Image Segmentation Framework via Contrast-Driven Feature Enhancement | Code | 2
Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation | Code | 2
BSAFusion: A Bidirectional Stepwise Feature Alignment Network for Unaligned Medical Image Fusion | Code | 2
SAFIRE: Segment Any Forged Image Region | Code | 2
LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations | Code | 2
GR-NLP-TOOLKIT: An Open-Source NLP Toolkit for Modern Greek | Code | 2
SegFace: Face Segmentation of Long-Tail Classes | Code | 2
Proactive Model Adaptation Against Concept Drift for Online Time Series Forecasting | Code | 2
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data | Code | 2
MAGE: A Multi-Agent Engine for Automated RTL Code Generation | Code | 2
Video Motion Transfer with Diffusion Transformers | Code | 2
Exploring What Why and How: A Multifaceted Benchmark for Causation Understanding of Video Anomaly | Code | 2
FlashRNN: Optimizing Traditional RNNs on Modern Hardware | Code | 2
Maya: An Instruction Finetuned Multilingual Multimodal Model | Code | 2
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models | Code | 2
BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities | Code | 2
Granite Guardian | Code | 2
Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery | Code | 2
DriveMM: All-in-One Large Multimodal Model for Autonomous Driving | Code | 2
From an Image to a Scene: Learning to Imagine the World from a Million 360 Videos | Code | 2
Bridging the Divide: Reconsidering Softmax and Linear Attention | Code | 2
Toward AI-Driven Digital Organism: Multiscale Foundation Models for Predicting, Simulating and Programming Biology at All Levels | Code | 2
How to Merge Your Multimodal Models Over Time? | Code | 2
Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation | Code | 2
Retrieving Semantics from the Deep: an RAG Solution for Gesture Synthesis | Code | 2
Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Video | Code | 2
Proactive Agents for Multi-Turn Text-to-Image Generation Under Uncertainty | Code | 2
Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity | Code | 2
ManiSkill-HAB: A Benchmark for Low-Level Manipulation in Home Rearrangement Tasks | Code | 2
Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images | Code | 2
MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization | Code | 2
ProcessBench: Identifying Process Errors in Mathematical Reasoning | Code | 2
Page 133 of 13,232