SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5,751 to 5,800 of 661,570 papers

Title | Status | Hype
Modifying Large Language Model Post-Training for Diverse Creative Writing | Code | 2
Strong Baseline: Multi-UAV Tracking via YOLOv12 with BoT-SORT-ReID | Code | 2
Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer | Code | 2
Splat-LOAM: Gaussian Splatting LiDAR Odometry and Mapping | Code | 2
Dereflection Any Image with Diffusion Priors and Diversified Data | Code | 2
Learning Multi-Level Features with Matryoshka Sparse Autoencoders | Code | 2
CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities | Code | 2
Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models | Code | 2
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement | Code | 2
MedAgent-Pro: Towards Evidence-based Multi-modal Medical Diagnosis via Reasoning Agentic Workflow | Code | 2
Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images | Code | 2
Instant Gaussian Stream: Fast and Generalizable Streaming of Dynamic Scene Reconstruction via Gaussian Splatting | Code | 2
NuiScene: Exploring Efficient Generation of Unbounded Outdoor Scenes | Code | 2
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens | Code | 2
Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning | Code | 2
Ultra-Resolution Adaptation with Ease | Code | 2
Single Image Iterative Subject-driven Generation and Editing | Code | 2
IRef-VLA: A Benchmark for Interactive Referential Grounding with Imperfect Language in 3D Scenes | Code | 2
Mixture of Lookup Experts | Code | 2
DnLUT: Ultra-Efficient Color Image Denoising via Channel-Aware Lookup Tables | Code | 2
DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding | Code | 2
SaMam: Style-aware State Space Model for Arbitrary Image Style Transfer | Code | 2
Bokehlicious: Photorealistic Bokeh Rendering with Controllable Apertures | Code | 2
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation | Code | 2
Tokenize Image as a Set | Code | 2
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching | Code | 2
Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation | Code | 2
Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model | Code | 2
UniHDSA: A Unified Relation Prediction Approach for Hierarchical Document Structure Analysis | Code | 2
M3: 3D-Spatial MultiModal Memory | Code | 2
Rapid patient-specific neural networks for intraoperative X-ray to volume registration | Code | 2
The Change You Want To Detect: Semantic Change Detection In Earth Observation With Hybrid Data Generation | Code | 2
Aligning Information Capacity Between Vision and Language via Dense-to-Sparse Feature Distillation for Image-Text Matching | Code | 2
LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning | Code | 2
High-Order Control Barrier Functions: Insights and a Truncated Taylor-Based Formulation | Code | 2
Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology | Code | 2
VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning | Code | 2
DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis | Code | 2
Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels | Code | 2
SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing | Code | 2
LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models | Code | 2
DAPO: An Open-Source LLM Reinforcement Learning System at Scale | Code | 2
Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning | Code | 2
Where do Large Vision-Language Models Look at when Answering Questions? | Code | 2
Advances in 4D Generation: A Survey | Code | 2
DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal | Code | 2
LEGNet: Lightweight Edge-Gaussian Driven Network for Low-Quality Remote Sensing Image Object Detection | Code | 2
Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting | Code | 2
Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models | Code | 2
PET-MAD, a universal interatomic potential for advanced materials modeling | Code | 2