SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 16401–16450 of 474278 papers

Title | Status | Hype
AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection | Code | 1
The Tenth NTIRE 2025 Image Denoising Challenge Report | Code | 1
SkeletonX: Data-Efficient Skeleton-based Action Recognition via Cross-sample Feature Aggregation | Code | 1
Evaluating the Goal-Directedness of Large Language Models | Code | 1
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Code | 1
Activated LoRA: Fine-tuned LLMs for Intrinsics | Code | 1
Climate-economy projections under shared socioeconomic pathways and net-zero scenarios | Code | 1
DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging | Code | 1
HLS-Eval: A Benchmark and Framework for Evaluating LLMs on High-Level Synthesis Design Tasks | Code | 1
Progent: Programmable Privilege Control for LLM Agents | Code | 1
InjectLab: A Tactical Framework for Adversarial Threat Modeling Against Large Language Models | Code | 1
The Hitchhiker's Guide to Program Analysis, Part II: Deep Thoughts by LLMs | Code | 1
Search is All You Need for Few-shot Anomaly Detection | Code | 1
Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach | Code | 1
GrabS: Generative Embodied Agent for 3D Object Segmentation without Scene Supervision | Code | 1
Robust MPC for Uncertain Linear Systems -- Combining Model Adaptation and Iterative Learning | Code | 1
MSCRS: Multi-modal Semantic Graph Prompt Learning Framework for Conversational Recommender Systems | Code | 1
Adaptive Decision Boundary for Few-Shot Class-Incremental Learning | Code | 1
Deep Learning in Concealed Dense Prediction | Code | 1
Deep Learning-based Bathymetry Retrieval without In-situ Depths using Remote Sensing Imagery and SfM-MVS DSMs with Data Gaps | Code | 1
Change State Space Models for Remote Sensing Change Detection | Code | 1
PraNet-V2: Dual-Supervised Reverse Attention for Medical Image Segmentation | Code | 1
Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections | Code | 1
LazyReview A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews | Code | 1
Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion | Code | 1
Explicit and Implicit Representations in AI-based 3D Reconstruction for Radiology: A Systematic Review | Code | 1
A Dual-Space Framework for General Knowledge Distillation of Large Language Models | Code | 1
DRIFT open dataset: A drone-derived intelligence for traffic analysis in urban environmen | Code | 1
Teaching Large Language Models to Reason through Learning and Forgetting | Code | 1
R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning | Code | 1
Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation | Code | 1
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception | Code | 1
SafeSpeech: Robust and Universal Voice Protection Against Malicious Speech Synthesis | Code | 1
MonoDiff9D: Monocular Category-Level 9D Object Pose Estimation via Diffusion Model | Code | 1
FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation | Code | 1
RealWebAssist: A Benchmark for Long-Horizon Web Assistance with Real-World Users | Code | 1
Efficient Process Reward Model Training via Active Learning | Code | 1
EmbodiedAgent: A Scalable Hierarchical Approach to Overcome Practical Challenge in Multi-Robot Control | Code | 1
Towards Low-Latency Event-based Obstacle Avoidance on a FPGA-Drone | Code | 1
Efficient Generative Model Training via Embedded Representation Warmup | Code | 1
DNF-Avatar: Distilling Neural Fields for Real-time Animatable Avatar Relighting | Code | 1
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing | Code | 1
DUE: A Deep Learning Framework and Library for Modeling Unknown Equations | Code | 1
Focus on Local: Finding Reliable Discriminative Regions for Visual Place Recognition | Code | 1
TinyverseGP: Towards a Modular Cross-domain Benchmarking Framework for Genetic Programming | Code | 1
SoccerNet-v3D: Leveraging Sports Broadcast Replays for 3D Scene Understanding | Code | 1
Multimodal Long Video Modeling Based on Temporal Dynamic Context | Code | 1
TAMP: Token-Adaptive Layerwise Pruning in Multimodal Large Language Models | Code | 1
Attention GhostUNet++: Enhanced Segmentation of Adipose Tissue and Liver in CT Images | Code | 1
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models | Code | 1