SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 17601 to 17650 of 474278 papers

Title | Status | Hype
Gumbel-max List Sampling for Distribution Coupling with Multiple Samples | | 0
Efficient Robust Conformal Prediction via Lipschitz-Bounded Networks | Code | 0
Noninvasive precision modulation of high-level neural population activity via natural vision perturbations | Code | 0
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models | | 0
MTPNet: Multi-Grained Target Perception for Unified Activity Cliff Prediction | Code | 1
An SCMA Receiver for 6G NTN based on Multi-Task Learning | | 0
Joint Beamforming and Integer User Association using a GNN with Gumbel-Softmax Reparameterizations | | 0
MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | Code | 9
Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning | | 0
LLM-First Search: Self-Guided Exploration of the Solution Space | Code | 1
Demonstrations of Integrity Attacks in Multi-Agent Systems | | 0
LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning | Code | 0
PixCell: A generative foundation model for digital histopathology images | | 0
Reasoning or Overthinking: Evaluating Large Language Models on Financial Sentiment Analysis | | 0
Adaptive Preconditioners Trigger Loss Spikes in Adam | | 0
Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning | Code | 0
DM-SegNet: Dual-Mamba Architecture for 3D Medical Image Segmentation with Global Context Modeling | | 0
SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing | Code | 0
Knowledgeable-r1: Policy Optimization for Knowledge Exploration in Retrieval-Augmented Generation | Code | 0
Counterfactual reasoning: an analysis of in-context emergence | Code | 0
DACN: Dual-Attention Convolutional Network for Hyperspectral Image Super-Resolution | Code | 0
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Code | 2
SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs | Code | 2
On the Comprehensibility of Multi-structured Financial Documents using LLMs and Pre-processing Tools | Code | 0
Can Foundation Models Generalise the Presentation Attack Detection Capabilities on ID Cards? | | 0
TextVidBench: A Benchmark for Long Video Scene Text Understanding | | 0
Neural Inverse Rendering from Propagating Light | | 0
Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting | | 0
ContentV: Efficient Training of Video Generation Models with Limited Compute | | 0
Robustness as Architecture: Designing IQA Models to Withstand Adversarial Perturbations | | 0
APVR: Hour-Level Long Video Understanding with Adaptive Pivot Visual Information Retrieval | | 0
Bringing SAM to new heights: Leveraging elevation data for tree crown segmentation from drone imagery | | 0
Multi-scale Image Super Resolution with a Single Auto-Regressive Model | | 0
Interpretable Multimodal Framework for Human-Centered Street Assessment: Integrating Visual-Language Models for Perceptual Urban Diagnostics | | 0
PATS: Proficiency-Aware Temporal Sampling for Multi-View Sports Skill Assessment | | 0
Beyond Cropped Regions: New Benchmark and Corresponding Baseline for Chinese Scene Text Retrieval in Diverse Layouts | | 0
Structure-Aware Radar-Camera Depth Estimation | | 0
Point Cloud Segmentation of Agricultural Vehicles using 3D Gaussian Splatting | | 0
UAV4D: Dynamic Neural Rendering of Human-Centric UAV Imagery using Gaussian Splatting | | 0
A Survey on Vietnamese Document Analysis and Recognition: Challenges and Future Directions | | 0
FG 2025 TrustFAA: the First Workshop on Towards Trustworthy Facial Affect Analysis: Advancing Insights of Fairness, Explainability, and Safety (TrustFAA) | | 0
DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models | | 0
CIVET: Systematic Evaluation of Understanding in VLMs | | 0
FRED: The Florence RGB-Event Drone Dataset | | 0
Track Any Anomalous Object: A Granular Video Anomaly Detection Pipeline | | 0
Vision-Based Autonomous MM-Wave Reflector Using ArUco-Driven Angle-of-Arrival Estimation | | 0
EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World? | | 0
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | | 0
Unleashing Hour-Scale Video Training for Long Video-Language Understanding | | 0
Refer to Anything with Vision-Language Prompts | | 0
Page 353 of 9486