SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 12751-12800 of 474278 papers

Title | Status | Hype
Seg-Wild: Interactive Segmentation based on 3D Gaussian Splatting for Unconstrained Image Collections | Code | 1
MIRIX: Multi-Agent Memory System for LLM-Based Agents | | 0
NLGCL: Naturally Existing Neighbor Layers Graph Contrastive Learning for Recommendation | Code | 1
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding | Code | 0
SAMO: A Lightweight Sharpness-Aware Approach for Multi-Task Optimization with Joint Global-Local Perturbation | Code | 0
HiM2SAM: Enhancing SAM2 with Hierarchical Motion Estimation and Memory Optimization towards Long-term Tracking | Code | 1
Goal-Oriented Sequential Bayesian Experimental Design for Causal Learning | | 0
Objectomaly: Objectness-Aware Refinement for OoD Segmentation with Structural Consistency and Boundary Precision | | 0
MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation | | 0
Rethinking Query-based Transformer for Continual Image Segmentation | Code | 1
Towards Interpretable Time Series Foundation Models | Code | 0
SCOOTER: A Human Evaluation Framework for Unrestricted Adversarial Examples | Code | 0
PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency | Code | 1
Towards High-Resolution 3D Anomaly Detection: A Scalable Dataset and Real-Time Framework for Subtle Industrial Defects | Code | 2
An Automated Classifier of Harmful Brain Activities for Clinical Usage Based on a Vision-Inspired Pre-trained Framework | | 0
Uncertainty Quantification for Motor Imagery BCI -- Machine Learning vs. Deep Learning | | 0
GNN-CNN: An Efficient Hybrid Model of Convolutional and Graph Neural Networks for Text Representation | Code | 0
You Don't Bring Me Flowers: Mitigating Unwanted Recommendations Through Conformal Risk Control | Code | 0
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory | | 0
Does Data Scaling Lead to Visual Compositional Generalization? | Code | 0
Unlocking Thermal Aerial Imaging: Synthetic Enhancement of UAV Datasets | Code | 0
DICE: Data Influence Cascade in Decentralized Learning | | 0
MST-Distill: Mixture of Specialized Teachers for Cross-Modal Knowledge Distillation | Code | 0
Robust Multimodal Large Language Models Against Modality Conflict | | 0
AdeptHEQ-FL: Adaptive Homomorphic Encryption for Federated Learning of Hybrid Classical-Quantum Models with Dynamic Layer Sparing | | 0
Instance-Wise Monotonic Calibration by Constrained Transformation | Code | 0
MADPOT: Medical Anomaly Detection with CLIP Adaptation and Partial Optimal Transport | Code | 0
Omni-Fusion of Spatial and Spectral for Hyperspectral Image Segmentation | Code | 0
MultiJustice: A Chinese Dataset for Multi-Party, Multi-Charge Legal Prediction | Code | 0
MCCD: A Multi-Attribute Chinese Calligraphy Character Dataset Annotated with Script Styles, Dynasties, and Calligraphers | Code | 0
Colors See Colors Ignore: Clothes Changing ReID with Color Disentanglement | Code | 0
ViDove: A Translation Agent System with Multimodal Context and Memory-Augmented Reasoning | Code | 0
Test-Time Scaling with Reflective Generative Model | Code | 0
Label-Efficient Chest X-ray Diagnosis via Partial CLIP Adaptation | Code | 0
Text-promptable Object Counting via Quantity Awareness Enhancement | Code | 0
DS@GT at CheckThat! 2025: Exploring Retrieval and Reranking Pipelines for Scientific Claim Source Retrieval on Social Media Discourse | Code | 0
Evaluating and Improving Robustness in Large Language Models: A Survey and Future Directions | Code | 0
XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs | Code | 0
Airway Segmentation Network for Enhanced Tubular Feature Extraction | Code | 0
Ambiguity-aware Point Cloud Segmentation by Adaptive Margin Contrastive Learning | Code | 0
FlexGaussian: Flexible and Cost-Effective Training-Free Compression for 3D Gaussian Splatting | Code | 0
Finetuning Vision-Language Models as OCR Systems for Low-Resource Languages: A Case Study of Manchu | Code | 0
Bridging the Last Mile of Prediction: Enhancing Time Series Forecasting with Conditional Guided Flow Matching | | 0
Orchestrator-Agent Trust: A Modular Agentic AI Visual Classification System with Trust-Aware Orchestration and RAG-Based Reasoning | Code | 0
Rethinking Verification for LLM Code Generation: From Generation to Testing | Code | 1
HVI-CIDNet+: Beyond Extreme Darkness for Low-Light Image Enhancement | Code | 1
The Dark Side of LLMs Agent-based Attacks for Complete Computer Takeover | | 0
Multi-Agent Retrieval-Augmented Framework for Evidence-Based Counterspeech Against Health Misinformation | | 0
VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation | | 0
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning | | 0