SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 17001-17050 of 474278 papers

Title | Status | Hype
MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling | Code | 1
Improving Adaptive Density Control for 3D Gaussian Splatting | Code | 1
Empowering Smaller Models: Tuning LLaMA and Gemma with Chain-of-Thought for Ukrainian Exam Tasks | Code | 1
FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification | Code | 1
Anomaly-Flow: A Multi-domain Federated Generative Adversarial Network for Distributed Denial-of-Service Detection | Code | 1
SimWorld: A Unified Benchmark for Simulator-Conditioned Scene Generation via World Model | Code | 1
MamBEV: Enabling State Space Models to Learn Birds-Eye-View Representations | Code | 1
DIFFVSGG: Diffusion-Driven Online Video Scene Graph Generation | Code | 1
EEG-CLIP : Learning EEG representations from natural language descriptions | Code | 1
JuDGE: Benchmarking Judgment Document Generation for Chinese Legal System | Code | 1
Robust Object Detection of Underwater Robot based on Domain Generalization | Code | 1
RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving | Code | 1
State Space Model Meets Transformer: A New Paradigm for 3D Object Detection | Code | 1
MP-GUI: Modality Perception with MLLMs for GUI Understanding | Code | 1
Make Your Training Flexible: Towards Deployment-Efficient Video Models | Code | 1
AIGVE-Tool: AI-Generated Video Evaluation Toolkit with Multifaceted Benchmark | Code | 1
DPImageBench: A Unified Benchmark for Differentially Private Image Synthesis | Code | 1
RoGSplat: Learning Robust Generalizable Human Gaussian Splatting from Sparse Multi-View Images | Code | 1
A Comprehensive Survey on Cross-Domain Recommendation: Taxonomy, Progress, and Prospects | Code | 1
Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM | Code | 1
Multi-Prototype Embedding Refinement for Semi-Supervised Medical Image Segmentation | Code | 1
Inferring Event Descriptions from Time Series with Language Models | Code | 1
Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives | Code | 1
UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network | Code | 1
Scale Efficient Training for Large Datasets | Code | 1
Mind the Gap: Confidence Discrepancy Can Guide Federated Semi-Supervised Learning Across Pseudo-Mismatch | Code | 1
Interpretable Unsupervised Joint Denoising and Enhancement for Real-World low-light Scenarios | Code | 1
TriLiteNet: Lightweight Model for Multi-Task Visual Perception | Code | 1
Strain Problems got you in a Twist? Try StrainRelief: A Quantum-Accurate Tool for Ligand Strain Calculations | Code | 1
MSWAL: 3D Multi-class Segmentation of Whole Abdominal Lesions Dataset | Code | 1
DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models | Code | 1
DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective | Code | 1
SatDepth: A Novel Dataset for Satellite Image Matching | Code | 1
From Zero to Detail: Deconstructing Ultra-High-Definition Image Restoration from Progressive Spectral Perspective | Code | 1
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models | Code | 1
An interpretable approach to automating the assessment of biofouling in video footage | Code | 1
Sampling Innovation-Based Adaptive Compressive Sensing | Code | 1
Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos | Code | 1
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules | Code | 1
A General Adaptive Dual-level Weighting Mechanism for Remote Sensing Pansharpening | Code | 1
Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation | Code | 1
How Good is my Histopathology Vision-Language Foundation Model? A Holistic Benchmark | Code | 1
Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference | Code | 1
Can Language Models Follow Multiple Turns of Entangled Instructions? | Code | 1
Prompt Flow Integrity to Prevent Privilege Escalation in LLM Agents | Code | 1
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research | Code | 1
FNSE-SBGAN: Far-field Speech Enhancement with Schrodinger Bridge and Generative Adversarial Networks | Code | 1
Grounded Chain-of-Thought for Multimodal Large Language Models | Code | 1
UCF-Crime-DVS: A Novel Event-Based Dataset for Video Anomaly Detection with Spiking Neural Networks | Code | 1
FedVSR: Towards Model-Agnostic Federated Learning in Video Super-Resolution | Code | 1
Page 341 of 9486