SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 16001-16050 of 474,278 papers

| Title | Status | Hype |
| --- | --- | --- |
| Pushing the Limits of Extreme Weather: Constructing Extreme Heatwave Storylines with Differentiable Climate Models | Code | 0 |
| Efficiency Robustness of Dynamic Deep Learning Systems | Code | 0 |
| From Images to Insights: Explainable Biodiversity Monitoring with Plain Language Habitat Explanations | Code | 0 |
| MSTAR: Box-free Multi-query Scene Text Retrieval with Attention Recycling | Code | 0 |
| IQE-CLIP: Instance-aware Query Embedding for Zero-/Few-shot Anomaly Detection in Medical Domain | Code | 0 |
| Transformer IMU Calibrator: Dynamic On-body IMU Calibration for Inertial Motion Capture | Code | 1 |
| Automated Validation of Textual Constraints Against AutomationML via LLMs and SHACL | Code | 0 |
| A Study on Individual Spatiotemporal Activity Generation Method Using MCP-Enhanced Chain-of-Thought Large Language Models | Code | 0 |
| Constructing and Evaluating Declarative RAG Pipelines in PyTerrier | Code | 1 |
| Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs | Code | 2 |
| Equivariant Neural Diffusion for Molecule Generation | Code | 0 |
| AIR: Zero-shot Generative Model Adaptation with Iterative Refinement | Code | 0 |
| LightKG: Efficient Knowledge-Aware Recommendations with Simplified GNN Architecture | Code | 0 |
| Technical Report with Proofs for A Full Picture in Conformance Checking: Efficiently Summarizing All Optimal Alignments | | 0 |
| Advanced fraud detection using machine learning models: enhancing financial transaction security | | 0 |
| Uncertainty-Aware Deep Learning for Automated Skin Cancer Classification: A Comprehensive Evaluation | | 0 |
| Dense Associative Memory with Epanechnikov Energy | | 0 |
| Leveraging 6DoF Pose Foundation Models For Mapping Marine Sediment Burial | Code | 0 |
| Specification and Evaluation of Multi-Agent LLM Systems -- Prototype and Cybersecurity Applications | Code | 0 |
| Precise Zero-Shot Pointwise Ranking with LLMs through Post-Aggregated Global Context Information | Code | 0 |
| Predicting function of evolutionarily implausible DNA sequences | Code | 0 |
| Prompts to Summaries: Zero-Shot Language-Guided Video Summarization | | 0 |
| PiPViT: Patch-based Visual Interpretable Prototypes for Retinal Image Analysis | Code | 0 |
| On the role of non-linear latent features in bipartite generative neural networks | | 0 |
| Occlusion-Aware 3D Hand-Object Pose Estimation with Masked AutoEncoders | | 0 |
| M4V: Multi-Modal Mamba for Text-to-Video Generation | | 0 |
| OmniFluids: Unified Physics Pre-trained Modeling of Fluid Dynamics | | 0 |
| ME: Trigger Element Combination Backdoor Attack on Copyright Infringement | | 0 |
| Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges | Code | 0 |
| System Identification Using Kolmogorov-Arnold Networks: A Case Study on Buck Converters | | 0 |
| ConTextTab: A Semantics-Aware Tabular In-Context Learner | Code | 2 |
| ReconMOST: Multi-Layer Sea Temperature Reconstruction with Observations-Guided Diffusion | Code | 0 |
| Post-Training Quantization for Video Matting | | 0 |
| On feature selection in double-imbalanced data settings: a Random Forest approach | | 0 |
| SWDL: Stratum-Wise Difference Learning with Deep Laplacian Pyramid for Semi-Supervised 3D Intracranial Hemorrhage Segmentation | Code | 0 |
| Towards Understanding Bias in Synthetic Data for Evaluation | Code | 0 |
| Contrastive Matrix Completion with Denoising and Augmented Graph Views for Robust Recommendation | Code | 0 |
| ContextRefine-CLIP for EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2025 | Code | 0 |
| Using Language and Road Manuals to Inform Map Reconstruction for Autonomous Driving | | 0 |
| LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs | | 0 |
| MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices | | 0 |
| LRSLAM: Low-rank Representation of Signed Distance Fields in Dense Visual SLAM System | | 0 |
| DreamActor-H1: High-Fidelity Human-Product Demonstration Video Generation via Motion-designed Diffusion Transformers | | 0 |
| GigaVideo-1: Advancing Video Generation via Automatic Feedback with 4 GPU-Hours Fine-Tuning | | 0 |
| Improving Medical Visual Representation Learning with Pathological-level Cross-Modal Alignment and Correlation Exploration | | 0 |
| WGSR-Bench: Wargame-based Game-theoretic Strategic Reasoning Benchmark for Large Language Models | | 0 |
| Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs | Code | 2 |
| Self-learning signal classifier for decameter coherent scatter radars | | 0 |
| Demonstrating Multi-Suction Item Picking at Scale via Multi-Modal Learning of Pick Success | | 0 |
| Macro Graph of Experts for Billion-Scale Multi-Task Recommendation | | 0 |
Page 321 of 9486