SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2301–2350 of 659,983 papers

| Title | Status | Hype |
| --- | --- | --- |
| SR-Nav: Spatial Relationships Matter for Zero-shot Object Goal Navigation | Code | 0 |
| HAViT: Historical Attention Vision Transformer | Code | 0 |
| MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution | Code | 0 |
| HORNet: Task-Guided Frame Selection for Video Question Answering with Vision-Language Models | Code | 0 |
| DriftGuard: Mitigating Asynchronous Data Drift in Federated Learning | Code | 0 |
| Multi-Modal Building Change Detection for Large-Scale Small Changes: Benchmark and Baseline | Code | 0 |
| MIDST Challenge at SaTML 2025: Membership Inference over Diffusion-models-based Synthetic Tabular data | Code | 0 |
| Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding | Code | 0 |
| AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents | Code | 0 |
| A Multicenter Benchmark of Multiple Instance Learning Models for Lymphoma Subtyping from HE-stained Whole Slide Images | Code | 0 |
| Offline Materials Optimization with CliqueFlowmer | Code | 0 |
| Towards Onboard Continuous Change Detection for Floods | Code | 0 |
| Farther the Shift, Sparser the Representation: Analyzing OOD Mechanisms in LLMs | Code | 0 |
| Cell-Type Prototype-Informed Neural Network for Gene Expression Estimation from Pathology Images | Code | 0 |
| Translating MRI to PET through Conditional Diffusion Models with Enhanced Pathology Awareness | Code | 0 |
| AJAR: Adaptive Jailbreak Architecture for Red-teaming | Code | 0 |
| Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens | Code | 0 |
| ELiC: Efficient LiDAR Geometry Compression via Cross-Bit-depth Feature Propagation and Bag-of-Encoders | Code | 0 |
| MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction | | 1 |
| ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents | | 1 |
| EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing | | 2 |
| TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation | | 2 |
| Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer | | 1 |
| LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans | | 3 |
| Infherno: End-to-end Agent-based FHIR Resource Synthesis from Free-form Clinical Notes | | 1 |
| Matryoshka Gaussian Splatting | | 1 |
| Safety is Non-Compositional: A Formal Framework for Capability-Based AI Systems | | 0 |
| TrajBooster: Boosting Humanoid Whole-Body Manipulation via Trajectory-Centric Learning | | 2 |
| From Far and Near: Perceptual Evaluation of Crowd Representations Across Levels of Detail | | 0 |
| LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling | Code | 0 |
| Full waveform inversion method based on diffusion model | | 0 |
| CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset Allocation from Daily Trending Financial News | | 0 |
| Memory Bear AI Memory Science Engine for Multimodal Affective Intelligence: A Technical Report | | 0 |
| On the Fragility of AI Agent Collusion | | 0 |
| STAC: Plug-and-Play Spatio-Temporal Aware Cache Compression for Streaming 3D Reconstruction | | 0 |
| AgentComm-Bench: Stress-Testing Cooperative Embodied AI Under Latency, Packet Loss, and Bandwidth Collapse | | 0 |
| Efficient Visual Anomaly Detection at the Edge: Enabling Real-Time Industrial Inspection on Resource-Constrained Devices | | 0 |
| Remote Sensing Image Dehazing: A Systematic Review of Progress, Challenges, and Prospects | | 0 |
| Transparent Fragments Contour Estimation via Visual-Tactile Fusion for Autonomous Reassembly | | 0 |
| Grounded Multimodal Retrieval-Augmented Drafting of Radiology Impressions Using Case-Based Similarity Search | | 0 |
| Mathematical Modeling of Cancer-Bacterial Therapy: Analysis and Numerical Simulation via Physics-Informed Neural Networks | | 0 |
| Goedel-Code-Prover: Hierarchical Proof Search for Open State-of-the-Art Code Verification | | 0 |
| PAI: Fast, Accurate, and Full Benchmark Performance Projection with AI | | 0 |
| FalconBC: Flow matching for Amortized inference of Latent-CONditioned physiologic Boundary Conditions | | 0 |
| DriveVLM-RL: Neuroscience-Inspired Reinforcement Learning with Vision-Language Models for Safe and Deployable Autonomous Driving | | 0 |
| WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows | | 0 |
| TeachingCoach: A Fine-Tuned Scaffolding Chatbot for Instructional Guidance to Instructors | | 0 |
| How Psychological Learning Paradigms Shaped and Constrained Artificial Intelligence | | 0 |
| Computation-Utility-Privacy Tradeoffs in Bayesian Estimation | | 0 |
| Path-Constrained Mixture-of-Experts | | 0 |
Page 47 of 13200