SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9301-9350 of 661570 papers

| Title | Status | Hype |
| --- | --- | --- |
| SiMO: Single-Modality-Operable Multimodal Collaborative Perception | Code | 0 |
| ΔVLA: Prior-Guided Vision-Language-Action Models via World Knowledge Variation | Code | 0 |
| Spherical-GOF: Geometry-Aware Panoramic Gaussian Opacity Fields for 3D Scene Reconstruction | Code | 0 |
| Latent Speech-Text Transformer | Code | 0 |
| Infusion: Shaping Model Behavior by Editing Training Data via Influence Functions | Code | 0 |
| SoftJAX & SoftTorch: Empowering Automatic Differentiation Libraries with Informative Gradients | Code | 0 |
| BiCLIP: Domain Canonicalization via Structured Geometric Transformation | Code | 0 |
| FVG-PT: Adaptive Foreground View-Guided Prompt Tuning for Vision-Language Models | Code | 0 |
| RedSage: A Cybersecurity Generalist LLM | | 1 |
| CARE-Edit: Condition-Aware Routing of Experts for Contextual Image Editing | | 1 |
| VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning? | | 1 |
| NaviDriveVLM: Decoupling High-Level Reasoning and Motion Planning for Autonomous Driving | | 1 |
| HiAR: Efficient Autoregressive Long Video Generation via Hierarchical Denoising | | 2 |
| In-Context Reinforcement Learning for Tool Use in Large Language Models | | 1 |
| Adaptation of Agentic AI: A Survey of Post-Training, Memory, and Skills | | 4 |
| π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs | | 1 |
| LatentMem: Customizing Latent Memory for Multi-Agent Systems | | 1 |
| $OneMillion-Bench: How Far are Language Agents from Human Experts? | | 1 |
| WildActor: Unconstrained Identity-Preserving Video Generation | | 2 |
| OfficeQA Pro: An Enterprise Benchmark for End-to-End Grounded Reasoning | | 2 |
| Beyond ReinMax: Low-Variance Gradient Estimators for Discrete Latent Variables | | 0 |
| Light of Normals: Unified Feature Representation for Universal Photometric Stereo | | 3 |
| SYNAPSE: Framework for Neuron Analysis and Perturbation in Sequence Encoding | | 0 |
| See and Switch: Vision-Based Branching for Interactive Robot-Skill Programming | | 0 |
| DualFlexKAN: Dual-stage Kolmogorov-Arnold Networks with Independent Function Control | | 0 |
| R2F: Repurposing Ray Frontiers for LLM-free Object Navigation | | 0 |
| Beyond Benchmarks: Dynamic, Automatic And Systematic Red-Teaming Agents For Trustworthy Medical Language Models | | 0 |
| Detecting AI-Generated Images via Diffusion Snap-Back Reconstruction: A Forensic Approach | | 0 |
| Are We Winning the Wrong Game? Revisiting Evaluation Practices for Long-Term Time Series Forecasting | | 0 |
| Power Couple? AI Growth and Renewable Energy Investment | | 0 |
| Generating Hierarchical JSON Representations of Scientific Sentences Using LLMs | | 0 |
| MDKeyChunker: Single-Call LLM Enrichment with Rolling Keys and Key-Based Restructuring for High-Accuracy RAG | | 0 |
| Not All Pretraining are Created Equal: Threshold Tuning and Class Weighting for Imbalanced Polarization Tasks in Low-Resource Settings | | 0 |
| Beyond Hard Constraints: Budget-Conditioned Reachability For Safe Offline Reinforcement Learning | | 0 |
| Emergency Lane-Change Simulation: A Behavioral Guidance Approach for Risky Scenario Generation | | 0 |
| Writing literature reviews with AI: principles, hurdles and some lessons learned | | 0 |
| CDEoH: Category-Driven Automatic Algorithm Design With Large Language Models | | 0 |
| Beam-aware Kernelized Contextual Bandits for User Association and Beamforming in mmWave Vehicular Networks | | 0 |
| Generalized Stock Price Prediction for Multiple Stocks Combined with News Fusion | | 0 |
| Engineering Verifiable Modularity in Transformers via Per-Layer Supervision | | 0 |
| Quine: Realizing LLM Agents as Native POSIX Processes | | 0 |
| InfoMamba: An Attention-Free Hybrid Mamba-Transformer Model | | 0 |
| What on Earth is AlphaEarth? Hierarchical structure and functional interpretability for global land cover | | 0 |
| Did You Check the Right Pocket? Cost-Sensitive Store Routing for Memory-Augmented Agents | | 0 |
| Machine Learning Based Identification of Solvents from Post-Desiccation Patterns | | 0 |
| Attribution-Guided Model Rectification of Unreliable Neural Network Behaviors | | 0 |
| Local Precise Refinement: A Dual-Gated Mixture-of-Experts for Enhancing Foundation Model Generalization against Spectral Shifts | | 0 |
| Optimizing LLM Annotation of Classroom Discourse through Multi-Agent Orchestration | | 0 |
| Context-Enriched Natural Language Descriptions of Vessel Trajectories | | 0 |
| From Garbage to Gold: A Data-Architectural Theory of Predictive Robustness | | 0 |