SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 16651-16700 of 474,278 papers

Title | Status | Hype
Q-Adapt: Adapting LMM for Visual Quality Assessment with Progressive Instruction Tuning | Code | 1
Urban Computing in the Era of Large Language Models | Code | 1
Efficient Constant-Space Multi-Vector Retrieval | Code | 1
DEPTHOR: Depth Enhancement from a Practical Light-Weight dToF Sensor and RGB Image | Code | 1
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining | Code | 1
Slow-Fast Architecture for Video Multi-Modal Large Language Models | Code | 1
ProtoGCD: Unified and Unbiased Prototype Learning for Generalized Category Discovery | Code | 1
Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks | Code | 1
GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning | Code | 1
BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing | Code | 1
Is the Reversal Curse a Binding Problem? Uncovering Limitations of Transformers from a Basic Generalization Failure | Code | 1
Memory-efficient Low-latency Remote Photoplethysmography through Temporal-Spatial State Space Duality | Code | 1
ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning | Code | 1
From Shadows to Safety: Occlusion Tracking and Risk Mitigation for Urban Autonomous Driving | Code | 1
Decoding Covert Speech from EEG Using a Functional Areas Spatio-Temporal Transformer | Code | 1
InvFussion: Bridging Supervised and Zero-shot Diffusion for Inverse Problems | Code | 1
Mesh Mamba: A Unified State Space Model for Saliency Prediction in Non-Textured and Textured Meshes | Code | 1
Quattro: Transformer-Accelerated Iterative Linear Quadratic Regulator Framework for Fast Trajectory Optimization | Code | 1
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models? | Code | 1
STPNet: Scale-aware Text Prompt Network for Medical Image Segmentation | Code | 1
GSR4B: Biomass Map Super-Resolution with Sentinel-1/2 Guidance | Code | 1
v-CLR: View-Consistent Learning for Open-World Instance Segmentation | Code | 1
FeatInsight: An Online ML Feature Management System on 4Paradigm Sage-Studio Platform | Code | 1
Probabilistically safe and efficient model-based Reinforcement Learning | Code | 1
Robust LiDAR-Camera Calibration with 2D Gaussian Splatting | Code | 1
MPCritic: A plug-and-play MPC architecture for reinforcement learning | Code | 1
Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation | Code | 1
Near Field Localization via AI-Aided Subspace Methods | Code | 1
SeizureTransformer: Scaling U-Net with Transformer for Simultaneous Time-Step Level Seizure Detection from Long EEG Recordings | Code | 1
Automated Explanation of Machine Learning Models of Footballing Actions in Words | Code | 1
Effect-driven interpretation: Functors for natural language composition | Code | 1
A Doubly Decoupled Network for edge detection | Code | 1
Learning to Normalize on the SPD Manifold under Bures-Wasserstein Geometry | Code | 1
LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models | Code | 1
GLiNER-BioMed: A Suite of Efficient Models for Open Biomedical Named Entity Recognition | Code | 1
Flow Matching on Lie Groups | Code | 1
Multimodal LLMs for OCR, OCR Post-Correction, and Named Entity Recognition in Historical Documents | Code | 1
Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute | Code | 1
Improved Visual-Spatial Reasoning via R1-Zero-Like Training | Code | 1
CellVTA: Enhancing Vision Foundation Models for Accurate Cell Segmentation and Classification | Code | 1
SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning | Code | 1
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization | Code | 1
WikiVideo: Article Generation from Multiple Videos | Code | 1
IMPACT: A Generic Semantic Loss for Multimodal Medical Image Registration | Code | 1
It's a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data | Code | 1
MaintainCoder: Maintainable Code Generation Under Dynamic Requirements | Code | 1
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization | Code | 1
Towards Understanding How Knowledge Evolves in Large Vision-Language Models | Code | 1
Times2D: Multi-Period Decomposition and Derivative Mapping for General Time Series Forecasting | Code | 1
GenSwarm: Scalable Multi-Robot Code-Policy Generation and Deployment via Language Models | Code | 1
Page 334 of 9486