SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 12701-12750 of 474,278 papers

Title | Status | Hype
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study | Code | 2
Zero-Shot Video Semantic Segmentation based on Pre-Trained Diffusion Models | Code | 2
Distributed Global Structure-from-Motion with a Deep Front-End | Code | 2
Solver-in-the-Loop: Learning from Differentiable Physics to Interact with Iterative PDE-Solvers | Code | 2
FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes | Code | 2
Remote Bio-Sensing: Open Source Benchmark Framework for Fair Evaluation of rPPG | Code | 2
Stream of Search (SoS): Learning to Search in Language | Code | 2
TotalVibeSegmentator: Full Body MRI Segmentation for the NAKO and UK Biobank | Code | 2
Tracking Anything in High Quality | Code | 2
R1-Track: Direct Application of MLLMs to Visual Object Tracking via Reinforcement Learning | Code | 2
Cross Language Image Matching for Weakly Supervised Semantic Segmentation | Code | 2
Discovering Latent Knowledge in Language Models Without Supervision | Code | 2
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search | Code | 2
DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genome | Code | 2
ProteinInvBench: Benchmarking Protein Inverse Folding on Diverse Tasks, Models, and Metrics | Code | 2
Multi-Stage Manipulation with Demonstration-Augmented Reward, Policy, and World Model Learning | Code | 2
Equivariant Graph Neural Operator for Modeling 3D Dynamics | Code | 2
Positional Encoder Graph Quantile Neural Networks for Geographic Data | Code | 2
Using the IBM Analog In-Memory Hardware Acceleration Kit for Neural Network Training and Inference | Code | 2
FloorSet -- a VLSI Floorplanning Dataset with Design Constraints of Real-World SoCs | Code | 2
PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change | Code | 2
SuperPoint-SLAM3: Augmenting ORB-SLAM3 with Deep Features, Adaptive NMS, and Learning-Based Loop Closure | Code | 2
Cross-modal Orthogonal High-rank Augmentation for RGB-Event Transformer-trackers | Code | 2
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? | Code | 2
Idiosyncrasies in Large Language Models | Code | 2
Longitudinal Segmentation of MS Lesions via Temporal Difference Weighting | Code | 2
Learning Robust Stereo Matching in the Wild with Selective Mixture-of-Experts | Code | 2
ICASSP 2022 Acoustic Echo Cancellation Challenge | Code | 2
EASI-Tex: Edge-Aware Mesh Texturing from Single Image | Code | 2
Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models | Code | 2
Accurate Leukocyte Detection Based on Deformable-DETR and Multi-Level Feature Fusion for Aiding Diagnosis of Blood Diseases | Code | 2
HCF-Net: Hierarchical Context Fusion Network for Infrared Small Object Detection | Code | 2
Attention-based CNN-LSTM and XGBoost hybrid model for stock prediction | Code | 2
IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS | Code | 2
Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning? | Code | 2
SFPNet: Sparse Focal Point Network for Semantic Segmentation on General LiDAR Point Clouds | Code | 2
Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models | Code | 2
GinAR: An End-To-End Multivariate Time Series Forecasting Model Suitable for Variable Missing | Code | 2
FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching | Code | 2
EVOR: Evolving Retrieval for Code Generation | Code | 2
CenterFormer: Center-based Transformer for 3D Object Detection | Code | 2
Natural Language Fine-Tuning | Code | 2
Compression-Aware One-Step Diffusion Model for JPEG Artifact Removal | Code | 2
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems | Code | 2
Implicit Neural Representation in Medical Imaging: A Comparative Survey | Code | 2
LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models | Code | 2
DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion | Code | 2
Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph | Code | 2
Quantifying the Plausibility of Context Reliance in Neural Machine Translation | Code | 2
Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving | Code | 2