SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9601-9650 of 661570 papers

Title | Status | Hype
Ant Colony Sampling with GFlowNets for Combinatorial Optimization | Code | 2
LISO: Lidar-only Self-Supervised 3D Object Detection | Code | 2
Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models | Code | 2
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews | Code | 2
EarthLoc: Astronaut Photography Localization by Indexing Earth from Space | Code | 2
Eliminating Warping Shakes for Unsupervised Online Video Stitching | Code | 2
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? | Code | 2
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis | Code | 2
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System | Code | 2
Probabilistic Contrastive Learning for Long-Tailed Visual Recognition | Code | 2
MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology | Code | 2
RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback | Code | 2
CT2Rep: Automated Radiology Report Generation for 3D Medical Imaging | Code | 2
The pitfalls of next-token prediction | Code | 2
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement | Code | 2
DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency | Code | 2
Poly Kernel Inception Network for Remote Sensing Detection | Code | 2
VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models | Code | 2
V_kD: Improving Knowledge Distillation using Orthogonal Projections | Code | 2
RepoHyper: Search-Expand-Refine on Semantic Graphs for Repository-Level Code Completion | Code | 2
SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection | Code | 2
Lightning NeRF: Efficient Hybrid Scene Representation for Autonomous Driving | Code | 2
KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques | Code | 2
Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline | Code | 2
S^2IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting | Code | 2
MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process | Code | 2
A self-supervised CNN for image watermark removal | Code | 2
RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction | Code | 2
Audio-Synchronized Visual Animation | Code | 2
FedFMS: Exploring Federated Foundation Models for Medical Image Segmentation | Code | 2
DualBEV: Unifying Dual View Transformation with Probabilistic Correspondences | Code | 2
Advanced Millimeter-Wave Radar System for Real-Time Multiple-Human Tracking and Fall Detection | Code | 2
Frequency-Adaptive Dilated Convolution for Semantic Segmentation | Code | 2
Arbitrary-Scale Point Cloud Upsampling by Voxel-Based Network with Latent Geometric-Consistent Learning | Code | 2
GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM | Code | 2
Debiasing Multimodal Large Language Models | Code | 2
StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models | Code | 2
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models | Code | 2
IsolateGPT: An Execution Isolation Architecture for LLM-Based Agentic Systems | Code | 2
Beyond MOT: Semantic Multi-Object Tracking | Code | 2
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | Code | 2
Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance | Code | 2
XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution | Code | 2
Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery | Code | 2
Face2Diffusion for Fast and Editable Face Personalization | Code | 2
HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction | Code | 2
BjTT: A Large-scale Multimodal Dataset for Traffic Prediction | Code | 2
QAQ: Quality Adaptive Quantization for LLM KV Cache | Code | 2
BAGS: Blur Agnostic Gaussian Splatting through Multi-Scale Kernel Modeling | Code | 2
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error | Code | 2