SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5926-5950 of 177340 papers

Title | Status | Hype
Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction | Code | 2
Segment This Thing: Foveated Tokenization for Efficient Point-Prompted Segmentation | Code | 2
Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration | Code | 2
Counting-Stars: A Multi-evidence, Position-aware, and Scalable Benchmark for Evaluating Long-Context Large Language Models | Code | 2
Flow-Guided Transformer for Video Inpainting | Code | 2
DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation | Code | 2
A Survey on Multimodal Benchmarks: In the Era of Large AI Models | Code | 2
SocialBench: Sociality Evaluation of Role-Playing Conversational Agents | Code | 2
Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms | Code | 2
Unrecognizable Yet Identifiable: Image Distortion with Preserved Embeddings | Code | 2
Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World | Code | 2
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond | Code | 2
MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild | Code | 2
Neural Fields with Thermal Activations for Arbitrary-Scale Super-Resolution | Code | 2
Pengi: An Audio Language Model for Audio Tasks | Code | 2
EMO-SUPERB: An In-depth Look at Speech Emotion Recognition | Code | 2
Latent Neural Operator for Solving Forward and Inverse PDE Problems | Code | 2
EVA3D: Compositional 3D Human Generation from 2D Image Collections | Code | 2
Tightly-Coupled LiDAR-IMU-Wheel Odometry with Online Calibration of a Kinematic Model for Skid-Steering Robots | Code | 2
CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement | Code | 2
Progressive Pretext Task Learning for Human Trajectory Prediction | Code | 2
Natural Language Reinforcement Learning | Code | 2
Deep Learning Based Automatic Modulation Recognition: Models, Datasets, and Challenges | Code | 2
Robust Human Matting via Semantic Guidance | Code | 2
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation | Code | 2