SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 2601-2650 of 659,983 papers

Title | Status | Hype
Are EEG-to-Text Models Working? | Code | 3
Verdict: A Library for Scaling Judge-Time Compute | Code | 3
Compact 3D Scene Representation via Self-Organizing Gaussian Grids | Code | 3
StyleGAN-Human: A Data-Centric Odyssey of Human Generation | Code | 3
TiRex: Zero-Shot Forecasting Across Long and Short Horizons with Enhanced In-Context Learning | Code | 3
Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python | Code | 3
CLOVER: Cross-Layer Orthogonal Vectors Pruning and Fine-Tuning | Code | 3
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models | Code | 3
MaxViT: Multi-Axis Vision Transformer | Code | 3
A Survey of Large Language Models for Graphs | Code | 3
SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic | Code | 3
TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios | Code | 3
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models | Code | 3
Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey | Code | 3
Panza: Design and Analysis of a Fully-Local Personalized Text Writing Assistant | Code | 3
TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting | Code | 3
On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards | Code | 3
RFUAV: A Benchmark Dataset for Unmanned Aerial Vehicle Detection and Identification | Code | 3
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts | Code | 3
Detect Anything 3D in the Wild | Code | 3
Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast | Code | 3
Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation | Code | 3
Unlimiformer: Long-Range Transformers with Unlimited Length Input | Code | 3
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation | Code | 3
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization | Code | 3
Dataset Distillation with Neural Characteristic Function: A Minmax Perspective | Code | 3
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation | Code | 3
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Code | 3
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning | Code | 3
Towards Modern Image Manipulation Localization: A Large-Scale Dataset and Novel Methods | Code | 3
AGD: an Auto-switchable Optimizer using Stepwise Gradient Difference for Preconditioning Matrix | Code | 3
Language-based Audio Moment Retrieval | Code | 3
Unified Data Management and Comprehensive Performance Evaluation for Urban Spatial-Temporal Prediction [Experiment, Analysis & Benchmark] | Code | 3
A Chinese Dataset for Evaluating the Safeguards in Large Language Models | Code | 3
Multi-Modality Representation Learning for Antibody-Antigen Interactions Prediction | Code | 3
Improving Alignment and Robustness with Circuit Breakers | Code | 3
The OpenLAM Challenges | Code | 3
TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments | Code | 3
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding Reasoning and Planning | Code | 3
InstructIE: A Bilingual Instruction-based Information Extraction Dataset | Code | 3
Reservoir History Matching of the Norne field with generative exotic priors and a coupled Mixture of Experts -- Physics Informed Neural Operator Forward Model | Code | 3
FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally | Code | 3
Vision-based 3D occupancy prediction in autonomous driving: a review and outlook | Code | 3
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models | Code | 3
PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° | Code | 3
Drone Data Analytics for Measuring Traffic Metrics at Intersections in High-Density Areas | Code | 3
A Survey on Video Action Recognition in Sports: Datasets, Methods and Applications | Code | 3
UrbanGPT: Spatio-Temporal Large Language Models | Code | 3
Direct Retrieval-augmented Optimization: Synergizing Knowledge Selection and Language Models | Code | 3
ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation | Code | 3