SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 2651–2700 of 177,339 papers

Title | Status | Hype
CausalVLR: A Toolbox and Benchmark for Visual-Linguistic Causal Reasoning | Code | 3
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models | Code | 3
MLZero: A Multi-Agent System for End-to-end Machine Learning Automation | Code | 3
Deformable DETR: Deformable Transformers for End-to-End Object Detection | Code | 3
VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation | Code | 3
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't | Code | 3
Vine Copulas as Differentiable Computational Graphs | Code | 3
Safe RLHF: Safe Reinforcement Learning from Human Feedback | Code | 3
Predicting from Strings: Language Model Embeddings for Bayesian Optimization | Code | 3
Discovering Language Model Behaviors with Model-Written Evaluations | Code | 3
A Survey of Camouflaged Object Detection and Beyond | Code | 3
MCTrack: A Unified 3D Multi-Object Tracking Framework for Autonomous Driving | Code | 3
Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents | Code | 3
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition | Code | 3
A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond | Code | 3
MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo | Code | 3
Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video | Code | 3
MyoSuite -- A contact-rich simulation suite for musculoskeletal motor control | Code | 3
Effects of charging and discharging capabilities on trade-offs between model accuracy and computational efficiency in pumped thermal electricity storage | Code | 3
Evolving from Single-modal to Multi-modal Facial Deepfake Detection: A Survey | Code | 3
Towards Kinetic Manipulation of the Latent Space | Code | 3
Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation | Code | 3
AA-CLIP: Enhancing Zero-shot Anomaly Detection via Anomaly-Aware CLIP | Code | 3
xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart | Code | 3
Open-Source Skull Reconstruction with MONAI | Code | 3
MMedAgent: Learning to Use Medical Tools with Multi-modal Agent | Code | 3
DiarizationLM: Speaker Diarization Post-Processing with Large Language Models | Code | 3
RelBench: A Benchmark for Deep Learning on Relational Databases | Code | 3
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions | Code | 3
Learning Bipedal Walking On Planned Footsteps For Humanoid Robots | Code | 3
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling | Code | 3
ECG-FM: An Open Electrocardiogram Foundation Model | Code | 3
Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation | Code | 3
SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning | Code | 3
SGFormer: Single-Layer Graph Transformers with Approximation-Free Linear Complexity | Code | 3
CAD-Recode: Reverse Engineering CAD Code from Point Clouds | Code | 3
EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge | Code | 3
DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection | Code | 3
FlowDock: Geometric Flow Matching for Generative Protein-Ligand Docking and Affinity Prediction | Code | 3
LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models | Code | 3
ImageFolder: Autoregressive Image Generation with Folded Tokens | Code | 3
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation | Code | 3
Simple linear attention language models balance the recall-throughput tradeoff | Code | 3
MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System | Code | 3
The Tabular Foundation Model TabPFN Outperforms Specialized Time Series Forecasting Models Based on Simple Features | Code | 3
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow | Code | 3
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer | Code | 3
IMDL-BenCo: A Comprehensive Benchmark and Codebase for Image Manipulation Detection & Localization | Code | 3
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact | Code | 3
Multi-agent Architecture Search via Agentic Supernet | Code | 3