SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 10601-10625 of 177340 papers

Title | Status | Hype
MARFT: Multi-Agent Reinforcement Fine-Tuning | Code | 2
DiSA: Diffusion Step Annealing in Autoregressive Image Generation | Code | 2
Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models | Code | 2
TaskCraft: Automated Generation of Agentic Tasks | Code | 2
Audio synthesizer inversion in symmetric parameter spaces with approximately equivariant flow matching | Code | 2
LeanExplore: A search engine for Lean 4 declarations | Code | 2
Improving spliced alignment by modeling splice sites with deep learning | Code | 2
any4: Learned 4-bit Numeric Representation for LLMs | Code | 2
Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task | Code | 2
Session-based Social Recommendation via Dynamic Graph Attention Networks | Code | 2
Bag of Tricks and A Strong Baseline for Deep Person Re-identification | Code | 2
Measuring Coding Challenge Competence With APPS | Code | 2
Learning Semantic Segmentation of Large-Scale Point Clouds with Random Sampling | Code | 2
Learning To Describe Player Form in The MLB | Code | 2
Learning Efficient Online 3D Bin Packing on Packing Configuration Trees | Code | 2
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing | Code | 2
NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images | Code | 2
UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models | Code | 2
ERS: a novel comprehensive endoscopy image dataset for machine learning, compliant with the MST 3.0 specification | Code | 2
Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology | Code | 2
Cedille: A large autoregressive French language model | Code | 2
Iterative Corresponding Geometry: Fusing Region and Depth for Highly Efficient 3D Tracking of Textureless Objects | Code | 2
ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning | Code | 2
scikit-fda: A Python Package for Functional Data Analysis | Code | 2
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation | Code | 2