SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,313 code links · 4,818 tasks

Papers

Showing 1851-1875 of 177339 papers

| Title | Status | Hype |
| --- | --- | --- |
| Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers | Code | 4 |
| Mamba YOLO: A Simple Baseline for Object Detection with State Space Model | Code | 4 |
| Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements | Code | 4 |
| Compressible-composable NeRF via Rank-residual Decomposition | Code | 4 |
| Structured Pruning for Deep Convolutional Neural Networks: A survey | Code | 4 |
| From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge | Code | 4 |
| AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks | Code | 4 |
| Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance | Code | 4 |
| Orb: A Fast, Scalable Neural Network Potential | Code | 4 |
| Spirit LM: Interleaved Spoken and Written Language Model | Code | 4 |
| When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments | Code | 4 |
| SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights | Code | 4 |
| I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench | Code | 4 |
| MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens | Code | 4 |
| Modern Neighborhood Components Analysis: A Deep Tabular Baseline Two Decades Later | Code | 4 |
| DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection | Code | 4 |
| TabM: Advancing Tabular Deep Learning with Parameter-Efficient Ensembling | Code | 4 |
| INT2.1: Towards Fine-Tunable Quantized Large Language Models with Error Correction through Low-Rank Adaptation | Code | 4 |
| SegGPT: Segmenting Everything In Context | Code | 4 |
| TinyLLaVA: A Framework of Small-scale Large Multimodal Models | Code | 4 |
| Building reliable sim driving agents by scaling self-play | Code | 4 |
| Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts | Code | 4 |
| Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN | Code | 4 |
| SkyReels-A2: Compose Anything in Video Diffusion Transformers | Code | 4 |
| Croissant: A Metadata Format for ML-Ready Datasets | Code | 4 |
Page 75 of 7094