SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7926–7950 of 474278 papers

Title | Status | Hype
The End of Manual Decoding: Towards Truly End-to-End Language Models | | 0
DP-FedPGN: Finding Global Flat Minima for Differentially Private Federated Learning via Penalizing Gradient Norm | Code | 0
Eliciting Secret Knowledge from Language Models | | 0
IGGT: Instance-Grounded Geometry Transformer for Semantic 3D Reconstruction | | 0
E2Rank: Your Text Embedding can Also be an Effective and Efficient Listwise Reranker | | 0
Object-IR: Leveraging Object Consistency and Mesh Deformation for Self-Supervised Image Retargeting | Code | 0
HyperClick: Advancing Reliable GUI Grounding via Uncertainty Calibration | | 0
Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals | | 0
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences | | 0
ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use | | 0
Sparse Model Inversion: Efficient Inversion of Vision Transformers for Data-Free Applications | Code | 0
Multilingual Political Views of Large Language Models: Identification and Steering | Code | 0
Adaptive Stochastic Coefficients for Accelerating Diffusion Sampling | Code | 0
Ready to Translate, Not to Represent? Bias and Performance Gaps in Multilingual LLMs Across Language Families and Domains | Code | 0
Dynamic Gaussian Splatting from Defocused and Motion-blurred Monocular Videos | Code | 0
DiagramEval: Evaluating LLM-Generated Diagrams via Graphs | Code | 0
Aeolus: A Multi-structural Flight Delay Dataset | Code | 0
MLPerf Automotive | Code | 0
ZEBRA: Towards Zero-Shot Cross-Subject Generalization for Universal Brain Visual Decoding | Code | 0
E-MMDiT: Revisiting Multimodal Diffusion Transformer Design for Fast Image Synthesis under Limited Resources | Code | 0
MemeArena: Automating Context-Aware Unbiased Evaluation of Harmfulness Understanding for Multimodal Large Language Models | Code | 0
Fints: Efficient Inference-Time Personalization for LLMs with Fine-Grained Instance-Tailored Steering | Code | 0
DRAMA: Unifying Data Retrieval and Analysis for Open-Domain Analytic Queries | Code | 0
Trans-defense: Transformer-based Denoiser for Adversarial Defense with Spatial-Frequency Domain Representation | Code | 0
A Technical Exploration of Causal Inference with Hybrid LLM Synthetic Data | Code | 0