SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 10951-10975 of 177340 papers

Title | Status | Hype
A Prompt-Based Knowledge Graph Foundation Model for Universal In-Context Reasoning | Code | 2
Hacking Back the AI-Hacker: Prompt Injection as a Defense Against LLM-driven Cyberattacks | Code | 2
GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding | Code | 2
What Limits LLM-based Human Simulation: LLMs or Our Design? | Code | 2
Zero-Shot Vision Encoder Grafting via LLM Surrogates | Code | 2
OpenGlue: Open Source Graph Neural Net Based Pipeline for Image Matching | Code | 2
Omni-Kernel Network for Image Restoration | Code | 2
FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation | Code | 2
StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction | Code | 2
Trends, Applications, and Challenges in Human Attention Modelling | Code | 2
MixFormerV2: Efficient Fully Transformer Tracking | Code | 2
PartSTAD: 2D-to-3D Part Segmentation Task Adaptation | Code | 2
Tri^2-plane: Thinking Head Avatar via Feature Pyramid | Code | 2
nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model | Code | 2
LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval and Distillation | Code | 2
Interpretable Pre-Trained Transformers for Heart Time-Series Data | Code | 2
Multi-Class Road User Detection With 3+1D Radar in the View-of-Delft Dataset | Code | 2
RigNet: Neural Rigging for Articulated Characters | Code | 2
Building Cooperative Embodied Agents Modularly with Large Language Models | Code | 2
Pretrained Transformers for Text Ranking: BERT and Beyond | Code | 2
Global Convergence and Generalization Bound of Gradient-Based Meta-Learning with Deep Neural Nets | Code | 2
MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | Code | 2
Balanced MSE for Imbalanced Visual Regression | Code | 2
A Review of Safe Reinforcement Learning: Methods, Theory and Applications | Code | 2
A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks | Code | 2
Page 439 of 7094