SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 11001-11025 of 177340 papers

Title | Status | Hype
A Comprehensive Study of Jailbreak Attack versus Defense for Large Language Models | Code | 2
Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation | Code | 2
An Intelligent Agentic System for Complex Image Restoration Problems | Code | 2
Multivariate Probabilistic Regression with Natural Gradient Boosting | Code | 2
Brain-Computer-Interface controlled robot via RaspberryPi and PiEEG | Code | 2
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models | Code | 2
Learning to Generalize Provably in Learning to Optimize | Code | 2
Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios | Code | 2
AmpleGCG-Plus: A Strong Generative Model of Adversarial Suffixes to Jailbreak LLMs with Higher Success Rates in Fewer Attempts | Code | 2
DoTAT: A Domain-oriented Text Annotation Tool | Code | 2
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models | Code | 2
How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning | Code | 2
A Replication Study of Dense Passage Retriever | Code | 2
A simple way to make neural networks robust against diverse image corruptions | Code | 2
Drive Like a Human: Rethinking Autonomous Driving with Large Language Models | Code | 2
Automatic Depression Detection: An Emotional Audio-Textual Corpus and a GRU/BiLSTM-based Model | Code | 2
XGen-7B Technical Report | Code | 2
AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs | Code | 2
InPars Toolkit: A Unified and Reproducible Synthetic Data Generation Pipeline for Neural Information Retrieval | Code | 2
Advancing Real-time Pandemic Forecasting Using Large Language Models: A COVID-19 Case Study | Code | 2
Image Super-Resolution using Efficient Striped Window Transformer | Code | 2
Eureka: Evaluating and Understanding Large Foundation Models | Code | 2
CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models | Code | 2
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning | Code | 2
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation | Code | 2
Page 441 of 7094