SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 2026 to 2050 of 661,570 papers

Title | Status | Hype
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs | Code | 4
Don't Ignore Dual Logic Ability of LLMs while Privatizing: A Data-Intensive Analysis in Medical Domain | Code | 4
Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese | Code | 4
Cognitive Architectures for Language Agents | Code | 4
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior | Code | 4
Prompt2Model: Generating Deployable Models from Natural Language Instructions | Code | 4
A Survey on Large Language Model based Autonomous Agents | Code | 4
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors | Code | 4
GPFL: Simultaneously Learning Global and Personalized Feature Information for Personalized Federated Learning | Code | 4
ChatHaruhi: Reviving Anime Character in Reality via Large Language Model | Code | 4
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis | Code | 4
Graph of Thoughts: Solving Elaborate Problems with Large Language Models | Code | 4
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Code | 4
OpenProteinSet: Training data for structural biology at scale | Code | 4
"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models | Code | 4
AgentBench: Evaluating LLMs as Agents | Code | 4
TOPIQ: A Top-down Approach from Semantics to Distortions for Image Quality Assessment | Code | 4
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models | Code | 4
From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion | Code | 4
LISA: Reasoning Segmentation via Large Language Model | Code | 4
Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models | Code | 4
Effective Whole-body Pose Estimation with Two-stages Distillation | Code | 4
Universal and Transferable Adversarial Attacks on Aligned Language Models | Code | 4
Guaranteed Approximation Bounds for Mixed-Precision Neural Operators | Code | 4
Turning Whisper into Real-Time Transcription System | Code | 4