SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 10001-10050 of 661,570 papers

Title | Status | Hype
Verif.ai: Towards an Open-Source Scientific Generative Question-Answering System with Referenced and Verifiable Answers | Code | 2
Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following | Code | 2
On the Efficacy of Eviction Policy for Key-Value Constrained Generative Language Model Inference | Code | 2
Debating with More Persuasive LLMs Leads to More Truthful Answers | Code | 2
CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes | Code | 2
Dirichlet Flow Matching with Applications to DNA Sequence Design | Code | 2
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning | Code | 2
Time Series Diffusion in the Frequency Domain | Code | 2
PLAPT: Protein-Ligand Binding Affinity Prediction Using Pretrained Transformers | Code | 2
Learning to Route Among Specialized Experts for Zero-Shot Generalization | Code | 2
Let Your Graph Do the Talking: Encoding Structured Data for LLMs | Code | 2
Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation | Code | 2
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion | Code | 2
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention | Code | 2
Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers | Code | 2
Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models | Code | 2
JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs | Code | 2
Scalable Diffusion Models with State Space Backbone | Code | 2
DiffSpeaker: Speech-Driven 3D Facial Animation with Diffusion Transformer | Code | 2
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis | Code | 2
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data | Code | 2
Can Large Language Model Agents Simulate Human Trust Behavior? | Code | 2
InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior | Code | 2
Edu-ConvoKit: An Open-Source Library for Education Conversation Data | Code | 2
BEBLID: Boosted efficient binary local image descriptor | Code | 2
ConvLoRA and AdaBN based Domain Adaptation via Self-Training | Code | 2
Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers | Code | 2
A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents | Code | 2
Data-efficient Large Vision Models through Sequential Autoregression | Code | 2
SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models | Code | 2
A Survey on Domain Generalization for Medical Image Analysis | Code | 2
Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints | Code | 2
FM-Fusion: Instance-aware Semantic Mapping Boosted by Vision-Language Foundation Models | Code | 2
Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning | Code | 2
Closing the Gap Between SGP4 and High-Precision Propagation via Differentiable Programming | Code | 2
Pedagogical Alignment of Large Language Models | Code | 2
Blue noise for diffusion models | Code | 2
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space | Code | 2
Universal Neural Functionals | Code | 2
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark | Code | 2
ScreenAI: A Vision-Language Model for UI and Infographics Understanding | Code | 2
Hydra: Sequentially-Dependent Draft Heads for Medusa Decoding | Code | 2
MolTC: Towards Molecular Relational Modeling In Language Models | Code | 2
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback | Code | 2
AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies | Code | 2
Fine-Tuned Language Models Generate Stable Inorganic Materials as Text | Code | 2
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning | Code | 2
YOLOPoint Joint Keypoint and Object Detection | Code | 2
DySLIM: Dynamics Stable Learning by Invariant Measure for Chaotic Systems | Code | 2
Learning a Decision Tree Algorithm with Transformers | Code | 2