The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9,951–10,000 of 661,570 papers

Title	Date	Tasks	Status	Hype
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding	Feb 14, 2024	Chatbot, Code Generation	Code Available	2
YOLOv8-AM: YOLOv8 Based on Effective Attention Mechanisms for Pediatric Wrist Fracture Detection	Feb 14, 2024	Fracture detection, medical image detection	Code Available	2
Open-Vocabulary Segmentation with Unpaired Mask-Text Supervision	Feb 14, 2024	Language Modelling, Segmentation	Code Available	2
Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio	Feb 14, 2024	Audio Classification, Decoder	Code Available	2
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey	Feb 14, 2024	Survey	Code Available	2
Generalized Portrait Quality Assessment	Feb 14, 2024	Face Image Quality Assessment	Code Available	2
Extreme Video Compression with Pre-trained Diffusion Models	Feb 14, 2024	Decoder, Image Compression	Code Available	2
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents	Feb 14, 2024	Language Modeling, Language Modelling	Code Available	2
Instruction Tuning for Secure Code Generation	Feb 14, 2024	Code Generation	Code Available	2
LLM-Enhanced User-Item Interactions: Leveraging Edge Information for Optimized Recommendations	Feb 14, 2024		Code Available	2
BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image Segmentation	Feb 13, 2024	Image Segmentation, Medical Image Segmentation	Code Available	2
Learning to Produce Semi-dense Correspondences for Visual Localization	Feb 13, 2024	Camera Pose Estimation, Pose Estimation	Code Available	2
DNABERT-S: Pioneering Species Differentiation with Species-Aware DNA Embeddings	Feb 13, 2024	Contrastive Learning	Code Available	2
Learning Emergent Gaits with Decentralized Phase Oscillators: on the role of Observations, Rewards, and Feedback	Feb 13, 2024	Video Synopsis	Code Available	2
RBF-PINN: Non-Fourier Positional Embedding in Physics-Informed Neural Networks	Feb 13, 2024		Code Available	2
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity	Feb 13, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
Can LLMs Learn New Concepts Incrementally without Forgetting?	Feb 13, 2024	In-Context Learning, Incremental Learning	Code Available	2
Test-Time Backdoor Attacks on Multimodal Large Language Models	Feb 13, 2024	Backdoor Attack	Code Available	2
Learning Continuous 3D Words for Text-to-Image Generation	Feb 13, 2024	Image Generation, Text to Image Generation	Code Available	2
Transductive Active Learning: Theory and Applications	Feb 13, 2024	Active Learning, Bayesian Optimization	Code Available	2
LLaGA: Large Language and Graph Assistant	Feb 13, 2024		Code Available	2
Translating Images to Road Network: A Sequence-to-Sequence Perspective	Feb 13, 2024		Code Available	2
A Survey of Generative AI for de novo Drug Design: New Frontiers in Molecule and Protein Generation	Feb 13, 2024	Drug Design	Code Available	2
InstructGraph: Boosting Large Language Models via Graph-centric Instruction Tuning and Preference Alignment	Feb 13, 2024	Hallucination	Code Available	2
THE COLOSSEUM: A Benchmark for Evaluating Generalization for Robotic Manipulation	Feb 13, 2024	Robot Manipulation Generalization	Code Available	2
LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents	Feb 13, 2024	Benchmarking, Model Selection	Code Available	2
ChatCell: Facilitating Single-Cell Analysis with Natural Language	Feb 13, 2024		Code Available	2
Higher Layers Need More LoRA Experts	Feb 13, 2024	Mixture-of-Experts	Code Available	2
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast	Feb 13, 2024	Language Modelling, Large Language Model	Code Available	2
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability	Feb 13, 2024	Text Generation	Code Available	2
eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data	Feb 13, 2024	Domain Generalization	Code Available	2
One Train for Two Tasks: An Encrypted Traffic Classification Framework Using Supervised Contrastive Learning	Feb 12, 2024	Classification, Contrastive Learning	Code Available	2
Mercury: A Code Efficiency Benchmark for Code Large Language Models	Feb 12, 2024	Code Generation, Computational Efficiency	Code Available	2
Fairness Evaluation for Uplift Modeling in the Absence of Ground Truth	Feb 12, 2024	counterfactual, Decision Making	Code Available	2
Do Membership Inference Attacks Work on Large Language Models?	Feb 12, 2024	Membership Inference Attack	Code Available	2
CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge	Feb 12, 2024	General Knowledge, Multiple-choice	Code Available	2
Customizable Perturbation Synthesis for Robust SLAM Benchmarking	Feb 12, 2024	Benchmarking, Simultaneous Localization and Mapping	Code Available	2
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension	Feb 12, 2024	2k, Automatic Speech Recognition	Code Available	2
Cartesian atomic cluster expansion for machine learning interatomic potentials	Feb 12, 2024		Code Available	2
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts	Feb 12, 2024	Continual Pretraining, GSM8K	Code Available	2
GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks	Feb 11, 2024	Graph Question Answering, Instruction Following	Code Available	2
KVQ: Kwai Video Quality Assessment for Short-form Videos	Feb 11, 2024	Form, Video Quality Assessment	Code Available	2
ITINERA: Integrating Spatial Optimization with Large Language Models for Open-domain Urban Itinerary Planning	Feb 11, 2024	LLM real-life tasks, Open-Domain Question Answering	Code Available	2
Feature Mapping in Physics-Informed Neural Networks (PINNs)	Feb 10, 2024	10-shot image generation	Code Available	2
A Change Detection Reality Check	Feb 10, 2024	Change Detection	Code Available	2
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators	Feb 10, 2024	Machine Translation, Speech-to-Speech Translation	Code Available	2
UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction	Feb 10, 2024	graph construction, Knowledge Graph Completion	Code Available	2
Neural SPH: Improved Neural Modeling of Lagrangian Fluid Dynamics	Feb 9, 2024		Code Available	2
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning	Feb 9, 2024	Active Learning, Video Classification	Code Available	2
Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following	Feb 9, 2024	Autonomous Driving, Denoising	Code Available	2