The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 8226–8250 of 474278 papers

Title	Date	Tasks	Status	Hype
GraphKAN: Enhancing Feature Extraction with Graph Kolmogorov Arnold Networks	Jun 19, 2024	Kolmogorov-Arnold Networks	Code Available	2
Rethinking Abdominal Organ Segmentation (RAOS) in the clinical scenario: A robustness evaluation benchmark with challenging cases	Jun 19, 2024	8k, Hallucination	Code Available	2
Adaptable Logical Control for Large Language Models	Jun 19, 2024	Math, Text Generation	Code Available	2
WATT: Weight Average Test-Time Adaptation of CLIP	Jun 19, 2024	image-classification, Image Classification	Code Available	2
StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images	Jun 19, 2024	Object Recognition, Scene Understanding	Code Available	2
Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks	Jun 19, 2024	Decoder, Language Modeling	Code Available	2
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words	Jun 19, 2024	Dialogue Understanding	Code Available	2
Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling	Jun 18, 2024	Arithmetic Reasoning, Language Modeling	Code Available	2
VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding	Jun 18, 2024	Image Captioning, Question Answering	Code Available	2
Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment	Jun 18, 2024	Denoising	Code Available	2
Can Go AIs be adversarially robust?	Jun 18, 2024	Diversity	Code Available	2
Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction	Jun 18, 2024	Language Modeling, Language Modelling	Code Available	2
AEM: Attention Entropy Maximization for Multiple Instance Learning based Whole Slide Image Classification	Jun 18, 2024	Diversity, image-classification	Code Available	2
Coding Speech through Vocal Tract Kinematics	Jun 18, 2024	Voice Conversion	Code Available	2
From Instance Training to Instruction Learning: Task Adapters Generation from Instructions	Jun 18, 2024	Knowledge Distillation	Code Available	2
ChangeViT: Unleashing Plain Vision Transformers for Change Detection	Jun 18, 2024	Change Detection	Code Available	2
GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models	Jun 18, 2024	Benchmarking, Depth Estimation	Code Available	2
TroL: Traversal of Layers for Large Language and Vision Models	Jun 18, 2024	Visual Question Answering	Code Available	2
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI	Jun 18, 2024	Benchmarking, scientific discovery	Code Available	2
Automated MRI Quality Assessment of Brain T1-weighted MRI in Clinical Data Warehouses: A Transfer Learning Approach Relying on Artefact Simulation	Jun 18, 2024	Transfer Learning	Code Available	2
Dissecting Adversarial Robustness of Multimodal LM Agents	Jun 18, 2024	Adversarial Robustness, Adversarial Text	Code Available	2
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving	Jun 18, 2024	Arithmetic Reasoning, Math	Code Available	2
Universal Score-based Speech Enhancement with High Content Preservation	Jun 18, 2024	Speech Enhancement	Code Available	2
AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention	Jun 18, 2024	Object, Response Generation	Code Available	2
AgentReview: Exploring Peer Review Dynamics with LLM Agents	Jun 18, 2024	Language Modeling, Language Modelling	Code Available	2