The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers


Showing 18151–18200 of 474278 papers

Title	Date	Tasks	Status	Hype
Fantastic Targets for Concept Erasure in Diffusion Models and Where To Find Them	Jan 31, 2025		Code Available	1
Simulation Streams: A Programming Paradigm for Controlling Large Language Models and Building Complex Systems with Generative AI	Jan 30, 2025		Code Available	1
Large Language Models for Cryptocurrency Transaction Analysis: A Bitcoin Case Study	Jan 30, 2025		Code Available	1
A Benchmark and Evaluation for Real-World Out-of-Distribution Detection Using Vision-Language Models	Jan 30, 2025	Out-of-Distribution Detection, Out of Distribution (OOD) Detection	Code Available	1
A Cartesian Encoding Graph Neural Network for Crystal Structures Property Prediction: Application to Thermal Ellipsoid Estimation	Jan 30, 2025	ADP Prediction, Band Gap	Code Available	1
Beyond Message Passing: Neural Graph Pattern Machine	Jan 30, 2025	Graph Classification, Graph Learning	Code Available	1
Distillation-Driven Diffusion Model for Multi-Scale MRI Super-Resolution: Make 1.5T MRI Great Again	Jan 30, 2025	Super-Resolution	Code Available	1
HSRMamba: Contextual Spatial-Spectral State Space Model for Single Image Hyperspectral Super-Resolution	Jan 30, 2025	Hyperspectral Image Super-Resolution, Image Super-Resolution	Code Available	1
Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation	Jan 30, 2025	Safety Alignment	Code Available	1
o3-mini vs DeepSeek-R1: Which One is Safer?	Jan 30, 2025	Code Generation, Program Repair	Code Available	1
Efficient Neural Theorem Proving via Fine-grained Proof Structure Analysis	Jan 30, 2025	Automated Theorem Proving, Math	Code Available	1
How to Select Datapoints for Efficient Human Evaluation of NLG Models?	Jan 30, 2025	HumanEval, Machine Translation	Code Available	1
RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects	Jan 30, 2025	counterfactual, RAG	Code Available	1
Accuracy and Robustness of Weight-Balancing Methods for Training PINNs	Jan 30, 2025		Code Available	1
MatIR: A Hybrid Mamba-Transformer Image Restoration Model	Jan 30, 2025	Computational Efficiency, Image Inpainting	Code Available	1
Wearanize+: A Multimodal Dataset for Evaluating Wearable Technologies in Sleep Research	Jan 30, 2025	Classification, EEG	Code Available	1
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training	Jan 30, 2025	Language Modeling, Language Modelling	Code Available	1
Towards Making Flowchart Images Machine Interpretable	Jan 29, 2025	Code Generation, Optical Character Recognition (OCR)	Code Available	1
2SSP: A Two-Stage Framework for Structured Pruning of LLMs	Jan 29, 2025	Language Modeling, Language Modelling	Code Available	1
Yin-Yang: Developing Motifs With Long-Term Structure And Controllability	Jan 29, 2025		Code Available	1
TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection	Jan 29, 2025	Autonomous Driving, object-detection	Code Available	1
Improving Your Model Ranking on Chatbot Arena by Vote Rigging	Jan 29, 2025	Chatbot	Code Available	1
Image, Text, and Speech Data Augmentation using Multimodal LLMs for Deep Learning: A Survey	Jan 29, 2025	Data Augmentation, Image Augmentation	Code Available	1
Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning	Jan 29, 2025	continuous-control, Continuous Control	Code Available	1
ContourFormer: Real-Time Contour-Based End-to-End Instance Segmentation Transformer	Jan 29, 2025	Instance Segmentation, Segmentation	Code Available	1
acoupi: An Open-Source Python Framework for Deploying Bioacoustic AI Models on Edge Devices	Jan 29, 2025		Code Available	1
Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation	Jan 29, 2025	Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation	Code Available	1
RadioLLM: Introducing Large Language Model into Cognitive Radio via Hybrid Prompt and Token Reprogrammings	Jan 28, 2025	Denoising, Domain Generalization	Code Available	1
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns	Jan 28, 2025	Adversarial Attack, Benchmarking	Code Available	1
Polyp-Gen: Realistic and Diverse Polyp Image Generation for Endoscopic Dataset Expansion	Jan 28, 2025	Diagnostic, Image Generation	Code Available	1
Can Transformers Learn Full Bayesian Inference in Context?	Jan 28, 2025	Bayesian Inference, In-Context Learning	Code Available	1
xJailbreak: Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking	Jan 28, 2025	Reinforcement Learning (RL), Safety Alignment	Code Available	1
Bayesian Analyses of Structural Vector Autoregressions with Sign, Zero, and Narrative Restrictions Using the R Package bsvarSIGNs	Jan 28, 2025		Code Available	1
RG-Attn: Radian Glue Attention for Multi-modality Multi-agent Cooperative Perception	Jan 28, 2025		Code Available	1
SliceOcc: Indoor 3D Semantic Occupancy Prediction with Vertical Slice Representation	Jan 28, 2025	3D Semantic Occupancy Prediction, Autonomous Driving	Code Available	1
Dream to Drive with Predictive Individual World Model	Jan 28, 2025	Autonomous Driving, model	Code Available	1
Growing the Efficient Frontier on Panel Trees	Jan 28, 2025		Code Available	1
Ultra-high resolution multimodal MRI densely labelled holistic structural brain atlas	Jan 28, 2025	Anatomy	Code Available	1
FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data	Jan 28, 2025	Natural Language Inference, Synthetic Data Generation	Code Available	1
CascadeV: An Implementation of Wurstchen Architecture for Video Generation	Jan 28, 2025	2k, Video Generation	Code Available	1
VeriFact: Verifying Facts in LLM-Generated Clinical Text with Electronic Health Records	Jan 28, 2025	Retrieval-augmented Generation	Code Available	1
SWIFT: Mapping Sub-series with Wavelet Decomposition Improves Time Series Forecasting	Jan 27, 2025	Edge-computing, Time Series	Code Available	1
Multi-Objective Reinforcement Learning for Power Grid Topology Control	Jan 27, 2025	Multi-Objective Reinforcement Learning, reinforcement-learning	Code Available	1
Membership Inference Attacks Against Vision-Language Models	Jan 27, 2025	Inference Attack, Membership Inference Attack	Code Available	1
Return of the Encoder: Maximizing Parameter Efficiency for SLMs	Jan 27, 2025	Computational Efficiency, CPU	Code Available	1
SPECIAL: Zero-shot Hyperspectral Image Classification With CLIP	Jan 27, 2025	Classification, Hyperspectral Image Classification	Code Available	1
Harnessing Diverse Perspectives: A Multi-Agent Framework for Enhanced Error Detection in Knowledge Graphs	Jan 27, 2025	Decision Making, Knowledge Graphs	Code Available	1
Atla Selene Mini: A General Purpose Evaluation Model	Jan 27, 2025	Language Modeling, Language Modelling	Code Available	1
CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation	Jan 27, 2025	Brain Tumor Segmentation, Data Augmentation	Code Available	1
SeqSeg: Learning Local Segments for Automatic Vascular Model Construction	Jan 27, 2025	Image Segmentation, Medical Image Segmentation	Code Available	1