The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7,051–7,100 of 661,570 papers

Title	Date	Tasks	Status	Hype
Improving Causal Reasoning in Large Language Models: A Survey	Oct 22, 2024	Decision Making, Survey	Code Available	2
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction	Oct 22, 2024		Code Available	2
Literature Meets Data: A Synergistic Approach to Hypothesis Generation	Oct 22, 2024	Deception Detection, Decision Making	Code Available	2
PAPILLON: Privacy Preservation from Internet-based and Local Language Model Ensembles	Oct 22, 2024	Language Modeling, Language Modelling	Code Available	2
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories	Oct 22, 2024	Multivariate Time Series Forecasting, Temporal Sequences	Code Available	2
DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model	Oct 22, 2024	Decoder, Instance Segmentation	Code Available	2
Diffusion Transformer Policy	Oct 21, 2024	Denoising, Vision-Language-Action	Code Available	2
Mitigating Object Hallucination via Concentric Causal Attention	Oct 21, 2024	Hallucination, Object	Code Available	2
A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models	Oct 21, 2024	Decision Making, Multi-agent Reinforcement Learning	Code Available	2
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution	Oct 21, 2024	All, model	Code Available	2
Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following	Oct 21, 2024	Benchmarking, Instruction Following	Code Available	2
Integrating Reinforcement Learning with Foundation Models for Autonomous Robotics: Methods and Perspectives	Oct 21, 2024	Reinforcement Learning (RL)	Code Available	2
Pantograph: A Machine-to-Machine Interaction Interface for Advanced Theorem Proving, High Level Reasoning, and Data Extraction in Lean 4	Oct 21, 2024	Automated Theorem Proving	Code Available	2
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style	Oct 21, 2024	Benchmarking, Language Modeling	Code Available	2
3D-GANTex: 3D Face Reconstruction with StyleGAN3-based Multi-View Images and 3DDFA based Mesh Generation	Oct 21, 2024	3D Face Reconstruction, Face Reconstruction	Code Available	2
Analysing the Residual Stream of Language Models Under Knowledge Conflicts	Oct 21, 2024		Code Available	2
Improve Vision Language Model Chain-of-thought Reasoning	Oct 21, 2024	Language Modeling, Language Modelling	Code Available	2
Compute-Constrained Data Selection	Oct 21, 2024		Code Available	2
CamI2V: Camera-Controlled Image-to-Video Diffusion Model	Oct 21, 2024		Code Available	2
LLaVA-KD: A Framework of Distilling Multimodal Large Language Models	Oct 21, 2024		Code Available	2
Reducing Hallucinations in Vision-Language Models via Latent Space Steering	Oct 21, 2024	Hallucination	Code Available	2
Beyond Browsing: API-Based Web Agents	Oct 21, 2024		Code Available	2
TIPS: Text-Image Pretraining with Spatial Awareness	Oct 21, 2024	Depth Estimation, Image Captioning	Code Available	2
RANSAC Back to SOTA: A Two-stage Consensus Filtering for Real-time 3D Registration	Oct 21, 2024	Point Cloud Registration	Code Available	2
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering	Oct 21, 2024	Open-Domain Question Answering, Question Answering	Code Available	2
LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration	Oct 20, 2024	All, Computational Efficiency	Code Available	2
SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation	Oct 19, 2024	AI Agent, Benchmarking	Code Available	2
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning	Oct 19, 2024	Benchmarking, Multi-agent Reinforcement Learning	Code Available	2
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step	Oct 19, 2024	Conditional Image Generation, GPU	Code Available	2
A Multimodal Vision Foundation Model for Clinical Dermatology	Oct 19, 2024	Diagnostic, Lesion Segmentation	Code Available	2
DM-Codec: Distilling Multimodal Representations for Speech Tokenization	Oct 19, 2024	Self-Supervised Learning, Speech Tokenization	Code Available	2
SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning	Oct 19, 2024	Image Generation	Code Available	2
Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion	Oct 19, 2024	image-classification, Image Classification	Code Available	2
Dynamic Factor Allocation Leveraging Regime-Switching Signals	Oct 18, 2024		Code Available	2
Comparing Differentiable and Dynamic Ray Tracing: Introducing the Multipath Lifetime Map	Oct 18, 2024		Code Available	2
Combining Hough Transform and Deep Learning Approaches to Reconstruct ECG Signals From Printouts	Oct 18, 2024	ECG Digitization	Code Available	2
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning	Oct 18, 2024		Code Available	2
HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation	Oct 18, 2024	Disentanglement, Image Generation	Code Available	2
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities	Oct 18, 2024	Conditional Image Generation, Image Generation	Code Available	2
AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios	Oct 18, 2024	Anomaly Classification, Anomaly Detection	Code Available	2
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning	Oct 18, 2024	Language Modeling, Language Modelling	Code Available	2
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference	Oct 18, 2024	Language Modeling, Language Modelling	Code Available	2
CybORG++: An Enhanced Gym for the Development of Autonomous Cyber Agents	Oct 18, 2024		Code Available	2
How to Evaluate Reward Models for RLHF	Oct 18, 2024		Code Available	2
REEF: Representation Encoding Fingerprints for Large Language Models	Oct 18, 2024		Code Available	2
Artificial Kuramoto Oscillatory Neurons	Oct 17, 2024	Adversarial Robustness, Object Discovery	Code Available	2
On the Role of Attention Heads in Large Language Model Safety	Oct 17, 2024	Attribute, Language Modeling	Code Available	2
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction	Oct 17, 2024	Quantization	Code Available	2
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs	Oct 17, 2024		Code Available	2
UniDrive: Towards Universal Driving Perception Across Camera Configurations	Oct 17, 2024	Autonomous Driving	Code Available	2