The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers


Showing 20101–20150 of 474278 papers

Title	Date	Tasks	Status	Hype
AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios	Oct 25, 2024	Benchmarking, Diversity	Code Available	1
Context-Based Visual-Language Place Recognition	Oct 25, 2024	Semantic Segmentation, Visual Place Recognition	Code Available	1
Unified Cross-Modal Image Synthesis with Hierarchical Mixture of Product-of-Experts	Oct 25, 2024	Image Generation	Code Available	1
Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression	Oct 25, 2024	Offline RL, Reinforcement Learning (RL)	Code Available	1
FastPCI: Motion-Structure Guided Fast Point Cloud Frame Interpolation	Oct 25, 2024	Scene Flow Estimation	Code Available	1
Enhancing Battery Storage Energy Arbitrage with Deep Reinforcement Learning and Time-Series Forecasting	Oct 25, 2024	Deep Reinforcement Learning, Time Series	Code Available	1
Flow Generator Matching	Oct 25, 2024		Code Available	1
Beyond Point Annotation: A Weakly Supervised Network Guided by Multi-Level Labels Generated from Four-Point Annotation for Thyroid Nodule Segmentation in Ultrasound Image	Oct 25, 2024	Segmentation	Code Available	1
Applying sparse autoencoders to unlearn knowledge in language models	Oct 25, 2024		Code Available	1
Monge-Ampere Regularization for Learning Arbitrary Shapes from Point Clouds	Oct 24, 2024		Code Available	1
C^2: Scalable Auto-Feedback for LLM-based Chart Generation	Oct 24, 2024	8k, Diversity	Code Available	1
Classifier Clustering and Feature Alignment for Federated Learning under Distributed Concept Drift	Oct 24, 2024	Clustering, Federated Learning	Code Available	1
Wavetable Synthesis Using CVAE for Timbre Control Based on Semantic Label	Oct 24, 2024		Code Available	1
KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing	Oct 24, 2024	GPU	Code Available	1
Demystifying Large Language Models for Medicine: A Primer	Oct 24, 2024	Fairness, Prompt Engineering	Code Available	1
Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems	Oct 24, 2024	Mathematical Reasoning	Code Available	1
Prototypical Hash Encoding for On-the-Fly Fine-Grained Category Discovery	Oct 24, 2024	Sensitivity	Code Available	1
DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations	Oct 24, 2024	Instruction Following, Question Answering	Code Available	1
Scale Propagation Network for Generalizable Depth Completion	Oct 24, 2024	Depth Completion	Code Available	1
Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction	Oct 24, 2024	Representation Learning, Sentence	Code Available	1
Delving into the Reversal Curse: How Far Can Large Language Models Generalize?	Oct 24, 2024	Multiple-choice	Code Available	1
Thermal Chameleon: Task-Adaptive Tone-mapping for Radiometric Thermal-Infrared images	Oct 24, 2024	Depth Estimation, Image Rescaling	Code Available	1
You Only Look Around: Learning Illumination Invariant Feature for Low-light Object Detection	Oct 24, 2024	Object, object-detection	Code Available	1
CAMEL-Bench: A Comprehensive Arabic LMM Benchmark	Oct 24, 2024	document understanding, Video Understanding	Code Available	1
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback	Oct 24, 2024		Code Available	1
TEAM: Topological Evolution-aware Framework for Traffic Forecasting--Extended Version	Oct 24, 2024	Continual Learning, Time Series	Code Available	1
ODDN: Addressing Unpaired Data Challenges in Open-World Deepfake Detection on Online Social Networks	Oct 24, 2024	DeepFake Detection, Face Swapping	Code Available	1
Large Language Models for Financial Aid in Financial Time-series Forecasting	Oct 24, 2024	Time Series, Time Series Forecasting	Code Available	1
VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks	Oct 24, 2024	Video Understanding	Code Available	1
How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs	Oct 24, 2024	2k, Machine Translation	Code Available	1
Optimizing Edge Offloading Decisions for Object Detection	Oct 24, 2024	Object, object-detection	Code Available	1
BIFRÖST: 3D-Aware Image compositing with Language Instructions	Oct 24, 2024	counterfactual, Image Harmonization	Code Available	1
Infogent: An Agent-Based Framework for Web Information Aggregation	Oct 24, 2024	Navigate	Code Available	1
End-to-end Training for Recommendation with Language-based User Profiles	Oct 24, 2024	Recommendation Systems	Code Available	1
VECTOR: Velocity-Enhanced GRU Neural Network for Real-Time 3D UAV Trajectory Prediction	Oct 24, 2024	Position, Prediction	Code Available	1
LOGO -- Long cOntext aliGnment via efficient preference Optimization	Oct 24, 2024	GPU, Language Modeling	Code Available	1
WAFFLE: Finetuning Multi-Modal Model for Automated Front-End Development	Oct 24, 2024	Code Generation, SSIM	Code Available	1
From Imitation to Introspection: Probing Self-Consciousness in Language Models	Oct 24, 2024		Code Available	1
FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling	Oct 24, 2024	Image Generation	Code Available	1
STTATTS: Unified Speech-To-Text And Text-To-Speech Model	Oct 24, 2024	Multi-Task Learning, speech-recognition	Code Available	1
Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design	Oct 24, 2024	Mixture-of-Experts, MMLU	Code Available	1
GCoder: Improving Large Language Model for Generalized Graph Problem Solving	Oct 24, 2024	Language Modeling, Language Modelling	Code Available	1
Large Language Models Reflect the Ideology of their Creators	Oct 24, 2024	Question Answering, Text Summarization	Code Available	1
Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective	Oct 23, 2024	graph construction, Knowledge Graphs	Code Available	1
Physics-informed Neural Networks for Functional Differential Equations: Cylindrical Approximation and Its Convergence Guarantees	Oct 23, 2024		Code Available	1
WAGLE: Strategic Weight Attribution for Effective and Modular Unlearning in Large Language Models	Oct 23, 2024		Code Available	1
Federated Transformer: Multi-Party Vertical Federated Learning on Practical Fuzzily Linked Data	Oct 23, 2024	Entity Alignment, Federated Learning	Code Available	1
Cross-model Control: Improving Multiple Large Language Models in One-time Training	Oct 23, 2024	Instruction Following, Language Modeling	Code Available	1
DisenGCD: A Meta Multigraph-assisted Disentangled Graph Learning Framework for Cognitive Diagnosis	Oct 23, 2024	cognitive diagnosis, Diagnostic	Code Available	1
Att2CPC: Attention-Guided Lossy Attribute Compression of Point Clouds	Oct 23, 2024	Attribute	Code Available	1