The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers


Showing 6,276–6,300 of 474,278 papers

Title	Date	Tasks	Status	Hype
Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge	Jan 23, 2025	Scheduling, Streaming video understanding	Code Available	2
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting	Jan 23, 2025	Language Modeling, Language Modelling	Code Available	2
NUDT4MSTAR: A Large Dataset and Benchmark Towards Remote Sensing Object Recognition in the Wild	Jan 23, 2025	Earth Observation, Object Recognition	Code Available	2
Parameter-Efficient Fine-Tuning for Foundation Models	Jan 23, 2025	parameter-efficient fine-tuning, Survey	Code Available	2
YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID	Jan 23, 2025	Multi-Object Tracking, object-detection	Code Available	2
GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing	Jan 23, 2025	4k	Code Available	2
An Efficient Sparse Kernel Generator for O(3)-Equivariant Deep Networks	Jan 23, 2025	GPU	Code Available	2
Tensor-Var: Variational Data Assimilation in Tensor Product Feature Space	Jan 23, 2025		Code Available	2
GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting	Jan 23, 2025	3DGS, NeRF	Code Available	2
Querying Databases with Function Calling	Jan 23, 2025		Code Available	2
PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection	Jan 23, 2025	object-detection, Object Detection	Code Available	2
TimeFilter: Patch-Specific Spatial-Temporal Graph Filtration for Time Series Forecasting	Jan 22, 2025	Clustering, Time Series	Code Available	2
Distillation Quantification for Large Language Models	Jan 22, 2025		Code Available	2
Towards Robust Multi-tab Website Fingerprinting	Jan 22, 2025	Multi-Label Classification, MUlTI-LABEL-ClASSIFICATION	Code Available	2
GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting	Jan 22, 2025	Autonomous Driving, NeRF	Code Available	2
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback	Jan 22, 2025	Instruction Following	Code Available	2
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning	Jan 22, 2025	Mathematical Reasoning	Code Available	2
A Survey on Multimodal Recommender Systems: Recent Advances and Future Directions	Jan 22, 2025	Recommendation Systems	Code Available	2
Supervised Learning for Analog and RF Circuit Design: Benchmarks and Comparative Insights	Jan 21, 2025		Code Available	2
MedS^3: Towards Medical Small Language Models with Self-Evolved Slow Thinking	Jan 21, 2025	Multiple-choice	Code Available	2
Automating High Quality RT Planning at Scale	Jan 21, 2025		Code Available	2
Episodic Memories Generation and Evaluation Benchmark for Large Language Models	Jan 21, 2025		Code Available	2
EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents	Jan 21, 2025	Attribute, Question Answering	Code Available	2
Exploring Temporally-Aware Features for Point Tracking	Jan 21, 2025	Point Tracking, Video Editing	Code Available	2
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding	Jan 21, 2025	Video Understanding	Code Available	2