The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers


Showing 11,451–11,500 of 661,570 papers

Title	Date	Tasks	Status	Hype
WizMap: Scalable Interactive Visualization for Exploring Large Machine Learning Embeddings	Jun 15, 2023	Navigate	Code Available	2
LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models	Jun 15, 2023	Hallucination, Image Captioning	Code Available	2
QuadSwarm: A Modular Multi-Quadrotor Simulator for Deep Reinforcement Learning with Direct Thrust Control	Jun 15, 2023	CPU, Deep Reinforcement Learning	Code Available	2
CMMLU: Measuring massive multitask language understanding in Chinese	Jun 15, 2023	Large Language Model	Code Available	2
PINNacle: A Comprehensive Benchmark of Physics-Informed Neural Networks for Solving PDEs	Jun 15, 2023	Benchmarking	Code Available	2
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis	Jun 15, 2023	Image Generation, Preference Mapping	Code Available	2
Datasets and Benchmarks for Offline Safe Reinforcement Learning	Jun 15, 2023	Autonomous Driving, Benchmarking	Code Available	2
SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving	Jun 15, 2023	3D Semantic Scene Completion, 3D Semantic Scene Completion from a single 2D image	Code Available	2
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models	Jun 15, 2023	Representation Learning, Transfer Learning	Code Available	2
2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection	Jun 15, 2023	Anomaly Detection, Anomaly Localization	Code Available	2
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data	Jun 15, 2023		Code Available	2
Fast Training of Diffusion Models with Masked Transformers	Jun 15, 2023	Decoder, Denoising	Code Available	2
LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting	Jun 14, 2023	Traffic Prediction	Code Available	2
TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting	Jun 14, 2023	Multivariate Time Series Forecasting, Representation Learning	Code Available	2
TryOnDiffusion: A Tale of Two UNets	Jun 14, 2023	Virtual Try-on	Code Available	2
NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification	Jun 14, 2023	Graph structure learning, image-classification	Code Available	2
MiniLLM: Knowledge Distillation of Large Language Models	Jun 14, 2023	Instruction Following, Knowledge Distillation	Code Available	2
Hidden Biases of End-to-End Driving Models	Jun 13, 2023	Autonomous Driving, Bench2Drive	Code Available	2
Parting with Misconceptions about Learning-based Vehicle Motion Planning	Jun 13, 2023	Misconceptions, Motion Planning	Code Available	2
One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning	Jun 13, 2023	All, Domain Generalization	Code Available	2
XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models	Jun 13, 2023	Language Modeling, Language Modelling	Code Available	2
Efficient 3D Semantic Segmentation with Superpoint Transformer	Jun 13, 2023	3D Semantic Segmentation, GPU	Code Available	2
Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models	Jun 13, 2023	Catalytic activity prediction, Chemical-Disease Interaction Extraction	Code Available	2
Controlling Text-to-Image Diffusion by Orthogonal Finetuning	Jun 12, 2023		Code Available	2
Scalable 3D Captioning with Pretrained Models	Jun 12, 2023	Descriptive, Image Captioning	Code Available	2
Valley: Video Assistant with Large Language model Enhanced abilitY	Jun 12, 2023	Action Recognition, Instruction Following	Code Available	2
The Devil is in the Details: On the Pitfalls of Event Extraction Evaluation	Jun 12, 2023	Event Argument Extraction, Event Detection	Code Available	2
Unlocking Feature Visualization for Deeper Networks with MAgnitude Constrained Optimization	Jun 11, 2023		Code Available	2
Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception	Jun 10, 2023	3D Object Detection, Benchmarking	Code Available	2
TensorNet: Cartesian Tensor Representations for Efficient Learning of Molecular Potentials	Jun 10, 2023	Formation Energy	Code Available	2
Mind2Web: Towards a Generalist Agent for the Web	Jun 9, 2023		Code Available	2
DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds	Jun 9, 2023	3D Multi-Object Tracking, 3D Object Detection	Code Available	2
SegViTv2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers	Jun 9, 2023	Continual Learning, Continual Semantic Segmentation	Code Available	2
FasterViT: Fast Vision Transformers with Hierarchical Attention	Jun 9, 2023	Image Classification, object-detection	Code Available	2
Prodigy: An Expeditiously Adaptive Parameter-Free Learner	Jun 9, 2023		Code Available	2
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding	Jun 8, 2023	Decoder, Multi-Task Learning	Code Available	2
ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases	Jun 8, 2023		Code Available	2
Matting Anything	Jun 8, 2023	Image Matting, Referring Image Matting	Code Available	2
PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance	Jun 8, 2023	Conversational Question Answering, Language Modeling	Code Available	2
Prompt Injection attack against LLM-integrated Applications	Jun 8, 2023		Code Available	2
PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization	Jun 8, 2023	Language Modelling, Large Language Model	Code Available	2
StreetSurf: Extending Multi-view Implicit Surface Reconstruction to Street Views	Jun 8, 2023	Autonomous Driving, GPU	Code Available	2
Does Image Anonymization Impact Computer Vision Training?	Jun 8, 2023	Face Anonymization, Instance Segmentation	Code Available	2
RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit	Jun 8, 2023	Answer Generation, Fact Checking	Code Available	2
K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization	Jun 8, 2023	Language Modeling, Language Modelling	Code Available	2
ReliableSwap: Boosting General Face Swapping Via Reliable Supervision	Jun 8, 2023	Face Reenactment, Face Swapping	Code Available	2
UCTB: An Urban Computing Tool Box for Building Spatiotemporal Prediction Services	Jun 7, 2023	Diversity	Code Available	2
On the Reliability of Watermarks for Large Language Models	Jun 7, 2023		Code Available	2
Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models	Jun 7, 2023	Diversity, Image Generation	Code Available	2
ModuleFormer: Modularity Emerges from Mixture-of-Experts	Jun 7, 2023	Language Modelling, Lightweight Deployment	Code Available	2