The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 1876–1900 of 661,570 papers

Title	Date	Tasks	Status	Hype
PointMamba: A Simple State Space Model for Point Cloud Analysis	Feb 16, 2024	GPU, Mamba	Code Available	4
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models	Feb 16, 2024		Code Available	4
Generative Representational Instruction Tuning	Feb 15, 2024	Language Modeling, Language Modelling	Code Available	4
TIAViz: A Browser-based Visualization Tool for Computational Pathology Models	Feb 15, 2024	whole slide images	Code Available	4
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset	Feb 15, 2024	Arithmetic Reasoning, GSM8K	Code Available	4
OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM	Feb 14, 2024	Medical Visual Question Answering, Question Answering	Code Available	4
DoRA: Weight-Decomposed Low-Rank Adaptation	Feb 14, 2024	parameter-efficient fine-tuning	Code Available	4
G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering	Feb 12, 2024	Common Sense Reasoning, Graph Classification	Code Available	4
Dólares or Dollars? Unraveling the Bilingual Prowess of Financial LLMs Between Spanish and English	Feb 12, 2024		Code Available	4
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models	Feb 12, 2024	Hallucination, Object Localization	Code Available	4
Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation	Feb 11, 2024	Cardiac Segmentation, Contrastive Learning	Code Available	4
ScreenAgent: A Vision Language Model-driven Computer Control Agent	Feb 9, 2024	Language Modeling, Language Modelling	Code Available	4
Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA	Feb 9, 2024	Event Detection, Hate Speech Detection	Code Available	4
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning	Feb 9, 2024	Data Augmentation, GSM8K	Code Available	4
InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write	Feb 8, 2024	Derendering	Code Available	4
MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis	Feb 8, 2024	Attribute, Conditional Text-to-Image Synthesis	Code Available	4
Spirit LM: Interleaved Spoken and Written Language Model	Feb 8, 2024	Language Modeling, Language Modelling	Code Available	4
You Only Need One Color Space: An Efficient Network for Low-light Image Enhancement	Feb 8, 2024	Image Enhancement, Low-light Image Deblurring and Enhancement	Code Available	4
AlphaFold Meets Flow Matching for Generating Protein Ensembles	Feb 7, 2024	Diversity	Code Available	4
JAX-Fluids 2.0: Towards HPC for Differentiable CFD of Compressible Two-phase Flows	Feb 7, 2024	GPU	Code Available	4
Amortized Planning with Large-Scale Transformers: A Case Study on Chess	Feb 7, 2024	Memorization	Code Available	4
Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation	Feb 7, 2024	Cardiac Segmentation, Computational Efficiency	Code Available	4
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks	Feb 6, 2024	Quantization	Code Available	4
LESS: Selecting Influential Data for Targeted Instruction Tuning	Feb 6, 2024		Code Available	4
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal	Feb 6, 2024	Red Teaming	Code Available	4