The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4,251–4,300 of 661,570 papers

Title	Date	Tasks	Status	Hype
White-Box Transformers via Sparse Rate Reduction	Jun 1, 2023	Representation Learning	Code Available	3
CodeTF: One-stop Transformer Library for State-of-the-art Code LLM	May 31, 2023		Code Available	3
Humans in 4D: Reconstructing and Tracking Humans with Transformers	May 31, 2023	3D Human Pose Estimation; Action Recognition	Code Available	3
Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation	May 30, 2023	Machine Translation; Segmentation	Code Available	3
LLM-QAT: Data-Free Quantization Aware Training for Large Language Models	May 29, 2023	Data Free Quantization; Quantization	Code Available	3
Fine-Tuning Language Models with Just Forward Passes	May 27, 2023	GPU; In-Context Learning	Code Available	3
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models	May 26, 2023	GSM8K; Multimodal Reasoning	Code Available	3
An end-to-end strategy for recovering a free-form potential from a snapshot of stellar coordinates	May 26, 2023	Form; Symbolic Regression	Code Available	3
Large Language Models as Tool Makers	May 26, 2023		Code Available	3
Landmark Attention: Random-Access Infinite Context Length for Transformers	May 25, 2023	Retrieval	Code Available	3
The False Promise of Imitating Proprietary LLMs	May 25, 2023	Language Modelling	Code Available	3
Generating Synergistic Formulaic Alpha Collections via Reinforcement Learning	May 25, 2023	reinforcement-learning; Reinforcement Learning	Code Available	3
RoMa: Robust Dense Feature Matching	May 24, 2023	Camera Pose Estimation; Decoder	Code Available	3
HuatuoGPT, towards Taming Language Model to Be a Doctor	May 24, 2023	Language Modeling; Language Modelling	Code Available	3
Hierarchical Prompting Assists Large Language Model on Web Navigation	May 23, 2023	Decision Making; Language Modeling	Code Available	3
CGCE: A Chinese Generative Chat Evaluation Benchmark for General and Financial Domains	May 23, 2023	Text Generation	Code Available	3
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia	May 23, 2023	Chatbot; Hallucination	Code Available	3
Evaluation of the MACE Force Field Architecture: from Medicinal Chemistry to Materials Science	May 23, 2023		Code Available	3
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text	May 22, 2023	Language Modelling; Large Language Model	Code Available	3
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback	May 22, 2023	Instruction Following	Code Available	3
Prompting with Pseudo-Code Instructions	May 19, 2023		Code Available	3
XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters	May 19, 2023		Code Available	3
InstructIE: A Bilingual Instruction-based Information Extraction Dataset	May 19, 2023		Code Available	3
Self-QA: Unsupervised Knowledge Guided Language Model Alignment	May 19, 2023	Diversity; Language Modeling	Code Available	3
LLM-Pruner: On the Structural Pruning of Large Language Models	May 19, 2023	Text Generation; zero-shot-classification	Code Available	3
Delay-penalized CTC implemented based on Finite State Transducer	May 19, 2023	Attribute	Code Available	3
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities	May 18, 2023	Language Modeling; Language Modelling	Code Available	3
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities	May 18, 2023	1 Image, 2*2 Stitchi; Action Classification	Code Available	3
Quantifying the robustness of deep multispectral segmentation models against natural perturbations and data poisoning	May 18, 2023	Adversarial Robustness; Data Poisoning	Code Available	3
Accelerating Transformer Inference for Translation via Parallel Decoding	May 17, 2023	Machine Translation; Translation	Code Available	3
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research	May 16, 2023	Philosophy; reinforcement-learning	Code Available	3
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification	May 16, 2023	Decoder; Language Modeling	Code Available	3
NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist	May 15, 2023	Controllable Language Modelling; Dialogue Generation	Code Available	3
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models	May 15, 2023	Multiple-choice	Code Available	3
A Comprehensive Survey on Segment Anything Model for Vision and Beyond	May 14, 2023		Code Available	3
WikiWeb2M: A Page-Level Multimodal Wikipedia Dataset	May 9, 2023	Articles; Image Captioning	Code Available	3
MultiModal-GPT: A Vision and Language Model for Dialogue with Humans	May 8, 2023	Instruction Following; Language Modeling	Code Available	3
PiML Toolbox for Interpretable Machine Learning Model Development and Diagnostics	May 7, 2023	Fairness; Interpretable Machine Learning	Code Available	3
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages	May 7, 2023	Attribute; Instruction Following	Code Available	3
Visual Causal Scene Refinement for Video Question Answering	May 7, 2023	Contrastive Learning; Question Answering	Code Available	3
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision	May 4, 2023	Diversity; In-Context Learning	Code Available	3
Personalize Segment Anything Model with One Shot	May 4, 2023	Image Generation; model	Code Available	3
Panda LLM: Training Data and Evaluation for Open-Sourced Chinese Instruction-Following Large Language Models	May 4, 2023	Instruction Following	Code Available	3
Caption Anything: Interactive Image Description with Diverse Multimodal Controls	May 4, 2023	controllable image captioning; Image Captioning	Code Available	3
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation	May 2, 2023	Code Generation; HumanEval	Code Available	3
Unlimiformer: Long-Range Transformers with Unlimited Length Input	May 2, 2023	Book summarization; CPU	Code Available	3
UCF: Uncovering Common Features for Generalizable Deepfake Detection	Apr 27, 2023	Binary Classification; Decoder	Code Available	3
LibCity: A Unified Library Towards Efficient and Comprehensive Urban Spatial-Temporal Prediction	Apr 27, 2023	Prediction	Code Available	3
TorchBench: Benchmarking PyTorch with High API Surface Coverage	Apr 27, 2023	Benchmarking; GPU	Code Available	3
Learning Neural PDE Solvers with Parameter-Guided Channel Attention	Apr 27, 2023	PDE Surrogate Modeling; Weather Forecasting	Code Available	3