Benchmarking

Papers

Title	Date	Tasks	Status
Vision Transformer for Efficient Chest X-ray and Gastrointestinal Image Classification	Apr 23, 2023	Benchmarking, Data Augmentation	Unverified
Visual Attention on the Sun: What Do Existing Models Actually Predict?	Nov 25, 2018	Benchmarking, Deep Attention	Unverified
Visual Fidelity Index for Generative Semantic Communications with Critical Information Embedding	May 15, 2025	Benchmarking, Semantic Communication	Unverified
Visual Object Tracking on Multi-modal RGB-D Videos: A Review	Jan 23, 2022	Benchmarking, Object	Unverified
Visual Place Recognition for Large-Scale UAV Applications	Jul 20, 2025	Benchmarking, Diversity	Unverified
VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare	Feb 19, 2025	Benchmarking, Diversity	Unverified
VoiceWukong: Benchmarking Deepfake Voice Detection	Sep 10, 2024	Benchmarking, Face Swapping	Unverified
V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning	Mar 14, 2025	Benchmarking, Relational Reasoning	Unverified
v-SVR Polynomial Kernel for Predicting the Defect Density in New Software Projects	Dec 15, 2018	Benchmarking, regression	Unverified
Vulnerability of Face Morphing Attacks: A Case Study on Lookalike and Identical Twins	Mar 24, 2023	Benchmarking, Face Recognition	Unverified
From Attack to Protection: Leveraging Watermarking Attack Network for Advanced Add-on Watermarking	Aug 14, 2020	Benchmarking	Unverified
Ward: Provable RAG Dataset Inference via LLM Watermarks	Oct 4, 2024	Benchmarking, RAG	Unverified
Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation	Dec 12, 2023	Benchmarking, Columns Property Annotation	Unverified
WebVision Challenge: Visual Learning and Understanding With Web Data	May 16, 2017	Benchmarking, image-classification	Unverified
WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking	Nov 14, 2024	Benchmarking, Drug Discovery	Unverified
WER We Stand: Benchmarking Urdu ASR Models	Sep 17, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
What can 5.17 billion regression fits tell us about artificial models of the human visual system?	Oct 12, 2021	Benchmarking	Unverified
What cleaves? Is proteasomal cleavage prediction reaching a ceiling?	Oct 24, 2022	Benchmarking, Denoising	Unverified
What Does Neuro Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs	May 15, 2025	All, Benchmarking	Unverified
What Emotions Make One or Five Stars? Understanding Ratings of Online Product Reviews by Sentiment Analysis and XAI	Feb 29, 2020	Benchmarking, BIG-bench Machine Learning	Unverified
What if we had no Wikipedia? Domain-independent Term Extraction from a Large News Corpus	Sep 17, 2020	Benchmarking, Term Extraction	Unverified
Alexpaca: Learning Factual Clarification Question Generation Without Examples	Oct 17, 2023	Benchmarking, Chatbot	Unverified
What Motivates You? Benchmarking Automatic Detection of Basic Needs from Short Posts	Aug 1, 2021	Benchmarking, Binary Classification	Unverified
Towards Self-adaptive Mutation in Evolutionary Multi-Objective Algorithms	Mar 8, 2023	Benchmarking, Evolutionary Algorithms	Unverified
What Will it Take to Fix Benchmarking in Natural Language Understanding?	Apr 5, 2021	Benchmarking, Natural Language Understanding	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified