Benchmarking

Papers

Showing 3376–3400 of 5548 papers

Title	Date	Tasks	Status
Coherent Feed Forward Quantum Neural Network	Feb 1, 2024	Benchmarking, Diagnostic	Unverified
MRAnnotator: multi-Anatomy and many-Sequence MRI segmentation of 44 structures	Feb 1, 2024	Anatomy, Benchmarking	Unverified
Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data	Jan 31, 2024	Benchmarking, Change Detection	Code Available
Benchmarking Sensitivity of Continual Graph Learning for Skeleton-Based Action Recognition	Jan 31, 2024	Action Recognition, Benchmarking	Unverified
ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks	Jan 29, 2024	Benchmarking, Cross-Lingual Transfer	Code Available
Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA	Jan 29, 2024	Benchmarking, Image Comprehension	Unverified
PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models	Jan 28, 2024	Benchmarking, Code Generation	Code Available
Benchmarking with MIMIC-IV, an irregular, spare clinical time series dataset	Jan 27, 2024	Benchmarking, Time Series	Unverified
SAM-based instance segmentation models for the automation of structural damage detection	Jan 27, 2024	Benchmarking, Instance Segmentation	Unverified
Biological Valuation Map of Flanders: A Sentinel-2 Imagery Analysis	Jan 26, 2024	Benchmarking, Semantic Segmentation	Unverified
Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs	Jan 26, 2024	Benchmarking, Knowledge Graphs	Unverified
Automated legal reasoning with discretion to act using s(LAW)	Jan 25, 2024	Benchmarking, Legal Reasoning	Unverified
TriSAM: Tri-Plane SAM for zero-shot cortical blood vessel segmentation in VEM images	Jan 25, 2024	Benchmarking, Segmentation	Unverified
Large Malaysian Language Model Based on Mistral for Enhanced Local Language Understanding	Jan 24, 2024	Benchmarking, Language Modeling	Unverified
Benchmarking the Fairness of Image Upsampling Methods	Jan 24, 2024	Benchmarking, Diversity	Code Available
LLpowershap: Logistic Loss-based Automated Shapley Values Feature Selection Method	Jan 23, 2024	Benchmarking, Fairness	Code Available
Deep Neural Network Benchmarks for Selective Classification	Jan 23, 2024	Benchmarking, Classification	Code Available
What the Weight?! A Unified Framework for Zero-Shot Knowledge Composition	Jan 23, 2024	Benchmarking	Code Available
Subgroup analysis methods for time-to-event outcomes in heterogeneous randomized controlled trials	Jan 22, 2024	Benchmarking, Synthetic Data Generation	Code Available
Data-Driven Target Localization: Benchmarking Gradient Descent Using the Cramer-Rao Bound	Jan 20, 2024	Benchmarking	Unverified
Data Augmentation for Traffic Classification	Jan 19, 2024	Benchmarking, Classification	Unverified
Harnessing Orthogonality to Train Low-Rank Neural Networks	Jan 16, 2024	Benchmarking	Code Available
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription	Jan 16, 2024	Automatic Speech Recognition, Benchmarking	Unverified
OpenDPD: An Open-Source End-to-End Learning & Benchmarking Framework for Wideband Power Amplifier Modeling and Digital Pre-Distortion	Jan 16, 2024	Benchmarking	Unverified
Large Language Models are Null-Shot Learners	Jan 16, 2024	Arithmetic Reasoning, Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified