Benchmarking

Papers

Showing 4151–4175 of 5548 papers

Title	Date	Tasks	Status
The Unconstrained Ear Recognition Challenge	Aug 23, 2017	Benchmarking, Person Recognition	Unverified
The Unconstrained Ear Recognition Challenge 2019 - ArXiv Version With Appendix	Mar 11, 2019	Benchmarking, Person Recognition	Unverified
THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models	Apr 17, 2025	Benchmarking, Math	Unverified
TIIF-Bench: How Does Your T2I Model Follow Your Instructions?	Jun 2, 2025	Benchmarking, Instruction Following	Unverified
Time and Tokens: Benchmarking End-to-End Speech Dysfluency Detection	Sep 20, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Time Awareness in Large Language Models: Benchmarking Fact Recall Across Time	Sep 20, 2024	Benchmarking, World Knowledge	Unverified
Time Sensitive Knowledge Editing through Efficient Finetuning	Jun 6, 2024	Benchmarking, knowledge editing	Unverified
TIME: Temporal-sensitive Multi-dimensional Instruction Tuning and Benchmarking for Video-LLMs	Mar 13, 2025	Benchmarking, Question Answering	Unverified
Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning Pipelines	Feb 21, 2023	Benchmarking, whole slide images	Unverified
Timing Excess Returns A cross-universe approach to alpha	Feb 11, 2020	Benchmarking, Time Series	Unverified
TinyML Platforms Benchmarking	Nov 30, 2021	Benchmarking	Unverified
Title2Event: Benchmarking Open Event Extraction with a Large-scale Chinese Title Dataset	Nov 2, 2022	Benchmarking, Event Extraction	Unverified
TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking	Feb 16, 2025	Benchmarking	Unverified
tmVar 3.0: an improved variant concept recognition and normalization tool	Apr 7, 2022	Benchmarking	Unverified
Token Sequence Compression for Efficient Multimodal Computing	Apr 24, 2025	Benchmarking	Unverified
Top-k Regularization for Supervised Feature Selection	Jun 4, 2021	Benchmarking, feature selection	Unverified
Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection	Aug 23, 2024	Benchmarking, Binary Classification	Unverified
Totally Corrective Boosting with Cardinality Penalization	Apr 7, 2015	Benchmarking, Combinatorial Optimization	Unverified
TOTOPO: Classifying univariate and multivariate time series with Topological Data Analysis	Oct 10, 2020	Benchmarking, Time Series	Unverified
Toward an ImageNet Library of Functions for Global Optimization Benchmarking	Jun 27, 2022	Benchmarking, global-optimization	Unverified
Toward end-to-end interpretable convolutional neural networks for waveform signals	May 3, 2024	Benchmarking, Emotion Recognition	Unverified
Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage	Dec 20, 2024	Attribute, Benchmarking	Unverified
Towards a Benchmark for Scientific Understanding in Humans and Machines	Apr 20, 2023	Benchmarking, Information Retrieval	Unverified
Towards a Human-Centred Cognitive Model of Visuospatial Complexity in Everyday Driving	May 29, 2020	Benchmarking	Unverified
Towards a Multidimensional Evaluation Framework for Empathetic Conversational Systems	Jul 26, 2024	Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified