Benchmarking

Papers

Showing 2751–2775 of 5548 papers

Title	Date	Tasks	Status
Context-guided Triple Matching for Multiple Choice Question Answering	Jan 16, 2022	Benchmarking, Multiple-choice	Unverified
Contextual Metric Meta-Evaluation by Measuring Local Metric Accuracy	Mar 25, 2025	Benchmarking, speech-recognition	Unverified
Exploring the Practicality of Generative Retrieval on Dynamic Corpora	May 27, 2023	Benchmarking, Information Retrieval	Unverified
Continuous Function Structured in Multilayer Perceptron for Global Optimization	Mar 7, 2023	Benchmarking, global-optimization	Unverified
Continuous-Time Gaussian Process Motion-Compensation for Event-vision Pattern Tracking with Distance Fields	Mar 5, 2023	Benchmarking, Motion Compensation	Unverified
Continuous U-Net: Faster, Greater and Noiseless	Feb 1, 2023	Benchmarking, Decoder	Unverified
Contrastive Learning-Based Spectral Knowledge Distillation for Multi-Modality and Missing Modality Scenarios in Semantic Segmentation	Dec 4, 2023	Benchmarking, Contrastive Learning	Unverified
Contribution à l'Optimisation d'un Comportement Collectif pour un Groupe de Robots Autonomes	Jun 10, 2023	Benchmarking, Diversity	Unverified
Contributions of the Petabyte Scale Sequence Search Codeathon toward efforts to scale sequence-based searches on SRA	May 9, 2025	Benchmarking, scientific discovery	Unverified
ConvBench: A Comprehensive Benchmark for 2D Convolution Primitive Evaluation	Jul 15, 2024	Benchmarking	Unverified
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments	Feb 27, 2025	Benchmarking, Code Generation	Unverified
Convolutional and Deep Learning based techniques for Time Series Ordinal Classification	Jun 16, 2023	Benchmarking, Ordinal Classification	Unverified
COPA: Comparing the Incomparable to Explore the Pareto Front	Mar 18, 2025	AutoML, Benchmarking	Unverified
CORE: A Knowledge Graph Entity Type Prediction Method via Complex Space Regression and Embedding	Dec 19, 2021	Benchmarking, Prediction	Unverified
CORE: Benchmarking LLMs Code Reasoning Capabilities through Static Analysis Tasks	Jul 3, 2025	Benchmarking, Code Generation	Unverified
Cornac: A Comparative Framework for Multimodal Recommender Systems	May 8, 2020	Benchmarking, Recommendation Systems	Unverified
COSET: A Benchmark for Evaluating Neural Program Embeddings	May 27, 2019	Benchmarking, Graph Neural Network	Unverified
CoSy: Evaluating Textual Explanations of Neurons	May 30, 2024	Benchmarking	Unverified
Countering Backdoor Attacks in Image Recognition: A Survey and Evaluation of Mitigation Strategies	Nov 17, 2024	Benchmarking	Unverified
COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts	Apr 14, 2025	Benchmarking, Object	Unverified
Coupling volume-excluding compartment-based models of diffusion at different scales: Voronoi and pseudo-compartment approaches	May 24, 2016	Benchmarking, Blocking	Unverified
Covariance Matrix Adaptation Evolution Strategy Assisted by Principal Component Analysis	May 8, 2021	Benchmarking, Dimensionality Reduction	Unverified
Creating a Data Collection for Evaluating Rich Speech Retrieval	May 1, 2012	Benchmarking, Retrieval	Unverified
CRF-based Single-stage Acoustic Modeling with CTC Topology	Apr 16, 2019	Benchmarking, Speech Recognition	Unverified
CroCoDL: Cross-device Collaborative Dataset for Localization	Jan 1, 2025	Benchmarking, Pose Estimation	Unverified


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified