Benchmarking

Papers

Showing 5451–5500 of 5548 papers

Title	Date	Tasks	Status
Fast, approximate kinetics of RNA folding	Jan 19, 2015	Benchmarking	Unverified
Visual Fidelity Index for Generative Semantic Communications with Critical Information Embedding	May 15, 2025	Benchmarking, Semantic Communication	Unverified
Technological Approaches to Detecting Online Disinformation and Manipulation	Aug 26, 2021	Benchmarking, Fact Checking	Unverified
FastDraft: How to Train Your Draft	Nov 17, 2024	Benchmarking, Code Completion	Unverified
Fast Empirical Scenarios	Jul 8, 2023	Benchmarking, Decision Making	Unverified
FastEnsemble: Benchmarking and Accelerating Ensemble-based Uncertainty Estimation for Image-to-Image Translation	Sep 29, 2021	Benchmarking, Image Generation	Unverified
Can Foundation Models Really Segment Tumors? A Benchmarking Odyssey in Lung CT Imaging	May 2, 2025	Benchmarking, Computational Efficiency	Unverified
Fast Labeling and Transcription with the Speechalyzer Toolkit	May 1, 2012	Audio Classification, Benchmarking	Unverified
TelcoLM: collecting data, adapting, and benchmarking language models for the telecommunication domain	Dec 20, 2024	Benchmarking	Unverified
Fast Training of Deep Networks with One-Class CNNs	Jun 28, 2020	Benchmarking, Classification	Unverified
Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate	May 22, 2023	Benchmarking, Math	Unverified
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding	Mar 19, 2025	Benchmarking, Multiple-choice	Unverified
TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks	May 19, 2023	Benchmarking	Unverified
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration	Dec 17, 2024	Benchmarking, Face Generation	Unverified
Feasibility of BERT Embeddings For Domain-Specific Knowledge Mining	Jan 16, 2022	Benchmarking, Language Modelling	Unverified
Cancer-Net PCa-Seg: Benchmarking Deep Learning Models for Prostate Cancer Segmentation Using Synthetic Correlated Diffusion Imaging	Jan 15, 2025	Benchmarking, Computational Efficiency	Unverified
Feature-based Evolutionary Diversity Optimization of Discriminating Instances for Chance-constrained Optimization Problems	Jan 24, 2025	Benchmarking, Diversity	Unverified
Tell Your Story: Task-Oriented Dialogs for Interactive Content Creation	Nov 8, 2022	Benchmarking, Retrieval	Unverified
Feature Encodings for Gradient Boosting with Automunge	Sep 25, 2022	Benchmarking, Binarization	Unverified
AI-ready Snow Radar Echogram Dataset (SRED) for climate change monitoring	May 1, 2025	Benchmarking, Deep Learning	Unverified
Featuremetric benchmarking: Quantum computer benchmarks based on circuit features	Apr 17, 2025	Benchmarking	Unverified
Feature Selection and Classification of Hyperspectral Images With Support Vector Machines	Oct 15, 2007	Benchmarking, Classification	Unverified
Feature selection in linear SVMs via a hard cardinality constraint: a scalable SDP decomposition approach	Apr 15, 2024	Benchmarking, feature selection	Unverified
FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation	Feb 19, 2024	Benchmarking, Chatbot	Unverified
FeDa4Fair: Client-Level Federated Datasets for Fairness Evaluation	Jun 26, 2025	Attribute, Benchmarking	Unverified
FedAD-Bench: A Unified Benchmark for Federated Unsupervised Anomaly Detection in Tabular Data	Aug 8, 2024	Anomaly Detection, Benchmarking	Unverified
Can Carbon-Aware Electric Load Shifting Reduce Emissions? An Equilibrium-Based Analysis	Apr 9, 2025	Benchmarking	Unverified
Can AI Read Between The Lines? Benchmarking LLMs On Financial Nuance	May 22, 2025	Benchmarking, Prompt Engineering	Unverified
Federated Deconfounding and Debiasing Learning for Out-of-Distribution Generalization	May 8, 2025	Attribute, Benchmarking	Unverified
Can AI Master Construction Management (CM)? Benchmarking State-of-the-Art Large Language Models on CM Certification Exams	Apr 4, 2025	Benchmarking, Management	Unverified
FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning	Sep 1, 2023	Benchmarking, Federated Learning	Unverified
FedEval: A Holistic Evaluation Framework for Federated Learning	Nov 19, 2020	Benchmarking, Federated Learning	Unverified
Can AI Freelancers Compete? Benchmarking Earnings, Reliability, and Task Success at Scale	May 16, 2025	Benchmarking, TAG	Unverified
FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization	Jun 8, 2022	Benchmarking, Federated Learning	Unverified
AI-Powered Cow Detection in Complex Farm Environments	Jan 3, 2025	Benchmarking	Unverified
CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation	May 30, 2025	Benchmarking, Machine Translation	Unverified
Temporal cross-validation impacts multivariate time series subsequence anomaly detection evaluation	Jun 13, 2025	Anomaly Detection, Benchmarking	Unverified
Temporal Graphs Anomaly Emergence Detection: Benchmarking For Social Media Interactions	Jul 11, 2023	Anomaly Detection, Benchmarking	Unverified
FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks	Jan 16, 2022	Benchmarking, Federated Learning	Unverified
CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography	Apr 14, 2025	Benchmarking, Visual Reasoning	Unverified
FedSym: Unleashing the Power of Entropy for Benchmarking the Algorithms for Federated Learning	Oct 11, 2023	Benchmarking, Diversity	Unverified
FedVLMBench: Benchmarking Federated Fine-Tuning of Vision-Language Models	Jun 11, 2025	Benchmarking, Federated Learning	Unverified
A Benchmark for Spray from Nearby Cutting Vehicles	Aug 24, 2021	Autonomous Driving, Benchmarking	Unverified
CallNavi, A Challenge and Empirical Study on LLM Function Calling and Routing	Jan 9, 2025	Benchmarking, Chatbot	Unverified
FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge	Feb 14, 2017	Benchmarking, Facial Action Unit Detection	Unverified
FER-C: Benchmarking Out-of-Distribution Soft Calibration for Facial Expression Recognition	Dec 16, 2023	Benchmarking, Facial Expression Recognition	Unverified
Temporal Validity Change Prediction	Jan 1, 2024	Benchmarking, Prediction	Unverified
Call for Action: towards the next generation of symbolic regression benchmark	May 6, 2025	Benchmarking, Diversity	Unverified
FETCH: A Memory-Efficient Replay Approach for Continual Learning in Image Classification	Jul 17, 2024	Benchmarking, Continual Learning	Unverified
Calibrating chemical multisensory devices for real world applications: An in-depth comparison of quantitative Machine Learning approaches	Aug 30, 2017	Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified