Benchmarking

Papers

Showing 3401–3450 of 5548 papers

Title	Date	Tasks	Status
Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data	Jun 20, 2024	Animal Pose Estimation, Benchmarking	Unverified
TOTOPO: Classifying univariate and multivariate time series with Topological Data Analysis	Oct 10, 2020	Benchmarking, Time Series	Unverified
LMFormer: Lane based Motion Prediction Transformer	Apr 14, 2025	Autonomous Driving, Benchmarking	Unverified
Benchmarking Modern Named Entity Recognition Techniques for Free-text Health Record De-identification	Mar 25, 2021	Benchmarking, Decoder	Unverified
LMME3DHF: Benchmarking and Evaluating Multimodal 3D Human Face Generation with LMMs	Apr 29, 2025	Benchmarking, Face Generation	Unverified
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models	Jul 17, 2024	Benchmarking, Language Modelling	Unverified
Load-independent Metrics for Benchmarking Force Controllers	May 13, 2025	Benchmarking	Unverified
Benchmarking Mobile Device Control Agents across Diverse Configurations	Apr 25, 2024	Benchmarking, Imitation Learning	Unverified
Local Data Quantity-Aware Weighted Averaging for Federated Learning with Dishonest Clients	Apr 17, 2025	Benchmarking, Federated Learning	Unverified
XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis	Jun 26, 2024	Autonomous Driving, Benchmarking	Unverified
Ensuring Reliability of Curated EHR-Derived Data: The Validation of Accuracy for LLM/ML-Extracted Information and Data (VALID) Framework	Jun 9, 2025	Benchmarking, Fairness	Unverified
Benchmarking Middle-Trained Language Models for Neural Search	Jun 5, 2023	Benchmarking, Language Modeling	Unverified
Logically at Factify 2: A Multi-Modal Fact Checking System Based on Evidence Retrieval techniques and Transformer Encoder Architecture	Jan 9, 2023	Avg, Benchmarking	Unverified
Logically at Factify 2022: Multimodal Fact Verification	Dec 16, 2021	Benchmarking, Fact Checking	Unverified
Toward an ImageNet Library of Functions for Global Optimization Benchmarking	Jun 27, 2022	Benchmarking, global-optimization	Unverified
Benchmarking Meta-heuristic Optimization	Jul 27, 2020	Benchmarking, Evolutionary Algorithms	Unverified
Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models	Jun 25, 2024	Benchmarking	Unverified
Toward end-to-end interpretable convolutional neural networks for waveform signals	May 3, 2024	Benchmarking, Emotion Recognition	Unverified
Benchmarking MedMNIST dataset on real quantum hardware	Feb 18, 2025	Benchmarking, image-classification	Unverified
Benchmarking Machine Translated Sentiment Analysis for Arabic Tweets	Jun 1, 2015	Benchmarking, Machine Translation	Unverified
Benchmarking Continuous Time Models for Predicting Multiple Sclerosis Progression	Feb 15, 2023	Benchmarking	Unverified
Benchmarking Machine Learning Robustness in Covid-19 Spike Sequence Classification	Sep 29, 2021	Benchmarking, BIG-bench Machine Learning	Unverified
Benchmarking Machine Learning Models to Predict Corporate Bankruptcy	Dec 22, 2022	Benchmarking	Unverified
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation	Jan 9, 2025	2k, 8k	Unverified
Long Range Arena : A Benchmark for Efficient Transformers	Jan 1, 2021	16k, Benchmarking	Unverified
Benchmarking machine learning models for predicting aerofoil performance	Apr 22, 2025	Benchmarking	Unverified
Benchmarking Machine Learning Models for Quantum Error Correction	Nov 18, 2023	Benchmarking	Unverified
Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models	Feb 17, 2025	Benchmarking	Unverified
Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage	Dec 20, 2024	Attribute, Benchmarking	Unverified
Look, Read and Feel: Benchmarking Ads Understanding with Multimodal Multitask Learning	Dec 21, 2019	Benchmarking, Prediction	Unverified
WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking	Nov 14, 2024	Benchmarking, Drug Discovery	Unverified
LOOPE: Learnable Optimal Patch Order in Positional Embeddings for Vision Transformers	Apr 19, 2025	Benchmarking, Diagnostic	Unverified
Benchmarking machine learning models for quantum state classification	Sep 14, 2023	Benchmarking, Classification	Unverified
Towards a Benchmark for Scientific Understanding in Humans and Machines	Apr 20, 2023	Benchmarking, Information Retrieval	Unverified
Benchmarking Machine Learning Methods for Distributed Acoustic Sensing	Mar 26, 2025	Benchmarking, Data Augmentation	Unverified
Benchmarking Machine Learning: How Fast Can Your Algorithms Go?	Jan 8, 2021	Benchmarking, BIG-bench Machine Learning	Unverified
Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym	Sep 29, 2023	Bayesian Optimization, Benchmarking	Unverified
GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors	Jun 9, 2025	Benchmarking, Model extraction	Unverified
Low-Density 3D Point Cloud Classification	Oct 30, 2024	3D Point Cloud Classification, Autonomous Driving	Unverified
Low Dynamic Range for RIS-aided Bistatic Integrated Sensing and Communication	Nov 9, 2024	Benchmarking, Integrated sensing and communication	Unverified
Low-resource Neural Machine Translation: Benchmarking State-of-the-art Transformer for Wolof<->French	Jun 1, 2022	Benchmarking, Low Resource Neural Machine Translation	Unverified
LSTM-based Whisper Detection	Sep 20, 2018	Benchmarking	Unverified
Benchmarking M6 Competitors: An Analysis of Financial Metrics and Discussion of Incentives	Jun 27, 2024	Benchmarking	Unverified
LucidDreaming: Controllable Object-Centric 3D Generation	Nov 30, 2023	3D Generation, Benchmarking	Unverified
Benchmarking LLMs on the Semantic Overlap Summarization Task	Feb 26, 2024	Benchmarking, Document Summarization	Unverified
LUND-PROBE -- LUND Prostate Radiotherapy Open Benchmarking and Evaluation dataset	Feb 6, 2025	Benchmarking, Computed Tomography (CT)	Unverified
Benchmarking LLMs in Recommendation Tasks: A Comparative Evaluation with Conventional Recommenders	Mar 7, 2025	Benchmarking, Click-Through Rate Prediction	Unverified
Towards a Human-Centred Cognitive Model of Visuospatial Complexity in Everyday Driving	May 29, 2020	Benchmarking	Unverified
Benchmarking LLMs in Political Content Text-Annotation: Proof-of-Concept with Toxicity and Incivility Data	Sep 15, 2024	Benchmarking, text annotation	Unverified
M3Bench: Benchmarking Whole-body Motion Generation for Mobile Manipulation in 3D Scenes	Oct 9, 2024	Benchmarking, Motion Generation	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified