Benchmarking

Papers

Showing 3351–3400 of 5548 papers

Title	Date	Tasks	Status
Leveraging Pre-trained AudioLDM for Sound Generation: A Benchmark Study	Mar 7, 2023	Audio Generation, Benchmarking	Unverified
Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks	Mar 18, 2024	Benchmarking, Classification	Unverified
Leveraging State Space Models in Long Range Genomics	Apr 7, 2025	Benchmarking, GPU	Unverified
Break a Lag: Triple Exponential Moving Average for Enhanced Optimization	Jun 2, 2023	Benchmarking, image-classification	Unverified
LEXam: Benchmarking Legal Reasoning on 340 Law Exams	May 19, 2025	Benchmarking, Legal Reasoning	Unverified
Benchmarking network fabrics for data distributed training of deep neural networks	Aug 18, 2020	Benchmarking, BIG-bench Machine Learning	Unverified
Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing	Jun 11, 2024	Benchmarking, Stance Detection	Unverified
Benchmarking Named Entity Disambiguation approaches for Streaming Graphs	Jul 14, 2014	Benchmarking, Entity Disambiguation	Unverified
Benchmarking Mutual Information-based Loss Functions in Federated Learning	Apr 16, 2025	Benchmarking, Fairness	Unverified
Benchmarking Music Generation Models and Metrics via Human Preference Studies	Jun 23, 2025	Benchmarking, Music Generation	Unverified
Top-k Regularization for Supervised Feature Selection	Jun 4, 2021	Benchmarking, feature selection	Unverified
LIBRE: The Multiple 3D LiDAR Dataset	Mar 13, 2020	Benchmarking	Unverified
LidarGait: Benchmarking 3D Gait Recognition with Point Clouds	Nov 19, 2022	3D geometry, Benchmarking	Unverified
Lifelogging As An Extreme Form of Personal Information Management -- What Lessons To Learn	Jan 11, 2024	Benchmarking, Form	Unverified
Benchmarking Multivariate Time Series Classification Algorithms	Jul 26, 2020	Benchmarking, Classification	Unverified
Light Field Image Quality Assessment With Auxiliary Learning Based on Depthwise and Anglewise Separable Convolutions	Dec 10, 2024	Auxiliary Learning, Benchmarking	Unverified
Advances in Preference-based Reinforcement Learning: A Review	Aug 21, 2024	Benchmarking, reinforcement-learning	Unverified
Benchmarking Multi-Organ Segmentation Tools for Multi-Parametric T1-weighted Abdominal MRI	Apr 10, 2025	Benchmarking, Organ Segmentation	Unverified
Lightly Weighted Automatic Audio Parameter Extraction for the Quality Assessment of Consensus Auditory-Perceptual Evaluation of Voice	Nov 27, 2023	Benchmarking	Unverified
Lightning UQ Box: A Comprehensive Framework for Uncertainty Quantification in Deep Learning	Oct 4, 2024	Benchmarking, Uncertainty Quantification	Unverified
Lightweight Jet Reconstruction and Identification as an Object Detection Task	Feb 9, 2022	Benchmarking, object-detection	Unverified
Solving excited states for long-range interacting trapped ions with neural networks	Jun 10, 2025	Benchmarking	Unverified
Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection	Aug 23, 2024	Benchmarking, Binary Classification	Unverified
Benchmarking Multi-National Value Alignment for Large Language Models	Apr 17, 2025	Benchmarking	Unverified
LIM: Large Interpolator Model for Dynamic Reconstruction	Mar 28, 2025	4D reconstruction, Benchmarking	Unverified
Advanced Manufacturing Configuration by Sample-efficient Batch Bayesian Optimization	May 24, 2022	Bayesian Optimization, Benchmarking	Unverified
Line Goes Up? Inherent Limitations of Benchmarks for Evaluating Large Language Models	Feb 20, 2025	Benchmarking	Unverified
Liquid State Genetic Programming	Dec 5, 2023	Benchmarking	Unverified
Livestock Monitoring with Transformer	Nov 1, 2021	Action Recognition, Benchmarking	Unverified
Benchmarking Multimodal Sentiment Analysis	Jul 29, 2017	Benchmarking, Emotion Recognition	Unverified
LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education	Feb 9, 2024	Benchmarking, Chatbot	Unverified
LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living	Jun 13, 2024	Benchmarking, Human-Object Interaction Detection	Unverified
LLM4DV: Using Large Language Models for Hardware Test Stimuli Generation	Oct 6, 2023	Benchmarking, Mathematical Reasoning	Unverified
Benchmarking Multimodal Regex Synthesis with Complex Structures	May 2, 2020	Benchmarking	Unverified
LLM-based Evaluation Policy Extraction for Ecological Modeling	May 20, 2025	Benchmarking, Large Language Model	Unverified
A War Beyond Deepfake: Benchmarking Facial Counterfeits and Countermeasures	Nov 25, 2021	Benchmarking, DeepFake Detection	Unverified
Benchmarking Multimodal Models for Ukrainian Language Understanding Across Academic and Cultural Domains	Nov 22, 2024	Benchmarking, Caption Generation	Unverified
A Distance Oriented Kalman Filter Particle Swarm Optimizer Applied to Multi-Modality Image Registration	Mar 20, 2018	Benchmarking, Image Registration	Unverified
Benchmarking Multimodal Models for Fine-Grained Image Analysis: A Comparative Study Across Diverse Visual Features	Jan 14, 2025	Benchmarking	Unverified
LLM Evaluators Recognize and Favor Their Own Generations	Apr 15, 2024	Benchmarking	Unverified
Benchmarking Multimodal LLMs on Recognition and Understanding over Chemical Tables	Jun 13, 2025	Benchmarking, Descriptive	Unverified
Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015	May 1, 2016	Benchmarking	Unverified
LLM-initialized Differentiable Causal Discovery	Oct 28, 2024	Benchmarking, Causal Discovery	Unverified
Totally Corrective Boosting with Cardinality Penalization	Apr 7, 2015	Benchmarking, Combinatorial Optimization	Unverified
Benchmarking Multi-Domain Active Learning on Image Classification	Dec 1, 2023	Active Learning, All	Unverified
LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation	Feb 18, 2025	Benchmarking, Text Generation	Unverified
LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study	Sep 13, 2024	Benchmarking, Grapheme-to-Phoneme Conversion	Unverified
Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming	Dec 21, 2023	Benchmarking, reinforcement-learning	Unverified
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms	Jan 1, 2021	Benchmarking, Deep Reinforcement Learning	Unverified
LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection	Oct 29, 2023	Benchmarking, Diversity	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified