Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2251–2300 of 5548 papers

Title	Date	Tasks	Status	Score
Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework	Feb 20, 2025	BenchmarkingQuestion Answering	CodeCode Available	5
Towards Segment Anything Model (SAM) for Medical Image Segmentation: A Survey	May 5, 2023	BenchmarkingImage Generation	CodeCode Available	5
How Far Are We from Optimal Reasoning Efficiency?	Jun 8, 2025	16kBenchmarking	CodeCode Available	5
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program	Apr 9, 2025	Benchmarking	CodeCode Available	5
A Seq2Seq approach to Symbolic Regression	Oct 17, 2020	Benchmarkingregression	CodeCode Available	5
How to Manage Tiny Machine Learning at Scale: An Industrial Perspective	Feb 18, 2022	BenchmarkingBIG-bench Machine Learning	CodeCode Available	5
Benchmarking Multilabel Topic Classification in the Kyrgyz Language	Aug 30, 2023	BenchmarkingClassification	CodeCode Available	5
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning	Jun 18, 2024	BenchmarkingWorld Knowledge	CodeCode Available	5
A Continuous Optimisation Benchmark Suite from Neural Network Regression	Sep 12, 2021	BenchmarkingEvolutionary Algorithms	CodeCode Available	5
Benchmarking multi-component signal processing methods in the time-frequency plane	Feb 13, 2024	BenchmarkingDenoising	CodeCode Available	5
HOEG: A New Approach for Object-Centric Predictive Process Monitoring	Apr 8, 2024	BenchmarkingGraph Neural Network	CodeCode Available	5
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios	Jun 11, 2025	Action RecognitionAction Segmentation	CodeCode Available	5
Aggregated Attributions for Explanatory Analysis of 3D Segmentation Models	Jul 23, 2024	BenchmarkingSegmentation	CodeCode Available	5
Benchmarking MOEAs for solving continuous multi-objective RL problems	May 19, 2025	BenchmarkingEvolutionary Algorithms	CodeCode Available	5
Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus	Oct 8, 2023	BenchmarkingMachine Translation	CodeCode Available	5
Benchmarking Model-Based Reinforcement Learning	Jul 3, 2019	Benchmarkingmodel	CodeCode Available	5
Benchmarking Misuse Mitigation Against Covert Adversaries	Jun 6, 2025	BenchmarkingLanguage Modeling	CodeCode Available	5
Benchmarking missing-values approaches for predictive models on health databases	Feb 17, 2022	AttributeBenchmarking	CodeCode Available	5
High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets	Aug 1, 2024	BenchmarkingSimultaneous Localization and Mapping	CodeCode Available	5
Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification	Apr 23, 2024	BenchmarkingHyperspectral Image Classification	CodeCode Available	5
Benchmarking Minimax Linkage	Jun 7, 2019	BenchmarkingClustering	CodeCode Available	5
HERMES: Holographic Equivariant neuRal network model for Mutational Effect and Stability prediction	Jul 9, 2024	Benchmarking	CodeCode Available	5
EFSA: Towards Event-Level Financial Sentiment Analysis	Apr 8, 2024	ArticlesBenchmarking	CodeCode Available	5
Efficient, Uncertainty-based Moderation of Neural Networks Text Classifiers	Apr 4, 2022	Benchmarking	CodeCode Available	5
Harnessing Orthogonality to Train Low-Rank Neural Networks	Jan 16, 2024	Benchmarking	CodeCode Available	5
HATE-ITA: New Baselines for Hate Speech Detection in Italian	Jul 1, 2022	BenchmarkingHate Speech Detection	CodeCode Available	5
A Comparative Analysis of Word-Level Metric Differential Privacy: Benchmarking The Privacy-Utility Trade-off	Apr 4, 2024	Benchmarking	CodeCode Available	5
Efficient Performance Tracking: Leveraging Large Language Models for Automated Construction of Scientific Leaderboards	Sep 19, 2024	Benchmarking	CodeCode Available	5
Hardware Aware Neural Network Architectures using FbNet	Jun 17, 2019	BenchmarkingNeural Architecture Search	CodeCode Available	5
Heterogeneous Datasets for Federated Survival Analysis Simulation	Jan 28, 2023	BenchmarkingFederated Learning	CodeCode Available	5
Efficiently solving the thief orienteering problem with a max-min ant colony optimization approach	Sep 21, 2021	Benchmarking	CodeCode Available	5
Benchmarking Machine Learning Robustness in Covid-19 Genome Sequence Classification	Jul 18, 2022	BenchmarkingBIG-bench Machine Learning	CodeCode Available	5
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo	Mar 14, 2019	BenchmarkingOpenAI Gym	CodeCode Available	5
HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios	Dec 21, 2024	Benchmarking	CodeCode Available	5
Guidelines and Benchmarks for Deployment of Deep Learning Models on Smartphones as Real-Time Apps	Jan 8, 2019	BenchmarkingCPU	CodeCode Available	5
Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces	May 31, 2023	BenchmarkingRecommendation Systems	CodeCode Available	5
Grounded Intuition of GPT-Vision's Abilities with Scientific Images	Nov 3, 2023	Benchmarkingcounterfactual	CodeCode Available	5
Efficient and Effective Model Extraction	Sep 21, 2024	Benchmarkingmodel	CodeCode Available	5
Efficient and Accurate Optimal Transport with Mirror Descent and Conjugate Gradients	Jul 17, 2023	BenchmarkingGPU	CodeCode Available	5
Efficient Realistic Data Generation Framework leveraging Deep Learning-based Human Digitization	Jun 28, 2021	BenchmarkingDeep Learning	CodeCode Available	5
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs	Apr 7, 2025	BenchmarkingFairness	CodeCode Available	5
Harmonization Benchmarking Tool for Neuroimaging Datasets	Nov 15, 2022	BenchmarkingDiffusion MRI	CodeCode Available	5
Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared Control on the Hannes Prosthesis	Mar 18, 2022	BenchmarkingObject Recognition	CodeCode Available	5
Graph-theoretical approach to robust 3D normal extraction of LiDAR data	May 23, 2022	Benchmarking	CodeCode Available	5
GRATIS: GeneRAting TIme Series with diverse and controllable characteristics	Mar 7, 2019	BenchmarkingClustering	CodeCode Available	5
Grounding Synthetic Data Evaluations of Language Models in Unsupervised Document Corpora	May 13, 2025	BenchmarkingDiagnostic	CodeCode Available	5
Hard-Label Cryptanalytic Extraction of Neural Network Models	Sep 18, 2024	Benchmarking	CodeCode Available	5
Hi-EF: Benchmarking Emotion Forecasting in Human-interaction	Jul 23, 2024	Benchmarking	CodeCode Available	5
Learning Conjoint Attentions for Graph Neural Nets	Feb 5, 2021	BenchmarkingGraph Attention	CodeCode Available	5
Graph Convolutional Networks Meet with High Dimensionality Reduction	Nov 7, 2019	BenchmarkingDimensionality Reduction	CodeCode Available	5

Show:10 25 50

← PrevPage 46 of 111Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified