Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2251–2275 of 5548 papers

Title	Date	Tasks	Status	Score
Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework	Feb 20, 2025	BenchmarkingQuestion Answering	CodeCode Available	5
HATE-ITA: New Baselines for Hate Speech Detection in Italian	Jul 1, 2022	BenchmarkingHate Speech Detection	CodeCode Available	5
HERMES: Holographic Equivariant neuRal network model for Mutational Effect and Stability prediction	Jul 9, 2024	Benchmarking	CodeCode Available	5
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program	Apr 9, 2025	Benchmarking	CodeCode Available	5
A Seq2Seq approach to Symbolic Regression	Oct 17, 2020	Benchmarkingregression	CodeCode Available	5
Harnessing Orthogonality to Train Low-Rank Neural Networks	Jan 16, 2024	Benchmarking	CodeCode Available	5
High-Quality, ROS Compatible Video Encoding and Decoding for High-Definition Datasets	Aug 1, 2024	BenchmarkingSimultaneous Localization and Mapping	CodeCode Available	5
Benchmarking Multilabel Topic Classification in the Kyrgyz Language	Aug 30, 2023	BenchmarkingClassification	CodeCode Available	5
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning	Jun 18, 2024	BenchmarkingWorld Knowledge	CodeCode Available	5
A Continuous Optimisation Benchmark Suite from Neural Network Regression	Sep 12, 2021	BenchmarkingEvolutionary Algorithms	CodeCode Available	5
Hard-Label Cryptanalytic Extraction of Neural Network Models	Sep 18, 2024	Benchmarking	CodeCode Available	5
Benchmarking multi-component signal processing methods in the time-frequency plane	Feb 13, 2024	BenchmarkingDenoising	CodeCode Available	5
HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios	Dec 21, 2024	Benchmarking	CodeCode Available	5
Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces	May 31, 2023	BenchmarkingRecommendation Systems	CodeCode Available	5
Hardware Aware Neural Network Architectures using FbNet	Jun 17, 2019	BenchmarkingNeural Architecture Search	CodeCode Available	5
Aggregated Attributions for Explanatory Analysis of 3D Segmentation Models	Jul 23, 2024	BenchmarkingSegmentation	CodeCode Available	5
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo	Mar 14, 2019	BenchmarkingOpenAI Gym	CodeCode Available	5
Benchmarking MOEAs for solving continuous multi-objective RL problems	May 19, 2025	BenchmarkingEvolutionary Algorithms	CodeCode Available	5
Benchmarking Model-Based Reinforcement Learning	Jul 3, 2019	Benchmarkingmodel	CodeCode Available	5
Guidelines and Benchmarks for Deployment of Deep Learning Models on Smartphones as Real-Time Apps	Jan 8, 2019	BenchmarkingCPU	CodeCode Available	5
Benchmarking Misuse Mitigation Against Covert Adversaries	Jun 6, 2025	BenchmarkingLanguage Modeling	CodeCode Available	5
Benchmarking missing-values approaches for predictive models on health databases	Feb 17, 2022	AttributeBenchmarking	CodeCode Available	5
Harmonization Benchmarking Tool for Neuroimaging Datasets	Nov 15, 2022	BenchmarkingDiffusion MRI	CodeCode Available	5
Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus	Oct 8, 2023	BenchmarkingMachine Translation	CodeCode Available	5
Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible	Jul 10, 2025	Adversarial AttackBenchmarking	CodeCode Available	5

Show:10 25 50

← PrevPage 91 of 222Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified