Benchmarking

Papers

Showing 4801–4825 of 5548 papers

Title	Date	Tasks	Status
FR-MRInet: A Deep Convolutional Encoder-Decoder for Brain Tumor Segmentation with Relu-RGB and Sliding-window	Jul 26, 2018	Benchmarking, Brain Tumor Segmentation	Code Available
AdamZ: An Enhanced Optimisation Method for Neural Network Training	Nov 22, 2024	Benchmarking	Code Available
MLPerf Training Benchmark	Oct 2, 2019	Benchmarking, BIG-bench Machine Learning	Code Available
Theory-inspired Parameter Control Benchmarks for Dynamic Algorithm Configuration	Feb 7, 2022	Benchmarking, Evolutionary Algorithms	Code Available
Benchmarking Spurious Bias in Few-Shot Image Classifiers	Sep 4, 2024	Attribute, Benchmarking	Code Available
FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering	May 27, 2025	Benchmarking, Question Answering	Code Available
FORLORN: A Framework for Comparing Offline Methods and Reinforcement Learning for Optimization of RAN Parameters	Sep 8, 2022	Benchmarking, continuous-control	Code Available
MMCoQA: Conversational Question Answering over Text, Tables, and Images	May 1, 2022	Benchmarking, Conversational Question Answering	Code Available
Forecasting time series with constraints	Feb 14, 2025	Additive models, Benchmarking	Code Available
Action-conditioned Benchmarking of Robotic Video Prediction Models: a Comparative Study	Oct 7, 2019	Benchmarking, Prediction	Code Available
Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges	May 16, 2025	Benchmarking, State Estimation	Code Available
Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling	Nov 21, 2024	Articles, Benchmarking	Code Available
Benchmarking Single Image Dehazing and Beyond	Dec 12, 2017	Benchmarking, Image Dehazing	Code Available
VRKitchen2.0-IndoorKit: A Tutorial for Augmented Indoor Scene Building in Omniverse	Jun 23, 2022	Benchmarking, Indoor Scene Synthesis	Code Available
One Law, Many Languages: Benchmarking Multilingual Legal Reasoning for Judicial Support	Jun 15, 2023	Benchmarking, Information Retrieval	Code Available
Forecasting Across Time Series Databases using Recurrent Neural Networks on Groups of Similar Series: A Clustering Approach	Oct 9, 2017	Benchmarking, Clustering	Code Available
fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models	Aug 8, 2022	Benchmarking, State Space Models	Code Available
Scaling and Benchmarking Self-Supervised Visual Representation Learning	May 3, 2019	Benchmarking, object-detection	Code Available
Scaling Compute Is Not All You Need for Adversarial Robustness	Dec 20, 2023	Adversarial Robustness, All	Code Available
Scaling Up Resonate-and-Fire Networks for Fast Deep Learning	Apr 1, 2025	Benchmarking, Deep Learning	Code Available
Universal Music Representations? Evaluating Foundation Models on World Music Corpora	Jun 20, 2025	Benchmarking, Few-Shot Learning	Code Available
MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms	Feb 21, 2024	Benchmarking, Hate Speech Detection	Code Available
Fluorescence Reference Target Quantitative Analysis Library	Apr 22, 2025	Benchmarking	Code Available
FLsim: A Modular and Library-Agnostic Simulation Framework for Federated Learning	Jul 15, 2025	Benchmarking, Federated Learning	Code Available
FlowCyt: A Comparative Study of Deep Learning Approaches for Multi-Class Classification in Flow Cytometry Benchmarking	Feb 28, 2024	Benchmarking, Inductive Learning	Code Available

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified