Benchmarking

Papers

Showing 3201–3225 of 5548 papers

Title	Date	Tasks	Status
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future	Aug 5, 2024	Benchmarking, Code Generation	Unverified
From Precision to Perception: User-Centred Evaluation of Keyword Extraction Algorithms for Internet-Scale Contextual Advertising	Apr 30, 2025	Benchmarking, Computational Efficiency	Unverified
From Private to Public: Benchmarking GANs in the Context of Private Time Series Classification	Mar 28, 2023	Benchmarking, Privacy Preserving	Unverified
From Protoscience to Epistemic Monoculture: How Benchmarking Set the Stage for the Deep Learning Revolution	Apr 9, 2024	Benchmarking	Unverified
From Sound Representation to Model Robustness	Jul 27, 2020	Adversarial Attack, Adversarial Robustness	Unverified
From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems	Jun 5, 2025	Benchmarking, RAG	Unverified
From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference	Oct 4, 2023	Benchmarking, GPU	Unverified
FSD-10: A Dataset for Competitive Sports Content Analysis	Feb 9, 2020	Action Recognition, Benchmarking	Unverified
Full-scale modal testing of a Hawk T1A aircraft for benchmarking vibration-based methods	Oct 6, 2023	Benchmarking, Experimental Design	Unverified
Full-stack evaluation of Machine Learning inference workloads for RISC-V systems	May 24, 2024	Benchmarking, Deep Learning	Unverified
FunBench: Benchmarking Fundus Reading Skills of MLLMs	Mar 2, 2025	Anatomy, Benchmarking	Unverified
Functional Code Building Genetic Programming	Jun 9, 2022	Benchmarking, Program Synthesis	Unverified
Efficient Pauli channel estimation with logarithmic quantum memory	Sep 25, 2023	Benchmarking	Unverified
FuzzWiz -- Fuzzing Framework for Efficient Hardware Coverage	Oct 23, 2024	Benchmarking	Unverified
Fuzzy Knowledge Distillation from High-Order TSK to Low-Order TSK	Feb 16, 2023	Benchmarking, Knowledge Distillation	Unverified
Genetic algorithm for feature selection of EEG heterogeneous data	Mar 12, 2021	Benchmarking, EEG	Unverified
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training	Apr 30, 2025	Benchmarking	Unverified
GAN-based disentanglement learning for chest X-ray rib suppression	Oct 18, 2021	Benchmarking, Computed Tomography (CT)	Unverified
GANmut: Generating and Modifying Facial Expressions	Jun 16, 2024	Benchmarking, Diversity	Unverified
GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR	Apr 15, 2025	Benchmarking	Unverified
GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics	Mar 27, 2025	Benchmarking, Natural Language Queries	Unverified
Gauss-Ramanujan Functions: Constructions, Properties, and Applications in Communications and Signal Processing	May 27, 2025	Benchmarking	Unverified
GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing	Jun 30, 2024	Benchmarking, counterfactual	Unverified
GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases	May 25, 2024	Benchmarking, Hallucination	Unverified
Generalised Gaussian Process Latent Variable Models (GPLVM) with Stochastic Variational Inference	Feb 25, 2022	Benchmarking, Dimensionality Reduction	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified