Benchmarking

Papers

Showing 3776–3800 of 5548 papers

Title	Date	Tasks	Status
Quantum Kernel Learning for Small Dataset Modeling in Semiconductor Fabrication: Application to Ohmic Contact	Sep 17, 2024	Benchmarking, Quantum Machine Learning	Unverified
Quantum-tunnelling deep neural network for optical illusion recognition	Jun 26, 2024	Autonomous Vehicles, Benchmarking	Unverified
QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture	Jan 3, 2025	Benchmarking, Question Answering	Unverified
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models	Jun 3, 2024	Benchmarking, Code Completion	Unverified
R2H: Building Multimodal Navigation Helpers that Respond to Help Requests	May 23, 2023	Benchmarking, Language Modeling	Unverified
R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation	May 29, 2025	Benchmarking, Image Generation	Unverified
R3L: Connecting Deep Reinforcement Learning to Recurrent Neural Networks for Image Denoising via Residual Recovery	Jul 12, 2021	Benchmarking, Deep Reinforcement Learning	Unverified
RadFusion: Benchmarking Performance and Fairness for Multimodal Pulmonary Embolism Detection from CT and EHR	Nov 23, 2021	Benchmarking, Computed Tomography (CT)	Unverified
RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems	Jun 25, 2024	Benchmarking, RAG	Unverified
RAG-Reward: Optimizing RAG with Reward Modeling and RLHF	Jan 22, 2025	Benchmarking, Hallucination	Unverified
Rail-5k: a Real-World Dataset for Rail Surface Defects Detection	Jun 28, 2021	4k, Benchmarking	Unverified
RAN-GNNs: breaking the capacity limits of graph neural networks	Mar 29, 2021	Attribute, Benchmarking	Unverified
Ransomware Detection Using Machine Learning in the Linux Kernel	Sep 10, 2024	Benchmarking	Unverified
RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration	Apr 9, 2025	3D Semantic Segmentation, Benchmarking	Unverified
RBoard: A Unified Platform for Reproducible and Reusable Recommender System Benchmarks	Sep 9, 2024	Benchmarking, Click-Through Rate Prediction	Unverified
RCC-GAN: Regularized Compound Conditional GAN for Large-Scale Tabular Data Synthesis	May 24, 2022	Benchmarking, Generative Adversarial Network	Unverified
RDBench: ML Benchmark for Relational Databases	Oct 25, 2023	Benchmarking	Unverified
RD-Suite: A Benchmark for Ranking Distillation	Jun 7, 2023	Benchmarking	Unverified
Reactor Mk.1 performances: MMLU, HumanEval and BBH test results	Jun 15, 2024	Benchmarking, HumanEval	Unverified
RealCause: Realistic Causal Inference Benchmarking	Nov 30, 2020	Benchmarking, Causal Inference	Unverified
Realistic Evaluation of Test-Time Adaptation Algorithms: Unsupervised Hyperparameter Selection	Jul 19, 2024	Benchmarking, Model Selection	Unverified
Realistic Hair Simulation Using Image Blending	Apr 19, 2019	Benchmarking, Data Augmentation	Unverified
Realistic Video Summarization through VISIOCITY: A New Benchmark and Evaluation Framework	Jul 29, 2020	Benchmarking, Video Summarization	Unverified
Real Time Egocentric Object Segmentation: THU-READ Labeling and Benchmarking Results	Jun 9, 2021	Benchmarking, Mixed Reality	Unverified
Real-time Kinematic Ground Truth for the Oxford RobotCar Dataset	Feb 24, 2020	Benchmarking	Unverified

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified