SOTAVerified

Benchmarking

Papers

Showing 3776–3800 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Quantum Kernel Learning for Small Dataset Modeling in Semiconductor Fabrication: Application to Ohmic Contact | | 0 |
| Quantum-tunnelling deep neural network for optical illusion recognition | | 0 |
| QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture | | 0 |
| R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models | | 0 |
| R2H: Building Multimodal Navigation Helpers that Respond to Help Requests | | 0 |
| R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation | | 0 |
| R3L: Connecting Deep Reinforcement Learning to Recurrent Neural Networks for Image Denoising via Residual Recovery | | 0 |
| RadFusion: Benchmarking Performance and Fairness for Multimodal Pulmonary Embolism Detection from CT and EHR | | 0 |
| RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems | | 0 |
| RAG-Reward: Optimizing RAG with Reward Modeling and RLHF | | 0 |
| Rail-5k: a Real-World Dataset for Rail Surface Defects Detection | | 0 |
| RAN-GNNs: breaking the capacity limits of graph neural networks | | 0 |
| Ransomware Detection Using Machine Learning in the Linux Kernel | | 0 |
| RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration | | 0 |
| RBoard: A Unified Platform for Reproducible and Reusable Recommender System Benchmarks | | 0 |
| RCC-GAN: Regularized Compound Conditional GAN for Large-Scale Tabular Data Synthesis | | 0 |
| RDBench: ML Benchmark for Relational Databases | | 0 |
| RD-Suite: A Benchmark for Ranking Distillation | | 0 |
| Reactor Mk.1 performances: MMLU, HumanEval and BBH test results | | 0 |
| RealCause: Realistic Causal Inference Benchmarking | | 0 |
| Realistic Evaluation of Test-Time Adaptation Algorithms: Unsupervised Hyperparameter Selection | | 0 |
| Realistic Hair Simulation Using Image Blending | | 0 |
| Realistic Video Summarization through VISIOCITY: A New Benchmark and Evaluation Framework | | 0 |
| Real Time Egocentric Object Segmentation: THU-READ Labeling and Benchmarking Results | | 0 |
| Real-time Kinematic Ground Truth for the Oxford RobotCar Dataset | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |