SOTAVerified

Benchmarking

Papers

Showing 3751-3800 of 5548 papers

Title | Status | Hype
Q2SAR: A Quantum Multiple Kernel Learning Approach for Drug Discovery | | 0
Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs | | 0
QDA^2: A principled approach to automatically annotating charge stability diagrams | | 0
QHackBench: Benchmarking Large Language Models for Quantum Code Generation Using PennyLane Hackathon Challenges | | 0
QPO: Query-dependent Prompt Optimization via Multi-Loop Offline Reinforcement Learning | | 0
QSAM-Net: Rain streak removal by quaternion neural network with self-attention module | | 0
Decoding Intelligence: A Framework for Certifying Knowledge Comprehension in LLMs | | 0
QualBench: Benchmarking Chinese LLMs with Localized Professional Qualifications for Vertical Domain Evaluation | | 0
Qualitative Insights Tool (QualIT): LLM Enhanced Topic Modeling | | 0
Quality Assessment of Low Light Restored Images: A Subjective Study and an Unsupervised Model | | 0
Quality Assured: Rethinking Annotation Strategies in Imaging AI | | 0
Quality at the Tail of Machine Learning Inference | | 0
QuantBench: Benchmarking AI Methods for Quantitative Investment | | 0
Quantifying Social Biases Using Templates is Unreliable | | 0
Quantifying the Complexity of Standard Benchmarking Datasets for Long-Term Human Trajectory Prediction | | 0
Quantifying the Impact of Boundary Constraint Handling Methods on Differential Evolution | | 0
Quantitative Benchmarking of Anomaly Detection Methods in Digital Pathology | | 0
Quantitative evaluation of brain-inspired vision sensors in high-speed robotic perception | | 0
Quantitative Metrics for Benchmarking Medical Image Harmonization | | 0
Benchmarking Bayesian neural networks and evaluation metrics for regression tasks | | 0
Quantum-Assisted Learning of Hardware-Embedded Probabilistic Graphical Models | | 0
Quantum classification of the MNIST dataset with Slow Feature Analysis | | 0
Quantum Cognitively Motivated Decision Fusion for Video Sentiment Analysis | | 0
Quantum Kernel Methods under Scrutiny: A Benchmarking Study | | 0
Quantum Long Short-Term Memory (QLSTM) vs Classical LSTM in Time Series Forecasting: A Comparative Study in Solar Power Forecasting | | 0
Quantum Kernel Learning for Small Dataset Modeling in Semiconductor Fabrication: Application to Ohmic Contact | | 0
Quantum-tunnelling deep neural network for optical illusion recognition | | 0
QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture | | 0
R2C2-Coder: Enhancing and Benchmarking Real-world Repository-level Code Completion Abilities of Code Large Language Models | | 0
R2H: Building Multimodal Navigation Helpers that Respond to Help Requests | | 0
R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation | | 0
R3L: Connecting Deep Reinforcement Learning to Recurrent Neural Networks for Image Denoising via Residual Recovery | | 0
RadFusion: Benchmarking Performance and Fairness for Multimodal Pulmonary Embolism Detection from CT and EHR | | 0
RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems | | 0
RAG-Reward: Optimizing RAG with Reward Modeling and RLHF | | 0
Rail-5k: a Real-World Dataset for Rail Surface Defects Detection | | 0
RAN-GNNs: breaking the capacity limits of graph neural networks | | 0
Ransomware Detection Using Machine Learning in the Linux Kernel | | 0
RayFronts: Open-Set Semantic Ray Frontiers for Online Scene Understanding and Exploration | | 0
RBoard: A Unified Platform for Reproducible and Reusable Recommender System Benchmarks | | 0
RCC-GAN: Regularized Compound Conditional GAN for Large-Scale Tabular Data Synthesis | | 0
RDBench: ML Benchmark for Relational Databases | | 0
RD-Suite: A Benchmark for Ranking Distillation | | 0
Reactor Mk.1 performances: MMLU, HumanEval and BBH test results | | 0
RealCause: Realistic Causal Inference Benchmarking | | 0
Realistic Evaluation of Test-Time Adaptation Algorithms: Unsupervised Hyperparameter Selection | | 0
Realistic Hair Simulation Using Image Blending | | 0
Realistic Video Summarization through VISIOCITY: A New Benchmark and Evaluation Framework | | 0
Real Time Egocentric Object Segmentation: THU-READ Labeling and Benchmarking Results | | 0
Real-time Kinematic Ground Truth for the Oxford RobotCar Dataset | | 0
Page 76 of 111

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified