SOTAVerified

Benchmarking

Papers

Showing 2476–2500 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Benchmarking SMT Performance for Farsi Using the TEP++ Corpus | | 0 |
| A Two-Step Framework for Multi-Material Decomposition of Dual Energy Computed Tomography from Projection Domain | | 0 |
| Benchmarking Smoothness and Reducing High-Frequency Oscillations in Continuous Control Policies | | 0 |
| A Two-Stage Neural-Filter Pareto Front Extractor and the need for Benchmarking | | 0 |
| Benchmarking Single-Image Reflection Removal Algorithms | | 0 |
| A tutorial on multi-view autoencoders using the multi-view-AE library | | 0 |
| Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking | | 0 |
| Benchmarking simulated and physical quantum processing units using quantum and hybrid algorithms | | 0 |
| A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking | | 0 |
| Benchmarking Shadow Removal for Facial Landmark Detection and Beyond | | 0 |
| A Large-scale Class-level Benchmark Dataset for Code Generation with LLMs | | 0 |
| Benchmarking Sensitivity of Continual Graph Learning for Skeleton-Based Action Recognition | | 0 |
| GenSpace: Benchmarking Spatially-Aware Image Generation | | 0 |
| A Large-Scale Analysis on Self-Supervised Video Representation Learning | | 0 |
| A Large-scale Benchmark on Geological Fault Delineation Models: Domain Shift, Training Dynamics, Generalizability, Evaluation and Inferential Behavior | | 0 |
| On the Evaluation of Engineering Artificial General Intelligence | | 0 |
| Genicious: Contextual Few-shot Prompting for Insights Discovery | | 0 |
| GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks | | 0 |
| Benchmarking Scientific Image Forgery Detectors | | 0 |
| Benchmarking Scene Text Recognition in Devanagari, Telugu and Malayalam | | 0 |
| Benchmarking Sample Selection Strategies for Batch Reinforcement Learning | | 0 |
| A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking | | 0 |
| Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models | | 0 |
| GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models | | 0 |
| GeoGebra Tools with Proof Capabilities | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |