SOTAVerified

Benchmarking

Papers

Showing 1701-1725 of 5548 papers

Title | Status | Hype
Benchmarking Causal Study to Interpret Large Language Models for Source Code | | 0
Benchmarking Burst Super-Resolution for Polarization Images: Noise Dataset and Analysis | | 0
A new dataset of dog breed images and a benchmark for fine-grained classification | | 0
Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment | | 0
Benchmarking BioRelEx for Entity Tagging and Relation Extraction | | 0
A Deep Q-Learning Method for Downlink Power Allocation in Multi-Cell Networks | | 0
Development details and computational benchmarking of DEPAM | | 0
Benchmarking Biopharmaceuticals Retrieval-Augmented Generation Evaluation | | 0
Benchmarking Biomedical Nested NER and Relation Extraction Models | | 0
Deep Patent Landscaping Model Using Transformer and Graph Embedding | | 0
Benchmarking Bias in Large Language Models during Role-Playing | | 0
A New Approach for Image Authentication Framework for Media Forensics Purpose | | 0
Abnormality-Driven Representation Learning for Radiology Imaging | | 0
Device Modeling Bias in ReRAM-based Neural Network Simulations | | 0
Benchmarking bias: Expanding clinical AI model card to incorporate bias reporting of social and non-social factors | | 0
An Evolutionary Algorithm For the Vehicle Routing Problem with Drones with Interceptions | | 0
Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks | | 0
Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation | | 0
An evaluation framework for comparing causal inference models | | 0
Benchmarking Azerbaijani Neural Machine Translation | | 0
Benchmarking a wide range of optimisers for solving the Fermi-Hubbard model using the variational quantum eigensolver | | 0
Benchmarking AutoML Frameworks for Disease Prediction Using Medical Claims | | 0
A deep convolutional neural network model for rapid prediction of fluvial flood inundation | | 0
Benchmarking Critical Questions Generation: A Challenging Reasoning Task for Large Language Models | | 0
Dialogue Games for Benchmarking Language Understanding: Motivation, Taxonomy, Strategy | | 0
Page 69 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified