SOTAVerified

Benchmarking

Papers

Showing 3261-3270 of 5548 papers

Title | Status | Hype
TIME: Temporal-sensitive Multi-dimensional Instruction Tuning and Benchmarking for Video-LLMs | | 0
Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning Pipelines | | 0
Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning | | 0
Understanding Large Language Models in Your Pockets: Performance Study on COTS Mobile Devices | | 0
Benchmarking of LLM Detection: Comparing Two Competing Approaches | | 0
Large Language Models are Null-Shot Learners | | 0
Large Language Models are Few-Shot Clinical Information Extractors | | 0
Large Language Models as Automated Aligners for benchmarking Vision-Language Models | | 0
Benchmarking of Lightweight Deep Learning Architectures for Skin Cancer Classification using ISIC 2017 Dataset | | 0
Adversarially Training for Audio Classifiers | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified