SOTAVerified

Benchmarking

Papers

Showing 3581–3590 of 5548 papers

Title | Status | Hype
Benchmarking Inference Performance of Deep Learning Models on Analog Devices | - | 0
MHQA: A Diverse, Knowledge Intensive Mental Health Question Answering Challenge for Language Models | - | 0
MHTS: Multi-Hop Tree Structure Framework for Generating Difficulty-Controllable QA Datasets for RAG Evaluation | - | 0
Benchmarking Individual Tree Mapping with Sub-meter Imagery | - | 0
Microtask crowdsourcing for disease mention annotation in PubMed abstracts | - | 0
Microvasculature Segmentation in Human BioMolecular Atlas Program (HuBMAP) | - | 0
Benchmarking Image Transformers for Prostate Cancer Detection from Ultrasound Data | - | 0
Benchmarking Image Sensors Under Adverse Weather Conditions for Autonomous Driving | - | 0
MileBench: Benchmarking MLLMs in Long Context | - | 0
Addressing the Real-world Class Imbalance Problem in Dermatology | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified