SOTAVerified

Benchmarking

Papers

Showing 2661-2670 of 5548 papers

Title | Status | Hype
Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models | Code | 1
SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration | - | 0
Efficiently Quantifying Individual Agent Importance in Cooperative MARL | - | 0
EventAid: Benchmarking Event-aided Image/Video Enhancement Algorithms with Real-captured Hybrid Dataset | - | 0
Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation | - | 0
Benchmarking Deep Learning Classifiers for SAR Automatic Target Recognition | - | 0
How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation | Code | 1
Meta-survey on outlier and anomaly detection | Code | 0
Benchmarking Pretrained Vision Embeddings for Near- and Duplicate Detection in Medical Images | - | 0
EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified