SOTAVerified

Benchmarking

Papers

Showing 1281-1290 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Benchmarking the Generation of Fact Checking Explanations | Code | 1 |
| Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA | Code | 1 |
| A framework for benchmarking clustering algorithms | Code | 1 |
| HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media | Code | 1 |
| AirSim Drone Racing Lab | Code | 1 |
| Benchmarking the Abilities of Large Language Models for RDF Knowledge Graph Creation and Comprehension: How Well Do LLMs Speak Turtle? | Code | 1 |
| Arctique: An artificial histopathological dataset unifying realism and controllability for uncertainty quantification | Code | 1 |
| A Comprehensive Overview of Large Language Models | Code | 1 |
| Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs | Code | 1 |
| A framework for benchmarking class-out-of-distribution detection and its application to ImageNet | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |