SOTAVerified

Multiple-choice

Papers

Showing 1031-1040 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| Transliteration: A Simple Technique For Improving Multilingual Language Modeling | | 0 |
| True Detective: A Deep Abductive Reasoning Benchmark Undoable for GPT-3 and Challenging for GPT-4 | | 0 |
| GRAF: Graph Retrieval Augmented by Facts for Romanian Legal Multi-Choice Question Answering | | 0 |
| GraphITE: Estimating Individual Effects of Graph-structured Treatments | | 0 |
| Graph-Structured Representations for Visual Question Answering | | 0 |
| Is There No Such Thing as a Bad Question? H4R: HalluciBot For Ratiocination, Rewriting, Ranking, and Routing | | 0 |
| Hanfu-Bench: A Multimodal Benchmark on Cross-Temporal Cultural Understanding and Transcreation | | 0 |
| HANS, are you clever? Clever Hans Effect Analysis of Neural Systems | | 0 |
| HardML: A Benchmark For Evaluating Data Science And Machine Learning knowledge and reasoning in AI | | 0 |
| HashEvict: A Pre-Attention KV Cache Eviction Strategy using Locality-Sensitive Hashing | | 0 |

No leaderboard results yet.