SOTAVerified

Multiple-choice

Papers

Showing 91-100 of 1107 papers

Title | Status | Hype
FaceXBench: Evaluating Multimodal LLMs on Face Understanding | Code | 1
Filter-then-Generate: Large Language Models with Structure-Text Adapter for Knowledge Graph Completion | Code | 1
Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations | Code | 1
HCQA @ Ego4D EgoSchema Challenge 2024 | Code | 1
AdaLoGN: Adaptive Logic Graph Network for Reasoning-Based Machine Reading Comprehension | Code | 1
Boosting Healthcare LLMs Through Retrieved Context | Code | 1
Evaluating language models as risk scores | Code | 1
Explicit Planning Helps Language Models in Logical Reasoning | Code | 1
GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing | Code | 1
BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages | Code | 1

No leaderboard results yet.