Multiple-choice

Papers

Showing 661–670 of 1107 papers

Title	Date	Tasks	Status	Hype
OpsEval: A Comprehensive IT Operations Benchmark Suite for Large Language Models	Oct 11, 2023	Hallucination, In-Context Learning	Code Available	1
BRAINTEASER: Lateral Thinking Puzzles for Large Language Models	Oct 8, 2023	Distractor Generation, Language Modelling	Code Available	1
Analyzing Zero-Shot Abilities of Vision-Language Models on Video Understanding Tasks	Oct 7, 2023	Action Recognition, Multiple-choice	Unverified	0
LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models	Oct 5, 2023	Common Sense Reasoning, Multiple-choice	Code Available	1
On the Performance of Multimodal Language Models	Oct 4, 2023	Benchmarking, Binary Classification	Unverified	0
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval	Oct 3, 2023	Articles, Decision Making	Code Available	0
Can Large Language Models Provide Security & Privacy Advice? Measuring the Ability of LLMs to Refute Misconceptions	Oct 3, 2023	Misconceptions, Multiple-choice	Code Available	0
Language Models as Knowledge Bases for Visual Word Sense Disambiguation	Oct 3, 2023	Image Captioning, Multiple-choice	Code Available	0
Fusing Models with Complementary Expertise	Oct 2, 2023	Multiple-choice, text-classification	Code Available	0
Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations	Oct 2, 2023	In-Context Learning, Instruction Following	Code Available	1

No leaderboard results yet.