Multiple-choice

Papers


Showing 251–275 of 1107 papers

Title	Date	Tasks	Status	Hype
HashEvict: A Pre-Attention KV Cache Eviction Strategy using Locality-Sensitive Hashing	Dec 13, 2024	GPU, Multiple-choice	Unverified	0
A multimodal dataset for understanding the impact of mobile phones on remote online virtual education	Dec 13, 2024	EEG, Head Pose Estimation	Code Available	0
LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering	Dec 13, 2024	Few-Shot Learning, Knowledge Distillation	Unverified	0
Does Multiple Choice Have a Future in the Age of Generative AI? A Posttest-only RCT	Dec 13, 2024	Multiple-choice	Code Available	0
Neptune: The Long Orbit to Benchmarking Long Video Understanding	Dec 12, 2024	Benchmarking, Multimodal Reasoning	Code Available	2
Filter-then-Generate: Large Language Models with Structure-Text Adapter for Knowledge Graph Completion	Dec 12, 2024	Hallucination, Knowledge Graph Completion	Code Available	1
MM-PoE: Multiple Choice Reasoning via. Process of Elimination using Multi-Modal Models	Dec 10, 2024	Multiple-choice, Question Answering	Code Available	0
Evaluating and Mitigating Social Bias for Large Language Models in Open-ended Settings	Dec 9, 2024	Multiple-choice	Code Available	0
ACQ: A Unified Framework for Automated Programmatic Creativity in Online Advertising	Dec 9, 2024	Multiple-choice, Multi-Task Learning	Unverified	0
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor	Dec 8, 2024	Misconceptions, Multiple-choice	Code Available	0
MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects	Dec 6, 2024	2k, Anomaly Detection	Unverified	0
Establishing Task Scaling Laws via Compute-Efficient Model Ladders	Dec 5, 2024	Language Modeling, Language Modelling	Unverified	0
GRAF: Graph Retrieval Augmented by Facts for Romanian Legal Multi-Choice Question Answering	Dec 5, 2024	Information Retrieval, Multiple-choice	Unverified	0
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?	Dec 3, 2024	Multiple-choice	Code Available	1
SailCompass: Towards Reproducible and Robust Evaluation for Southeast Asian Languages	Dec 2, 2024	Multiple-choice	Code Available	1
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models	Dec 2, 2024	MMLU, Multiple-choice	Code Available	0
The use of large language models to enhance cancer clinical trial educational materials	Dec 2, 2024	Misinformation, Multiple-choice	Unverified	0
Unlocking Video-LLM via Agent-of-Thoughts Distillation	Dec 2, 2024	Language Modeling, Language Modelling	Unverified	0
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages	Dec 1, 2024	ARC, Multiple-choice	Unverified	0
VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception of Geometric Information	Dec 1, 2024	Multiple-choice	Code Available	1
KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting	Dec 1, 2024	Multiple-choice, Multiple Choice Question Answering (MCQA)	Code Available	0
Cognitive Biases in Large Language Models: A Survey and Mitigation Experiments	Nov 30, 2024	Multiple-choice	Unverified	0
Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark	Nov 29, 2024	Benchmarking, Grounded Video Question Answering	Unverified	0
Applying IRT to Distinguish Between Human and Generative AI Responses to Multiple-Choice Assessments	Nov 28, 2024	Multiple-choice	Unverified	0
Sparse Attention Vectors: Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers	Nov 28, 2024	Image Captioning, image-classification	Unverified	0

