Multiple-choice

Papers

Title	Date	Tasks	Status
Not All Options Are Created Equal: Textual Option Weighting for Token-Efficient LLM-Based Knowledge Tracing	Oct 14, 2024	All, Binary Classification	Unverified
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models	Oct 13, 2024	Multiple-choice	Unverified
LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models	Oct 13, 2024	Hallucination, Hallucination Evaluation	Code Available
The Future of Learning in the Age of Generative AI: Automated Question Generation and Assessment with Large Language Models	Oct 12, 2024	Misconceptions, Multiple-choice	Unverified
NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models	Oct 11, 2024	Multiple-choice, TruthfulQA	Code Available
Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models	Oct 10, 2024	Conformal Prediction, Language Modeling	Unverified
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models	Oct 10, 2024	Multiple-choice, Question Answering	Unverified
TVBench: Redesigning Video-Language Evaluation	Oct 10, 2024	Multiple-choice, Open-Ended Question Answering	Unverified
Answering Questions in Stages: Prompt Chaining for Contract QA	Oct 9, 2024	Multiple-choice	Unverified
Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning	Oct 9, 2024	Hallucination, Multiple-choice	Code Available
ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition	Oct 8, 2024	Action Recognition, Multiple-choice	Unverified
ACPBench: Reasoning about Action, Change, and Planning	Oct 8, 2024	Multiple-choice	Unverified
Plausibly Problematic Questions in Multiple-Choice Benchmarks for Commonsense Reasoning	Oct 6, 2024	Multiple-choice	Code Available
Listening to the Wise Few: Select-and-Copy Attention Heads for Multiple-Choice QA	Oct 3, 2024	Multiple-choice, Question Answering	Unverified
Video Instruction Tuning With Synthetic Data	Oct 3, 2024	3D Question Answering (3D-QA)	Unverified
DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models	Oct 2, 2024	Multiple-choice, parameter-efficient fine-tuning	Code Available
Introducing Flexible Monotone Multiple Choice Item Response Theory Models and Bit Scales	Oct 2, 2024	Multiple-choice	Code Available
Language Enhanced Model for Eye (LEME): An Open-Source Ophthalmology-Specific Large Language Model	Oct 1, 2024	All, Language Modeling	Unverified
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling	Sep 30, 2024	Language Modeling, Language Modelling	Unverified
Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs	Sep 30, 2024	Benchmarking, Multiple-choice	Unverified
Mitigating Selection Bias with Node Pruning and Auxiliary Options	Sep 27, 2024	Multiple-choice, Selection bias	Unverified
DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking	Sep 26, 2024	Distractor Generation, Multiple-choice	Code Available
DARE: Diverse Visual Question Answering with Robustness Evaluation	Sep 26, 2024	image-classification, Image Classification	Unverified
LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ	Sep 25, 2024	Chatbot, GSM8K	Unverified
RISCORE: Enhancing In-Context Riddle Solving in Language Models through Context-Reconstructed Example Augmentation	Sep 24, 2024	Multiple-choice, Sentence	Unverified
