SOTAVerified

Multiple-choice

Papers

Showing 801-850 of 1107 papers

Title | Status | Hype
AI-based Arabic Language and Speech Tutor | | 0
MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence | | 0
VCEval: Rethinking What is a Good Educational Video and How to Automatically Evaluate It | | 0
Modeling of Item-Difficulty for Ontology-based MCQs | | 0
Monty Hall and Optimized Conformal Prediction to Improve Decision-Making with LLMs | | 0
More Robots are Coming: Large Multimodal Models (ChatGPT) can Solve Visually Diverse Images of Parsons Problems | | 0
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering | | 0
Mounting Video Metadata on Transformer-based Language Model for Open-ended Video Question Answering | | 0
AI and Machine Learning for Next Generation Science Assessments | | 0
AGReE: A system for generating Automated Grammar Reading Exercises | | 0
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models | | 0
MR. Judge: Multimodal Reasoner as a Judge | | 0
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment | | 0
MuLTI: Efficient Video-and-Language Understanding with Text-Guided MultiWay-Sampler and Multiple Choice Modeling | | 0
Multilingual CALL Framework for Automatic Language Exercise Generation from Free Text | | 0
Multi-Modal Retrieval Augmentation for Open-Ended and Knowledge-Intensive Video Question Answering | | 0
Multiple Choice Learning for Efficient Speech Separation with Many Speakers | | 0
Multiple Choice Learning: Learning to Produce Multiple Structured Outputs | | 0
Multiple Choice Question Corpus Analysis for Distractor Characterization | | 0
Multiple-Choice Question Generation: Towards an Automated Assessment Framework | | 0
Multiple-Choice Question Generation Using Large Language Models: Methodology and Educator Insights | | 0
Multiple Choice Question Generation Utilizing An Ontology | | 0
Versatile Multiple Choice Learning and Its Application to Vision Computing | | 0
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong | | 0
Multi-source Meta Transfer for Low Resource Multiple-Choice Question Answering | | 0
Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension | | 0
A Graph-Guided Reasoning Approach for Open-ended Commonsense Question Answering | | 0
MV-MATH: Evaluating Multimodal Math Reasoning in Multi-Visual Contexts | | 0
My Answer Is NOT 'Fair': Mitigating Social Bias in Vision-Language Models via Fair and Biased Residuals | | 0
Narrative Embedding: Re-Contextualization Through Attention | | 0
VersaVid-R1: A Versatile Video Understanding and Reasoning Model from Question Answering to Captioning Tasks | | 0
NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects? | | 0
AgMMU: A Comprehensive Agricultural Multimodal Understanding and Reasoning Benchmark | | 0
Network-based Representations and Dynamic Discrete Choice Models for Multiple Discrete Choice Analysis | | 0
WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning | | 0
VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation | | 0
NEWSKVQA: Knowledge-Aware News Video Question Answering | | 0
Video Instruction Tuning With Synthetic Data | | 0
None of the Above, Less of the Right: Parallel Patterns between Humans and LLMs on Multi-Choice Questions Answering | | 0
None of the Others: a General Technique to Distinguish Reasoning from Memorization in Multiple-Choice LLM Evaluation Benchmarks | | 0
No Task Left Behind: Multi-Task Learning of Knowledge Tracing and Option Tracing for Better Student Assessment | | 0
Note on Combinatorial Engineering Frameworks for Hierarchical Modular Systems | | 0
Note on Evolution and Forecasting of Requirements: Communications Example | | 0
Novel-WD: Exploring acquisition of Novel World Knowledge in LLMs Using Prefix-Tuning | | 0
NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models | | 0
Objective quantification of mood states using large language models | | 0
OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities | | 0
OLMES: A Standard for Language Model Evaluations | | 0
OmniEval: A Benchmark for Evaluating Omni-modal Models with Visual, Auditory, and Textual Inputs | | 0
Online Joint Bid/Daily Budget Optimization of Internet Advertising Campaigns | | 0

No leaderboard results yet.