SOTAVerified

Multiple-choice

Papers

Showing 976–1000 of 1107 papers

Title | Status | Hype
ExplanationLP: Abductive Reasoning for Explainable Science Question Answering | | 0
Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization | | 0
Explore then Determine: A GNN-LLM Synergy Framework for Reasoning over Knowledge Graph | | 0
Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement | | 0
Exploring the Capabilities of Prompted Large Language Models in Educational and Assessment Applications | | 0
Exploring the Comprehension of ChatGPT in Traditional Chinese Medicine Knowledge | | 0
How Additional Knowledge can Improve Natural Language Commonsense Question Answering? | | 0
Exposing the Limits of Video-Text Models through Contrast Sets | | 0
Towards Multilingual LLM Evaluation for Baltic and Nordic languages: A study on Lithuanian History | | 0
FactTest: Factuality Testing in Large Language Models with Finite-Sample and Distribution-Free Guarantees | | 0
Towards Multistage Design of Modular Systems | | 0
FAMULUS: Interactive Annotation and Feedback Generation for Teaching Diagnostic Reasoning | | 0
FarsEval-PKBETS: A new diverse benchmark for evaluating Persian large language models | | 0
Town Hall Debate Prompting: Enhancing Logical Reasoning in LLMs through Multi-Persona Interaction | | 0
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding | | 0
Few-Shot Image Classification and Segmentation as Visual Question Answering Using Vision-Language Models | | 0
Field-testing items using artificial intelligence: Natural language processing with transformers | | 0
Fill-in-the-Blank: A Challenging Video Understanding Evaluation Framework | | 0
Fine-tuning BERT with Focus Words for Explanation Regeneration | | 0
An Automatic Evaluation Framework for Multi-turn Medical Consultations Capabilities of Large Language Models | | 0
An Automated Multiple-Choice Question Generation Using Natural Language Processing Techniques | | 0
First Place Solution to the Multiple-choice Video QA Track of The Second Perception Test Challenge | | 0
First Token Probability Guided RAG for Telecom Question Answering | | 0
An Audio-enriched BERT-based Framework for Spoken Multiple-choice Question Answering | | 0
Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above | | 0

No leaderboard results yet.