| Fill-in-the-Blank: A Challenging Video Understanding Evaluation Framework | Nov 16, 2021 | Multiple-choice, Question Answering | —Unverified | 0 |
| Fine-tuning BERT with Focus Words for Explanation Regeneration | Dec 1, 2020 | Explanation Generation, Multiple-choice | —Unverified | 0 |
| An Automatic Evaluation Framework for Multi-turn Medical Consultations Capabilities of Large Language Models | Sep 5, 2023 | Multiple-choice | —Unverified | 0 |
| An Automated Multiple-Choice Question Generation Using Natural Language Processing Techniques | Mar 26, 2021 | Multiple-choice, Question Generation | —Unverified | 0 |
| First Place Solution to the Multiple-choice Video QA Track of The Second Perception Test Challenge | Sep 20, 2024 | Multiple-choice, Question Answering | —Unverified | 0 |
| First Token Probability Guided RAG for Telecom Question Answering | Jan 11, 2025 | Multiple-choice, Multiple Choice Question Answering (MCQA) | —Unverified | 0 |
| An Audio-enriched BERT-based Framework for Spoken Multiple-choice Question Answering | May 25, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above | Feb 19, 2025 | All, Multiple-choice | —Unverified | 0 |
| Training Optimus Prime, M.D.: Generating Medical Certification Items by Fine-Tuning OpenAI's gpt2 Transformer Model | Aug 23, 2019 | Articles, Language Modeling | —Unverified | 0 |
| ForecastQA: A Question Answering Challenge for Event Forecasting with Temporal Text Data | May 2, 2020 | Knowledge Graphs, Language Modelling | —Unverified | 0 |