| Can Model Uncertainty Function as a Proxy for Multiple-Choice Question Item Difficulty? | Jul 7, 2024 | Multiple-choice | CodeCode Available | 0 | 5 |
| A quantitative study of NLP approaches to question difficulty estimation | May 17, 2023 | MathMultiple-choice | CodeCode Available | 0 | 5 |
| Can Large Language Models Provide Security & Privacy Advice? Measuring the Ability of LLMs to Refute Misconceptions | Oct 3, 2023 | MisconceptionsMultiple-choice | CodeCode Available | 0 | 5 |
| A Joint Sequence Fusion Model for Video Question Answering and Retrieval | Aug 7, 2018 | DecoderMultiple-choice | CodeCode Available | 0 | 5 |
| Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings | Jan 15, 2024 | Knowledge Graph EmbeddingsKnowledge Graphs | CodeCode Available | 0 | 5 |
| LEAVS: An LLM-based Labeler for Abdominal CT Supervision | Mar 17, 2025 | AnatomyLarge Language Model | CodeCode Available | 0 | 5 |
| Leaving the barn door open for Clever Hans: Simple features predict LLM benchmark answers | Oct 15, 2024 | Multiple-choice | CodeCode Available | 0 | 5 |
| Length Optimization in Conformal Prediction | Jun 27, 2024 | Conformal PredictionLanguage Modeling | CodeCode Available | 0 | 5 |
| From Multiple-Choice to Extractive QA: A Case Study for English and Arabic | Apr 26, 2024 | BelebeleExtractive Question-Answering | CodeCode Available | 0 | 5 |
| AILS-NTUA at SemEval-2024 Task 9: Cracking Brain Teasers: Transformer Models for Lateral Thinking Puzzles | Apr 1, 2024 | Common Sense ReasoningMultiple-choice | CodeCode Available | 0 | 5 |