| MaterialBENCH: Evaluating College-Level Materials Science Problem-Solving Abilities of Large Language Models | Sep 5, 2024 | Multiple-choice | —Unverified | 0 |
| Math Multiple Choice Question Generation via Human-Large Language Model Collaboration | May 1, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| MCL-GAN: Generative Adversarial Networks with Multiple Specialized Discriminators | Jul 15, 2021 | Generative Adversarial Network, Multiple-choice | —Unverified | 0 |
| MCQA-Eval: Efficient Confidence Evaluation in NLG with Gold-Standard Correctness Labels | Feb 20, 2025 | Multiple-choice, Text Generation | —Unverified | 0 |
| MCS-SQL: Leveraging Multiple Prompts and Multiple-Choice Selection For Text-to-SQL Generation | May 13, 2024 | In-Context Learning, Multiple-choice | —Unverified | 0 |
| Measuring Semantic Similarity by Latent Relational Analysis | Aug 10, 2005 | Multiple-choice, Semantic Similarity | —Unverified | 0 |
| MedGPT: Medical Concept Prediction from Clinical Narratives | Jul 7, 2021 | Multiple-choice, named-entity-recognition | —Unverified | 0 |
| MedGUIDE: Benchmarking Clinical Decision-Making in Large Language Models | May 16, 2025 | Benchmarking, Decision Making | —Unverified | 0 |
| MeDiaQA: A Question Answering Dataset on Medical Dialogues | Aug 18, 2021 | Multiple-choice, Question Answering | —Unverified | 0 |
| MedKP: Medical Dialogue with Knowledge Enhancement and Clinical Pathway Encoding | Mar 11, 2024 | Dialogue Generation, Multiple-choice | —Unverified | 0 |