| MedKP: Medical Dialogue with Knowledge Enhancement and Clinical Pathway Encoding | Mar 11, 2024 | Dialogue GenerationMultiple-choice | —Unverified | 0 | 0 |
| A Method for Building a Commonsense Inference Dataset based on Basic Events | Nov 1, 2020 | Multiple-choiceTransfer Learning | —Unverified | 0 | 0 |
| Unveiling Cultural Blind Spots: Analyzing the Limitations of mLLMs in Procedural Text Comprehension | Feb 20, 2025 | Multiple-choiceReading Comprehension | —Unverified | 0 | 0 |
| Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning | Feb 27, 2025 | MathMedical Question Answering | —Unverified | 0 | 0 |
| AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning | May 16, 2024 | Multiple-choiceQuestion Answering | —Unverified | 0 | 0 |
| UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces | Mar 8, 2025 | Benchmarkingcounterfactual | —Unverified | 0 | 0 |
| AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models | Jun 13, 2024 | Multiple-choice | —Unverified | 0 | 0 |
| Meta Sequence Learning for Generating Adequate Question-Answer Pairs | Oct 4, 2020 | Multiple-choicenamed-entity-recognition | —Unverified | 0 | 0 |
| MHQA: A Diverse, Knowledge Intensive Mental Health Question Answering Challenge for Language Models | Feb 21, 2025 | BenchmarkingDiagnostic | —Unverified | 0 | 0 |
| MIBench: Evaluating Multimodal Large Language Models over Multiple Images | Jul 21, 2024 | In-Context LearningMultiple-choice | —Unverified | 0 | 0 |