| Chain-of-Exemplar: Enhancing Distractor Generation for Multimodal Educational Question Generation | Aug 16, 2024 | Distractor Generation, Multiple-choice | Code Available | 0 |
| LLMs Are Biased Towards Output Formats! Systematically Evaluating and Mitigating Output Format Bias of LLMs | Aug 16, 2024 | Instruction Following, Multiple-choice | Code Available | 1 |
| Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil | Aug 9, 2024 | Math, Multiple-choice | Unverified | 0 |
| LLaVA-OneVision: Easy Visual Task Transfer | Aug 6, 2024 | 3D Question Answering (3D-QA) | Code Available | 0 |
| Winning Amazon KDD Cup'24 | Aug 5, 2024 | Data Augmentation, Multiple-choice | Unverified | 0 |
| XMainframe: A Large Language Model for Mainframe Modernization | Aug 5, 2024 | Code Summarization, Language Modeling | Code Available | 2 |
| MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models | Aug 5, 2024 | Image Comprehension, Multiple-choice | Code Available | 2 |
| Recent Advances in Multi-Choice Machine Reading Comprehension: A Survey on Methods and Datasets | Aug 4, 2024 | Few-Shot Learning, Machine Reading Comprehension | Unverified | 0 |
| MiniCPM-V: A GPT-4V Level MLLM on Your Phone | Aug 3, 2024 | Hallucination, Multiple-choice | Code Available | 12 |
| MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models | Aug 2, 2024 | Multimodal Reasoning, Multiple-choice | Code Available | 3 |