| Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration | Jun 24, 2024 | Diversity, Multiple-choice | —Unverified | 0 | 0 |
| Evaluation of Automatically Generated Pronoun Reference Questions | Sep 1, 2017 | Multiple-choice, Reading Comprehension | —Unverified | 0 | 0 |
| Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil | Aug 9, 2024 | Math, Multiple-choice | —Unverified | 0 | 0 |
| Towards Geo-Culturally Grounded LLM Generations | Feb 19, 2025 | Multiple-choice, Retrieval-augmented Generation | —Unverified | 0 | 0 |
| Towards Integrated Glance To Restructuring in Combinatorial Optimization | Dec 20, 2015 | Clustering, Combinatorial Optimization | —Unverified | 0 | 0 |
| ExplanationLP: Abductive Reasoning for Explainable Science Question Answering | Oct 25, 2020 | Answer Selection, ARC | —Unverified | 0 | 0 |
| Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization | Oct 13, 2021 | Multiple-choice, Quantization | —Unverified | 0 | 0 |
| Explore then Determine: A GNN-LLM Synergy Framework for Reasoning over Knowledge Graph | Jun 3, 2024 | Knowledge Graphs, Multiple-choice | —Unverified | 0 | 0 |
| Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement | Sep 10, 2024 | Multiple-choice, Sentence | —Unverified | 0 | 0 |
| Exploring the Capabilities of Prompted Large Language Models in Educational and Assessment Applications | May 19, 2024 | Multiple-choice | —Unverified | 0 | 0 |