| A Theoretically Grounded Benchmark for Evaluating Machine Commonsense | Mar 23, 2022 | Generative Question Answering, Multiple-choice | —Unverified | 0 | 0 |
| Attribution analysis of legal language as used by LLM | Jan 28, 2025 | Binary Classification, Multiple-choice | —Unverified | 0 | 0 |
| Auto-bidding in real-time auctions via Oracle Imitation Learning (OIL) | Dec 16, 2024 | Imitation Learning, Multiple-choice | —Unverified | 0 | 0 |
| AutoDrive-QA- Automated Generation of Multiple-Choice Questions for Autonomous Driving Datasets Using Large Vision-Language Models | Mar 20, 2025 | Autonomous Driving, Multiple-choice | —Unverified | 0 | 0 |
| Auto-Evaluation: A Critical Measure in Driving Improvements in Quality and Safety of AI-Generated Lesson Resources | Jan 23, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| What Large Language Models Know and What People Think They Know | Jan 24, 2024 | Articles, Decision Making | —Unverified | 0 | 0 |
| Automated Answer Validation using Text Similarity | Jan 13, 2024 | Information Retrieval, Multiple-choice | —Unverified | 0 | 0 |
| Answering Chinese Elementary School Social Study Multiple Choice Questions | Jun 26, 2021 | Multiple-choice, Negation | —Unverified | 0 | 0 |
| The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think | May 15, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| Automated Generation of Multiple-Choice Cloze Questions for Assessing English Vocabulary Using GPT-turbo 3.5 | Mar 4, 2024 | Multiple-choice, Part-Of-Speech Tagging | —Unverified | 0 | 0 |