| EQUATOR: A Deterministic Framework for Evaluating LLM Reasoning with Open-Ended Questions. # v1.0.0-beta | Dec 31, 2024 | Multiple-choiceQuestion Answering | —Unverified | 0 |
| Establishing Task Scaling Laws via Compute-Efficient Model Ladders | Dec 5, 2024 | Language ModelingLanguage Modelling | —Unverified | 0 |
| Aryl: An Elastic Cluster Scheduler for Deep Learning | Feb 16, 2022 | Deep LearningGPU | —Unverified | 0 |
| Clozer”:" Adaptable Data Augmentation for Cloze-style Reading Comprehension | May 1, 2022 | Data AugmentationMachine Reading Comprehension | —Unverified | 0 |
| Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension | Mar 30, 2022 | Data AugmentationMachine Reading Comprehension | —Unverified | 0 |
| Amobee at SemEval-2019 Tasks 5 and 6: Multiple Choice CNN Over Contextual Embedding | Apr 17, 2019 | Multiple-choice | —Unverified | 0 |
| Enhancing Multiple-Choice Question Answering with Causal Knowledge | Jun 1, 2021 | Multiple-choiceQuestion Answering | —Unverified | 0 |
| CLIP-UP: CLIP-Based Unanswerable Problem Detection for Visual Question Answering | Jan 2, 2025 | Multiple-choiceQuestion Answering | —Unverified | 0 |
| A Method for Building a Commonsense Inference Dataset based on Basic Events | Nov 1, 2020 | Multiple-choiceTransfer Learning | —Unverified | 0 |
| ClinBench-HPB: A Clinical Benchmark for Evaluating LLMs in Hepato-Pancreato-Biliary Diseases | May 30, 2025 | Medical Question AnsweringMultiple-choice | —Unverified | 0 |