| Training Optimus Prime, M.D.: Generating Medical Certification Items by Fine-Tuning OpenAI's gpt2 Transformer Model | Aug 23, 2019 | Articles, Language Modeling | —Unverified | 0 | 0 |
| ForecastQA: A Question Answering Challenge for Event Forecasting with Temporal Text Data | May 2, 2020 | Knowledge Graphs, Language Modelling | —Unverified | 0 | 0 |
| FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models | Apr 29, 2024 | Common Sense Reasoning, Multiple-choice | —Unverified | 0 | 0 |
| Framing QA as Building and Ranking Intersentence Answer Justifications | Jun 1, 2017 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| From ChatGPT to DeepSeek AI: A Comprehensive Analysis of Evolution, Deviation, and Future Implications in AI-Language Models | Apr 4, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project | Sep 4, 2019 | Multiple-choice, Question Answering | —Unverified | 0 | 0 |
| From Generalist to Specialist: Improving Large Language Models for Medical Physics Using ARCoT | May 17, 2024 | Benchmarking, Multiple-choice | —Unverified | 0 | 0 |
| SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing Agents | Nov 12, 2024 | General Knowledge, Hallucination | —Unverified | 0 | 0 |
| Fundamental Limitations in Defending LLM Finetuning APIs | Feb 20, 2025 | Multiple-choice | —Unverified | 0 | 0 |
| FusionMind: Improving question and answering with external context fusion | Dec 31, 2023 | Knowledge Graphs, Multiple-choice | —Unverified | 0 | 0 |