| Marathon: A Race Through the Realm of Long Context with Large Language Models | Dec 15, 2023 | Long-Context Understanding, Multiple-choice | Code Available | 1 |
| Self-Evaluation Improves Selective Generation in Large Language Models | Dec 14, 2023 | Multiple-choice, TruthfulQA | Unverified | 0 |
| A Foundational Multimodal Vision Language AI Assistant for Human Pathology | Dec 13, 2023 | Decision Making, Diagnostic | Unverified | 0 |
| Steering Llama 2 via Contrastive Activation Addition | Dec 9, 2023 | Multiple-choice | Code Available | 2 |
| Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers | Dec 7, 2023 | Math, Multiple-choice | Code Available | 1 |
| A Comparative Study of AI-Generated (GPT-4) and Human-crafted MCQs in Programming Education | Dec 5, 2023 | Multiple-choice | Unverified | 0 |
| Unleashing the Potential of Large Language Model: Zero-shot VQA for Flood Disaster Scenario | Dec 4, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Explanatory Argument Extraction of Correct Answers in Resident Medical Exams | Dec 1, 2023 | Multiple-choice | Code Available | 0 |
| Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension | Nov 30, 2023 | Multiple-choice, Reading Comprehension | Unverified | 0 |
| Biomedical knowledge graph-optimized prompt generation for large language models | Nov 29, 2023 | Benchmarking, Knowledge Graphs | Code Available | 2 |