| Length Optimization in Conformal Prediction | Jun 27, 2024 | Conformal Prediction, Language Modeling | Code Available | 0 |
| FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework | Apr 9, 2021 | Language Modelling, Multiple-choice | Code Available | 0 |
| Training-free LLM Merging for Multi-task Learning | Jun 14, 2025 | Multiple-choice, Multi-Task Learning | Code Available | 0 |
| Solving and Generating NPR Sunday Puzzles with Large Language Models | Jun 21, 2023 | Multiple-choice, Prompt Engineering | Code Available | 0 |
| HSI: Head-Specific Intervention Can Induce Misaligned AI Coordination in Large Language Models | Feb 9, 2025 | Answer Generation, Language Modeling | Code Available | 0 |
| Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings | Jan 15, 2024 | Knowledge Graph Embeddings, Knowledge Graphs | Code Available | 0 |
| UnibucLLM: Harnessing LLMs for Automated Prediction of Item Difficulty and Response Time for Multiple-Choice Questions | Apr 20, 2024 | Data Augmentation, Multiple-choice | Code Available | 0 |
| Solving Machine Learning Problems | Jul 2, 2021 | BIG-bench Machine Learning, Data Augmentation | Code Available | 0 |
| Leveraging large language models for nano synthesis mechanism explanation: solid foundations or mere conjectures? | Jul 12, 2024 | Logical Reasoning, Multiple-choice | Code Available | 0 |
| Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Jan 10, 2025 | Automatic Speech Recognition, Classification | Code Available | 0 |