| NitiBench: A Comprehensive Studies of LLM Frameworks Capabilities for Thai Legal Question Answering | Feb 15, 2025 | Chunking, Information Retrieval | Code Available | 0 | 5 |
| NeoQA: Evidence-based Question Answering with Generated News Events | May 9, 2025 | Articles, Question Answering | Code Available | 0 | 5 |
| Network-informed Prompt Engineering against Organized Astroturf Campaigns under Extreme Class Imbalance | Jan 21, 2025 | Data Augmentation, Language Modeling | Code Available | 0 | 5 |
| QMOS: Enhancing LLMs for Telecommunication with Question Masked loss and Option Shuffling | Sep 21, 2024 | Multiple-choice, Prompt Engineering | Code Available | 0 | 5 |
| MuseRAG: Idea Originality Scoring At Scale | May 22, 2025 | RAG, Retrieval-augmented Generation | Code Available | 0 | 5 |
| A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia | Dec 4, 2023 | counterfactual, Language Modeling | Code Available | 0 | 5 |
| MINTQA: A Multi-Hop Question Answering Benchmark for Evaluating LLMs on New and Tail Knowledge | Dec 22, 2024 | Multi-hop Question Answering, Question Answering | Code Available | 0 | 5 |
| Mitigating Bias in RAG: Controlling the Embedder | Feb 24, 2025 | Fairness, RAG | Code Available | 0 | 5 |
| Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning | Jun 5, 2025 | Question Answering, RAG | Code Available | 0 | 5 |
| Consistent Autoformalization for Constructing Mathematical Libraries | Oct 5, 2024 | Denoising, RAG | Code Available | 0 | 5 |