| A Tale of Trust and Accuracy: Base vs. Instruct LLMs in RAG Systems | Jun 21, 2024 | RAG, Retrieval | Code Available | 0 | 5 |
| A Human-AI Comparative Analysis of Prompt Sensitivity in LLM-Based Relevance Judgment | Apr 16, 2025 | Information Retrieval, RAG | Code Available | 0 | 5 |
| A System for Comprehensive Assessment of RAG Frameworks | Apr 10, 2025 | RAG, Retrieval | Code Available | 0 | 5 |
| A Comparison of Methods for Evaluating Generative IR | Apr 5, 2024 | Information Retrieval, Language Modelling | Code Available | 0 | 5 |
| On the Influence of Context Size and Model Choice in Retrieval-Augmented Generation Systems | Feb 20, 2025 | Long Form Question Answering, Question Answering | Code Available | 0 | 5 |
| Conversational Gold: Evaluating Personalized Conversational Search System using Gold Nuggets | Mar 12, 2025 | Answer Generation, Conversational Search | Code Available | 0 | 5 |
| NitiBench: A Comprehensive Studies of LLM Frameworks Capabilities for Thai Legal Question Answering | Feb 15, 2025 | Chunking, Information Retrieval | Code Available | 0 | 5 |
| NeoQA: Evidence-based Question Answering with Generated News Events | May 9, 2025 | Articles, Question Answering | Code Available | 0 | 5 |
| QMOS: Enhancing LLMs for Telecommunication with Question Masked loss and Option Shuffling | Sep 21, 2024 | Multiple-choice, Prompt Engineering | Code Available | 0 | 5 |
| MuseRAG: Idea Originality Scoring At Scale | May 22, 2025 | RAG, Retrieval-augmented Generation | Code Available | 0 | 5 |
| Network-informed Prompt Engineering against Organized Astroturf Campaigns under Extreme Class Imbalance | Jan 21, 2025 | Data Augmentation, Language Modeling | Code Available | 0 | 5 |
| Not All Languages are Equal: Insights into Multilingual Retrieval-Augmented Generation | Oct 29, 2024 | All, Retrieval | Code Available | 0 | 5 |
| A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia | Dec 4, 2023 | counterfactual, Language Modeling | Code Available | 0 | 5 |
| Mix-of-Granularity: Optimize the Chunking Granularity for Retrieval-Augmented Generation | Jun 1, 2024 | Chunking, RAG | Code Available | 0 | 5 |
| Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework | Sep 24, 2024 | Benchmarking, counterfactual | Code Available | 0 | 5 |
| Mitigating Bias in RAG: Controlling the Embedder | Feb 24, 2025 | Fairness, RAG | Code Available | 0 | 5 |
| Consistent Autoformalization for Constructing Mathematical Libraries | Oct 5, 2024 | Denoising, RAG | Code Available | 0 | 5 |
| ConQRet: Benchmarking Fine-Grained Evaluation of Retrieval Augmented Argumentation with LLM Judges | Dec 6, 2024 | Benchmarking, Retrieval | Code Available | 0 | 5 |
| MINTQA: A Multi-Hop Question Answering Benchmark for Evaluating LLMs on New and Tail Knowledge | Dec 22, 2024 | Multi-hop Question Answering, Question Answering | Code Available | 0 | 5 |
| Unipa-GPT: Large Language Models for university-oriented QA in Italian | Jul 19, 2024 | Chatbot, Information Retrieval | Code Available | 0 | 5 |
| Concurrent Brainstorming & Hypothesis Satisfying: An Iterative Framework for Enhanced Retrieval-Augmented Generation (R2CBR3H-SR) | Jan 3, 2024 | Decision Making, Information Retrieval | Code Available | 0 | 5 |
| Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning | Jun 5, 2025 | Question Answering, RAG | Code Available | 0 | 5 |
| Memorization and Knowledge Injection in Gated LLMs | Apr 30, 2025 | Continual Learning, Memorization | Code Available | 0 | 5 |
| Memory and Knowledge Augmented Language Models for Inferring Salience in Long-Form Stories | Sep 8, 2021 | Form, Language Modeling | Code Available | 0 | 5 |
| MEMERAG: A Multilingual End-to-End Meta-Evaluation Benchmark for Retrieval Augmented Generation | Feb 24, 2025 | RAG, Retrieval | Code Available | 0 | 5 |