| JointRank: Rank Large Set with Single Pass | Jun 27, 2025 | Information Retrieval, Reranking | Code Available | 0 |
| JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability | Feb 27, 2024 | GPU, Information Retrieval | Code Available | 0 |
| Automated Bias Assessment in AI-Generated Educational Content Using CEAT Framework | May 19, 2025 | Fairness, Retrieval-augmented Generation | Code Available | 0 |
| AIC CTU system at AVeriTeC: Re-framing automated fact-checking as a simple RAG task | Oct 15, 2024 | Data Augmentation, Fact Checking | Code Available | 0 |
| Quebec Automobile Insurance Question-Answering With Retrieval-Augmented Generation | Oct 12, 2024 | Question Answering, RAG | Code Available | 0 |
| Does RAG Introduce Unfairness in LLMs? Evaluating Fairness in Retrieval-Augmented Generation Systems | Sep 29, 2024 | Fairness, Open-Domain Question Answering | Code Available | 0 |
| A Hybrid Approach to Information Retrieval and Answer Generation for Regulatory Texts | Feb 24, 2025 | Answer Generation, Information Retrieval | Code Available | 0 |
| THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models | Sep 17, 2024 | Benchmarking, Binary Classification | Code Available | 0 |
| Can LLMs reason over extended multilingual contexts? Towards long-context evaluation beyond retrieval and haystacks | Apr 17, 2025 | Epistemic Reasoning, Large Language Model | Code Available | 0 |
| Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings | Mar 19, 2025 | Instruction Following, Large Language Model | Code Available | 0 |