| SARA: Selective and Adaptive Retrieval-augmented Generation with Context Compression | Jul 8, 2025 | Evidence Selection, RAG | Unverified | 0 |
| Benchmarking Retrieval-Augmented Multimomal Generation for Document Question Answering | May 22, 2025 | Benchmarking, Evidence Selection | Code Available | 1 |
| Knowledge-Aware Iterative Retrieval for Multi-Agent Systems | Mar 17, 2025 | Evidence Selection, Large Language Model | Unverified | 0 |
| Enhancing Large Language Models with Domain-specific Retrieval Augment Generation: A Case Study on Long-form Consumer Health Question Answering in Ophthalmology | Sep 20, 2024 | Evidence Selection, Form | Unverified | 0 |
| Halu-J: Critique-Based Hallucination Judge | Jul 17, 2024 | Evidence Selection, Hallucination | Code Available | 4 |
| Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering | Feb 26, 2024 | Evidence Selection, Open-Ended Question Answering | Code Available | 4 |
| Comparing Knowledge Sources for Open-Domain Scientific Claim Verification | Feb 5, 2024 | Claim Verification, Evidence Selection | Unverified | 0 |
| Do We Need Language-Specific Fact-Checking Models? The Case of Chinese | Jan 27, 2024 | Evidence Selection, Fact Checking | Unverified | 0 |
| Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond | Jun 16, 2023 | Benchmarking, Evidence Selection | Code Available | 1 |
| SemEval-2023 Task 7: Multi-Evidence Natural Language Inference for Clinical Trial Data | May 4, 2023 | Evidence Selection, Natural Language Inference | Unverified | 0 |