| ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models | Jul 5, 2024 | Hallucination, Long Form Question Answering | Code Available | 2 |
| KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques | Mar 9, 2024 | Knowledge Graphs, Long Form Question Answering | Code Available | 2 |
| Fine-Grained Human Feedback Gives Better Rewards for Language Model Training | Jun 2, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| WebCPM: Interactive Web Search for Chinese Long-form Question Answering | May 11, 2023 | Form, Information Retrieval | Code Available | 2 |
| LongForm: Effective Instruction Tuning with Reverse Instructions | Apr 17, 2023 | Long Form Question Answering, News Generation | Code Available | 2 |
| OLAPH: Improving Factuality in Biomedical Long-form Question Answering | May 21, 2024 | Form, Long Form Question Answering | Code Available | 1 |
| CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems | Apr 2, 2024 | Form, Long Form Question Answering | Code Available | 1 |
| Attribute First, then Generate: Locally-attributable Grounded Text Generation | Mar 25, 2024 | Attribute, Document Summarization | Code Available | 1 |
| ALaRM: Align Language Models via Hierarchical Rewards Modeling | Mar 11, 2024 | Long Form Question Answering, Machine Translation | Code Available | 1 |
| SEMQA: Semi-Extractive Multi-Source Question Answering | Nov 8, 2023 | Attribute, Long Form Question Answering | Code Available | 1 |
| Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization | Oct 5, 2023 | All, Language Modeling | Code Available | 1 |
| A Critical Evaluation of Evaluations for Long-form Question Answering | May 29, 2023 | Form, Long Form Question Answering | Code Available | 1 |
| Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks | Apr 28, 2023 | Fact Checking, Information Retrieval | Code Available | 1 |
| D2S: Document-to-Slide Generation Via Query-Based Text Summarization | May 8, 2021 | Benchmarking, Long Form Question Answering | Code Available | 1 |
| Controllable Generation from Pre-trained Language Models via Inverse Prompting | Mar 19, 2021 | Language Modeling, Language Modelling | Code Available | 1 |
| Hurdles to Progress in Long-form Question Answering | Mar 10, 2021 | Form, Long Form Question Answering | Code Available | 1 |
| ELI5: Long Form Question Answering | Jul 22, 2019 | Form, Language Modeling | Code Available | 1 |
| GenerationPrograms: Fine-grained Attribution with Executable Programs | Jun 17, 2025 | Document Summarization, Long Form Question Answering | Code Available | 0 |
| Improving Reliability and Explainability of Medical Question Answering through Atomic Fact Checking in Retrieval-Augmented LLMs | May 30, 2025 | Fact Checking, Hallucination | Unverified | 0 |
| LaMP-QA: A Benchmark for Personalized Long-form Question Answering | May 30, 2025 | Answer Generation, Form | Unverified | 0 |
| Atomic Consistency Preference Optimization for Long-Form Question Answering | May 14, 2025 | Form, Long Form Question Answering | Code Available | 0 |
| An Empirical Study of Evaluating Long-form Question Answering | Apr 25, 2025 | Form, Informativeness | Code Available | 0 |
| MAMM-Refine: A Recipe for Improving Faithfulness in Generation with Multi-Agent Collaboration | Mar 19, 2025 | Long Form Question Answering, Question Answering | Unverified | 0 |
| Generate, Discriminate, Evolve: Enhancing Context Faithfulness via Fine-Grained Sentence-Level Self-Evolution | Mar 3, 2025 | counterfactual, Domain Adaptation | Unverified | 0 |
| On the Influence of Context Size and Model Choice in Retrieval-Augmented Generation Systems | Feb 20, 2025 | Long Form Question Answering, Question Answering | Code Available | 0 |
| How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild | Feb 18, 2025 | Articles, Hallucination | Code Available | 0 |
| SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models | Feb 13, 2025 | Long Form Question Answering, Question Answering | Code Available | 0 |
| Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization | Jan 23, 2025 | Long Form Question Answering, Question Answering | Unverified | 0 |
| To Retrieve or Not to Retrieve? Uncertainty Detection for Dynamic Retrieval Augmented Generation | Jan 16, 2025 | Long Form Question Answering, Question Answering | Unverified | 0 |
| A Claim Decomposition Benchmark for Long-form Answer Verification | Oct 16, 2024 | Form, Hallucination | Code Available | 0 |
| Retrieving Contextual Information for Long-Form Question Answering using Weak Supervision | Oct 11, 2024 | Form, Long Form Question Answering | Unverified | 0 |
| CALF: Benchmarking Evaluation of LFQA Using Chinese Examinations | Oct 2, 2024 | Benchmarking, Long Form Question Answering | Unverified | 0 |
| Ancient Wisdom, Modern Tools: Exploring Retrieval-Augmented LLMs for Ancient Indian Philosophy | Aug 21, 2024 | Information Retrieval, Long Form Question Answering | Code Available | 0 |
| Putting People in LLMs' Shoes: Generating Better Answers via Question Rewriter | Aug 20, 2024 | Long Form Question Answering, Question Answering | Code Available | 0 |
| Localizing and Mitigating Errors in Long-form Question Answering | Jul 16, 2024 | Form, Hallucination | Code Available | 0 |
| Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation | Jul 1, 2024 | Fact Checking, Long Form Question Answering | Unverified | 0 |
| CaLMQA: Exploring culturally specific long-form question answering across 23 languages | Jun 25, 2024 | Form, Long Form Question Answering | Code Available | 0 |
| FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering | Jun 19, 2024 | Answer Generation, Form | Unverified | 0 |
| FinTextQA: A Dataset for Long-form Financial Question Answering | May 16, 2024 | Diversity, Form | Unverified | 0 |
| Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study | Apr 10, 2024 | Form, Long Form Question Answering | Unverified | 0 |
| Learning to Plan and Generate Text with Citations | Apr 4, 2024 | Long Form Question Answering, Question Answering | Unverified | 0 |
| Multi-Review Fusion-in-Context | Mar 22, 2024 | Long Form Question Answering, Question Answering | Unverified | 0 |
| Genie: Achieving Human Parity in Content-Grounded Datasets Generation | Jan 25, 2024 | Long Form Question Answering, Question Answering | Unverified | 0 |
| Reinforcement Replaces Supervision: Query focused Summarization using Deep Reinforcement Learning | Nov 29, 2023 | Deep Reinforcement Learning, Long Form Question Answering | Code Available | 0 |
| LLMRefine: Pinpointing and Refining Large Language Models via Fine-Grained Actionable Feedback | Nov 15, 2023 | Long Form Question Answering, Machine Translation | Unverified | 0 |
| Long-form Question Answering: An Iterative Planning-Retrieval-Generation Approach | Nov 15, 2023 | Form, Long Form Question Answering | Unverified | 0 |
| Adapting Pre-trained Generative Models for Extractive Question Answering | Nov 6, 2023 | Extractive Question-Answering, Long Form Question Answering | Unverified | 0 |
| PreWoMe: Exploiting Presuppositions as Working Memory for Long Form Question Answering | Oct 24, 2023 | Form, Long Form Question Answering | Unverified | 0 |
| Understanding Retrieval Augmentation for Long-Form Question Answering | Oct 18, 2023 | Form, Long Form Question Answering | Unverified | 0 |
| A Novel Computational and Modeling Foundation for Automatic Coherence Assessment | Oct 1, 2023 | 4k, Long Form Question Answering | Unverified | 0 |