| Search Engines in an AI Era: The False Promise of Factual and Verifiable Source-Cited Responses | Oct 15, 2024 | Hallucination, Language Modeling | Code Available | 1 |
| VERIFIED: A Video Corpus Moment Retrieval Benchmark for Fine-Grained Video Understanding | Oct 11, 2024 | Hallucination, Moment Retrieval | Code Available | 1 |
| Automatic Curriculum Expert Iteration for Reliable LLM Reasoning | Oct 10, 2024 | Hallucination, Logical Reasoning | Code Available | 1 |
| OneNet: A Fine-Tuning Free Framework for Few-Shot Entity Linking via Large Language Model Prompting | Oct 10, 2024 | Entity Linking, Few-Shot Learning | Code Available | 1 |
| IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking | Oct 9, 2024 | ARC, Code Generation | Code Available | 1 |
| CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation | Oct 3, 2024 | Abstractive Text Summarization, Hallucination | Code Available | 1 |
| FactAlign: Long-form Factuality Alignment of Large Language Models | Oct 2, 2024 | Form, Hallucination | Code Available | 1 |
| EventHallusion: Diagnosing Event Hallucinations in Video LLMs | Sep 25, 2024 | Hallucination, Instruction Following | Code Available | 1 |
| XTRUST: On the Multilingual Trustworthiness of Large Language Models | Sep 24, 2024 | Ethics, Fairness | Code Available | 1 |
| FAIR GPT: A virtual consultant for research data management in ChatGPT | Sep 20, 2024 | Fairness, Hallucination | Code Available | 1 |