| AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models | Oct 23, 2024 | Hallucination | —Unverified | 0 | 0 |
| AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation | Feb 25, 2024 | Face Generation, Hallucination | —Unverified | 0 | 0 |
| 'Beach' to 'Bitch': Inadvertent Unsafe Transcription of Kids' Content on YouTube | Feb 17, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models | Jul 18, 2024 | Hallucination, Language Modelling | —Unverified | 0 | 0 |
| Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies | Mar 10, 2025 | Benchmarking, Ethics | —Unverified | 0 | 0 |
| Benchmarking large language models for materials synthesis: the case of atomic layer deposition | Dec 13, 2024 | Benchmarking, Hallucination | —Unverified | 0 | 0 |
| Benchmarking Retrieval-Augmented Large Language Models in Biomedical NLP: Application, Robustness, and Self-Awareness | May 13, 2024 | Benchmarking, counterfactual | —Unverified | 0 | 0 |
| Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs | Nov 28, 2024 | Attribute, Hallucination | —Unverified | 0 | 0 |
| Beyond the Black Box: Interpretability of LLMs in Finance | May 14, 2025 | Fairness, Hallucination | —Unverified | 0 | 0 |
| Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models | Jun 18, 2024 | Hallucination | —Unverified | 0 | 0 |