| CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy | Dec 3, 2024 | Hallucination, Key Information Extraction | —Unverified | 0 |
| AI Benchmarks and Datasets for LLM Evaluation | Dec 2, 2024 | Benchmarking, Distributed Computing | —Unverified | 0 |
| Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment | Dec 1, 2024 | Action Detection, Activity Detection | Code Available | 0 |
| Beyond Logit Lens: Contextual Embeddings for Robust Hallucination Detection & Grounding in VLMs | Nov 28, 2024 | Attribute, Hallucination | —Unverified | 0 |
| DHCP: Detecting Hallucinations by Cross-modal Attention Pattern in Large Vision-Language Models | Nov 27, 2024 | Attribute, Hallucination | —Unverified | 0 |
| OPCap:Object-aware Prompting Captioning | Nov 27, 2024 | Attribute, Decoder | —Unverified | 0 |
| Efficient Self-Improvement in Multimodal Large Language Models: A Model-Level Judge-Free Approach | Nov 26, 2024 | Hallucination | —Unverified | 0 |
| A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs | Nov 26, 2024 | Hallucination | —Unverified | 0 |
| VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models | Nov 26, 2024 | Hallucination | —Unverified | 0 |
| AI2T: Building Trustable AI Tutors by Interactively Teaching a Self-Aware Learning Agent | Nov 26, 2024 | Hallucination | —Unverified | 0 |