| Gesture-Aware Zero-Shot Speech Recognition for Patients with Language Disorders | Feb 18, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation | Nov 25, 2023 | Instruction Following, Language Modeling | —Unverified | 0 | 0 |
| Graph-based Unsupervised Disentangled Representation Learning via Multimodal Large Language Models | Jul 26, 2024 | Disentanglement, Language Modeling | —Unverified | 0 | 0 |
| GroundingFace: Fine-grained Face Understanding via Pixel Grounding Multimodal Large Language Model | Jan 1, 2025 | Attribute, Language Modeling | —Unverified | 0 | 0 |
| Guard Me If You Know Me: Protecting Specific Face-Identity from Deepfakes | May 26, 2025 | DeepFake Detection, Face Generation | —Unverified | 0 | 0 |
| Guardrails for avoiding harmful medical product recommendations and off-label promotion in generative AI models | Jun 24, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| GUIDE: Graphical User Interface Data for Execution | Apr 9, 2024 | Language Modelling, Large Language Model | —Unverified | 0 | 0 |
| Hear Me, See Me, Understand Me: Audio-Visual Autism Behavior Recognition | Mar 22, 2024 | Language Modelling, Large Language Model | —Unverified | 0 | 0 |
| HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model | Mar 17, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Highlighting What Matters: Promptable Embeddings for Attribute-Focused Image Retrieval | May 21, 2025 | Attribute, Image Retrieval | —Unverified | 0 | 0 |