| VANER: Leveraging Large Language Model for Versatile and Adaptive Biomedical Named Entity Recognition | Apr 27, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Variability-Driven User-Story Generation using LLM and Triadic Concept Analysis | Apr 11, 2025 | Large Language Model, Story Generation | Unverified | 0 | 0 |
| Rethinking the Instruction Quality: LIFT is What You Need | Dec 12, 2023 | Code Generation, Instruction Following | Unverified | 0 | 0 |
| VayuBuddy: an LLM-Powered Chatbot to Democratize Air Quality Insights | Nov 16, 2024 | Chatbot, Language Modeling | Unverified | 0 | 0 |
| VCounselor: A Psychological Intervention Chat Agent Based on a Knowledge-Enhanced Large Language Model | Mar 20, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| VELO: A Vector Database-Assisted Cloud-Edge Collaborative LLM QoS Optimization Framework | Jun 19, 2024 | Language Modelling, Large Language Model | Unverified | 0 | 0 |
| Verifiable Format Control for Large Language Model Generations | Feb 6, 2025 | Benchmarking, Instruction Following | Unverified | 0 | 0 |
| VeriGen: A Large Language Model for Verilog Code Generation | Jul 28, 2023 | Code Generation, Language Modeling | Unverified | 0 | 0 |
| VeriLA: A Human-Centered Evaluation Framework for Interpretable Verification of LLM Agent Failures | Mar 16, 2025 | Human Agent Collaboration, Large Language Model | Unverified | 0 | 0 |
| VGR: Visual Grounded Reasoning | Jun 13, 2025 | Large Language Model, Math | Unverified | 0 | 0 |
| ViDAS: Vision-based Danger Assessment and Scoring | Oct 1, 2024 | Fixed Few Shot Prompting, Fixed Few Shot Prompting Danger Assessment | Unverified | 0 | 0 |
| Video-Bench: Human-Aligned Video Generation Benchmark | Jan 1, 2025 | Large Language Model, Video Generation | Unverified | 0 | 0 |
| Video Emotion Open-vocabulary Recognition Based on Multimodal Large Language Model | Aug 21, 2024 | Emotion Recognition, Language Modeling | Unverified | 0 | 0 |
| Video Event Reasoning and Prediction by Fusing World Knowledge from LLMs with Vision Foundation Models | Jul 8, 2025 | Future prediction, Large Language Model | Unverified | 0 | 0 |
| VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos | Nov 7, 2024 | Decoder, Language Modeling | Unverified | 0 | 0 |
| VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos | Jan 1, 2025 | Large Language Model, Video Segmentation | Unverified | 0 | 0 |
| VideoLLM-online: Online Video Large Language Model for Streaming Video | Jun 17, 2024 | GPU, Language Modeling | Unverified | 0 | 0 |
| Video LLMs for Temporal Reasoning in Long Videos | Dec 4, 2024 | Action Segmentation, Dense Video Captioning | Unverified | 0 | 0 |
| Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition | May 7, 2024 | Large Language Model, Multimodal Large Language Model | Unverified | 0 | 0 |
| VideoOrion: Tokenizing Object Dynamics in Videos | Nov 25, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| VideoPoet: A Large Language Model for Zero-Shot Video Generation | Dec 21, 2023 | Decoder, Language Modeling | Unverified | 0 | 0 |
| Video Summarization with Large Language Models | Apr 15, 2025 | Large Language Model, Video Summarization | Unverified | 0 | 0 |
| Video-VoT-R1: An efficient video inference model integrating image packing and AoE architecture | Mar 20, 2025 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| ViLLM-Eval: A Comprehensive Evaluation Suite for Vietnamese Large Language Models | Apr 17, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Vi-Mistral-X: Building a Vietnamese Language Model with Advanced Continual Pre-training | Mar 20, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| VinaLLaMA: LLaMA-based Vietnamese Foundation Model | Dec 18, 2023 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese | Aug 22, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| ViPer: Visual Personalization of Generative Models via Individual Preference Learning | Jul 24, 2024 | Image Generation, Language Modeling | Unverified | 0 | 0 |
| VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models | Feb 14, 2025 | Image Captioning, Large Language Model | Unverified | 0 | 0 |
| Vision and Intention Boost Large Language Model in Long-Term Action Anticipation | May 3, 2025 | Action Anticipation, In-Context Learning | Unverified | 0 | 0 |
| Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning | Feb 19, 2025 | Common Sense Reasoning, Language Modeling | Unverified | 0 | 0 |
| Vision-centric Token Compression in Large Language Model | Feb 2, 2025 | In-Context Learning, Language Modeling | Unverified | 0 | 0 |
| VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework | Mar 14, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Vision-Integrated LLMs for Autonomous Driving Assistance: Human Performance Comparison and Trust Evaluation | Feb 6, 2025 | Autonomous Driving, Decision Making | Unverified | 0 | 0 |
| Vision-Language Models Represent Darker-Skinned Black Individuals as More Homogeneous than Lighter-Skinned Black Individuals | Dec 12, 2024 | Image Captioning, Image Generation | Unverified | 0 | 0 |
| VisionLLM-based Multimodal Fusion Network for Glottic Carcinoma Early Detection | Dec 24, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| [Vision Paper] PRObot: Enhancing Patient-Reported Outcome Measures for Diabetic Retinopathy using Chatbots and Generative AI | Nov 5, 2024 | Chatbot, Language Modeling | Unverified | 0 | 0 |
| VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions | Jul 17, 2024 | Autonomous Vehicles, Language Modeling | Unverified | 0 | 0 |
| Visual Adversarial Attack on Vision-Language Models for Autonomous Driving | Nov 27, 2024 | Adversarial Attack, Autonomous Driving | Unverified | 0 | 0 |
| Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning | Jun 3, 2022 | Image Paragraph Captioning, Language Modeling | Unverified | 0 | 0 |
| Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval | Apr 23, 2024 | Image Retrieval, Language Modeling | Unverified | 0 | 0 |
| Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation | May 23, 2024 | Audio Generation, Denoising | Unverified | 0 | 0 |
| Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation | Apr 30, 2024 | Caption Generation, Hallucination | Unverified | 0 | 0 |
| Visual grounding for desktop graphical user interfaces | May 5, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models | Dec 5, 2023 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks | Feb 13, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| Visual Text Generation in the Wild | Jul 19, 2024 | Language Modelling, Large Language Model | Unverified | 0 | 0 |
| ViT3D Alignment of LLaMA3: 3D Medical Image Report Generation | Oct 11, 2024 | Diagnostic, Language Modeling | Unverified | 0 | 0 |
| VL-Mamba: Exploring State Space Models for Multimodal Learning | Mar 20, 2024 | Language Modeling, Language Modelling | Unverified | 0 | 0 |
| VLMaterial: Procedural Material Generation with Large Vision-Language Models | Jan 27, 2025 | Language Modeling, Language Modelling | Unverified | 0 | 0 |