SOTAVerified

Language Modeling

Papers

Showing 5951-6000 of 14182 papers

Title | Status | Hype
VIANA: Visual Interactive Annotation of Argumentation |  | 0
ViDAS: Vision-based Danger Assessment and Scoring |  | 0
Video Captioning with Boundary-aware Hierarchical Language Decoding and Joint Video Prediction |  | 0
Video Description: A Survey of Methods, Datasets and Evaluation Metrics |  | 0
Video Emotion Open-vocabulary Recognition Based on Multimodal Large Language Model |  | 0
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos |  | 0
Video Imprint |  | 0
Video Language Model Pretraining with Spatio-temporal Masking |  | 0
VideoLLM-online: Online Video Large Language Model for Streaming Video |  | 0
VideoOrion: Tokenizing Object Dynamics in Videos |  | 0
VideoPoet: A Large Language Model for Zero-Shot Video Generation |  | 0
Video-VoT-R1: An efficient video inference model integrating image packing and AoE architecture |  | 0
VidLPRO: A Video-Language Pre-training Framework for Robotic and Laparoscopic Surgery |  | 0
ViLAaD: Enhancing "Attracting and Dispersing" Source-Free Domain Adaptation with Vision-and-Language Model |  | 0
ViLLM-Eval: A Comprehensive Evaluation Suite for Vietnamese Large Language Models |  | 0
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models |  | 0
ViLTA: Enhancing Vision-Language Pre-training through Textual Augmentation |  | 0
Vi-Mistral-X: Building a Vietnamese Language Model with Advanced Continual Pre-training |  | 0
VinaLLaMA: LLaMA-based Vietnamese Foundation Model |  | 0
Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese |  | 0
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation |  | 0
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder |  | 0
ViPer: Visual Personalization of Generative Models via Individual Preference Learning |  | 0
Virtual Scientific Companion for Synchrotron Beamlines: A Prototype |  | 0
Vision and Intention Boost Large Language Model in Long-Term Action Anticipation |  | 0
Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning |  | 0
Vision-centric Token Compression in Large Language Model |  | 0
VisionGPT: Vision-Language Understanding Agent Using Generalized Multimodal Framework |  | 0
Vision-Integrated LLMs for Autonomous Driving Assistance: Human Performance Comparison and Trust Evaluation |  | 0
Vision-Language Adaptive Mutual Decoder for OOV-STR |  | 0
Vision-language Assisted Attribute Learning |  | 0
Vision Language Model-based Caption Evaluation Method Leveraging Visual Context Extraction |  | 0
Vision-Language Model-Based Semantic-Guided Imaging Biomarker for Early Lung Cancer Detection |  | 0
Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces |  | 0
Vision-Language Modeling Meets Remote Sensing: Models, Datasets and Perspectives |  | 0
Vision Language Modeling of Content, Distortion and Appearance for Image Quality Assessment |  | 0
Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft |  | 0
Vision-Language Model IP Protection via Prompt-based Learning |  | 0
Vision Language Transformers: A Survey |  | 0
VisionLLM-based Multimodal Fusion Network for Glottic Carcinoma Early Detection |  | 0
[Vision Paper] PRObot: Enhancing Patient-Reported Outcome Measures for Diabetic Retinopathy using Chatbots and Generative AI |  | 0
VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions |  | 0
A Multi-Modal Foundation Model to Assist People with Blindness and Low Vision in Environmental Interaction |  | 0
Visual attention models for scene text recognition |  | 0
Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning |  | 0
Visual Comparison of Language Model Adaptation |  | 0
Visual Conceptual Blending with Large-scale Language and Vision Models |  | 0
Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval |  | 0
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation |  | 0
Visual Features for Context-Aware Speech Recognition |  | 0
