DriVLM: Domain Adaptation of Vision-Language Models in Autonomous Driving

2025-01-09

Xuran Zheng, Chang D. Yoo

Abstract

In recent years, large language models have achieved impressive performance, contributing substantially to the development and application of artificial intelligence, and both their parameter counts and their capabilities are still growing rapidly. In particular, multimodal large language models (MLLMs) can combine multiple modalities, such as images, video, audio, and text, and show great potential across a variety of tasks. However, most MLLMs require very large computational resources, which is a major obstacle for most researchers and developers. In this paper, we explore the utility of small-scale MLLMs and apply them to the field of autonomous driving. We hope that this will advance the application of MLLMs in real-world scenarios.

Tasks

Autonomous Driving, Domain Adaptation