Live Video Captioning
Live video captioning (LVC) is the task of detecting and describing dense events in video streams as they arrive. Traditional dense video captioning approaches typically work offline, with the entire video available to the captioning model. In contrast, LVC requires models to generate captions online, while the stream is still being received. This imposes significant constraints: the model sees only an incomplete observation of the video and must anticipate how events will unfold over time.
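To make the online constraint concrete, the following is a minimal sketch of a streaming captioning loop, not a reference implementation of any LVC method. It assumes a fixed-size sliding window over the frames seen so far. The function name `caption_stream`, the parameters `window` and `stride`, and the `describe` callable (a stand-in for an actual captioning model) are all hypothetical names introduced for illustration.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def caption_stream(
    frames: Iterable[Frame],
    describe: Callable[[Sequence[Frame]], str],  # hypothetical captioning model
    window: int = 16,  # assumed number of past frames the model may see
    stride: int = 8,   # assumed number of new frames between captions
) -> Iterator[Tuple[int, int, str]]:
    """Yield (start_index, end_index, caption) as frames arrive.

    Only frames that have already been observed are passed to `describe`;
    the loop never reads ahead of the current frame.
    """
    if window < 1 or stride < 1:
        raise ValueError("window and stride must be positive")

    buffer = deque(maxlen=window)  # keeps only the most recent `window` frames
    for index, frame in enumerate(frames):
        buffer.append(frame)
        seen = index + 1
        # Emit when the first window fills, then once every `stride` frames.
        if seen >= window and (seen - window) % stride == 0:
            start = index - len(buffer) + 1
            yield start, index, describe(list(buffer))


if __name__ == "__main__":
    # Integers stand in for decoded video frames.
    def toy_describe(clip: Sequence[int]) -> str:
        return f"frames {clip[0]} to {clip[-1]}"

    for start, end, text in caption_stream(range(10), toy_describe, window=4, stride=2):
        print(start, end, text)
```

Because `caption_stream` is a generator, each caption is produced as soon as its window is complete, before any later frame is read. An offline dense captioning model would instead receive the whole sequence in a single call.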