Learning Video Context as Interleaved Multimodal Sequences | Jul 31, 2024 | Language Modeling, Language Modelling | Code Available | 1
MultiHateClip: A Multilingual Benchmark Dataset for Hateful Video Detection on YouTube and Bilibili | Jul 28, 2024 | Hate Speech Detection, Video Classification | Code Available | 1
UniForensics: Face Forgery Detection via General Facial Representation | Jul 26, 2024 | Contrastive Learning, DeepFake Detection | Unverified | 0
LookupViT: Compressing visual information to a limited number of tokens | Jul 17, 2024 | Image Captioning, image-classification | Unverified | 0
Learning Natural Consistency Representation for Face Forgery Video Detection | Jul 15, 2024 | Representation Learning, Video Classification | Unverified | 0
Open Vocabulary Multi-Label Video Classification | Jul 12, 2024 | Action Classification, Classification | Unverified | 0
PUDD: Towards Robust Multi-modal Prototype-based Deepfake Detection | Jun 22, 2024 | DeepFake Detection, Face Swapping | Unverified | 0
MU-Bench: A Multitask Multimodal Benchmark for Machine Unlearning | Jun 21, 2024 | Machine Unlearning, parameter-efficient fine-tuning | Code Available | 0
PrAViC: Probabilistic Adaptation Framework for Real-Time Video Classification | Jun 17, 2024 | Classification, Video Classification | Unverified | 0
DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark | Jun 4, 2024 | Action Recognition, Knowledge Distillation | Unverified | 0
Few-Shot Classification of Interactive Activities of Daily Living (InteractADL) | Jun 3, 2024 | Few Shot Action Recognition, Fine-Grained Image Classification | Code Available | 0
ToxVidLM: A Multimodal Framework for Toxicity Detection in Code-Mixed Videos | May 31, 2024 | Video Classification | Unverified | 0
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark | May 30, 2024 | DeepFake Detection, Mamba | Code Available | 2
SIAVC: Semi-Supervised Framework for Industrial Accident Video Classification | May 23, 2024 | Fire Detection, Model Optimization | Code Available | 0
A Survey on Visual Mamba | Apr 24, 2024 | Image Registration, Image Restoration | Code Available | 4
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Apr 8, 2024 | GPU, Multiple-choice | Code Available | 3
Learning Correlation Structures for Vision Transformers | Apr 5, 2024 | Action Classification, Action Recognition | Unverified | 0
X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization | Mar 28, 2024 | Video Classification, Zero-Shot Learning | Code Available | 1
Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement Learning | Mar 27, 2024 | Classification, image-classification | Unverified | 0
Pig aggression classification using CNN, Transformers and Recurrent Networks | Mar 13, 2024 | Classification, Video Classification | Unverified | 0
Leveraging Compressed Frame Sizes For Ultra-Fast Video Classification | Mar 13, 2024 | Dynamic Time Warping, Retrieval | Unverified | 0
Learning Expressive And Generalizable Motion Features For Face Forgery Detection | Mar 8, 2024 | Anomaly Detection, Classification | Unverified | 0
A Multimodal Handover Failure Detection Dataset and Baselines | Feb 28, 2024 | Action Segmentation, Object | Code Available | 0
Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer | Feb 14, 2024 | Video Classification | Code Available | 0
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning | Feb 9, 2024 | Active Learning, Video Classification | Code Available | 2
Time-, Memory- and Parameter-Efficient Visual Adaptation | Feb 5, 2024 | GPU, Video Classification | Unverified | 0
FakeClaim: A Multiple Platform-driven Dataset for Identification of Fake News on 2023 Israel-Hamas War | Jan 29, 2024 | Fact Checking, Language Modeling | Code Available | 0
Short-Form Videos and Mental Health: A Knowledge-Guided Neural Topic Model | Jan 11, 2024 | Form, Topic Models | Unverified | 0
Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification | Jan 8, 2024 | Action Recognition, Contrastive Learning | Unverified | 0
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification | Jan 8, 2024 | GPU, Representation Learning | Unverified | 0
Time- Memory- and Parameter-Efficient Visual Adaptation | Jan 1, 2024 | GPU, Video Classification | Unverified | 0
Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-based Approach | Dec 21, 2023 | Action Localization, Classification | Code Available | 1
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks | Dec 21, 2023 | Image Retrieval, Image-to-Text Retrieval | Code Available | 1
A Simple Video Segmenter by Tracking Objects Along Axial Trajectories | Nov 30, 2023 | GPU, Object | Code Available | 1
Quantized Distillation: Optimizing Driver Activity Recognition Models for Resource-Constrained Environments | Nov 10, 2023 | Activity Recognition, Autonomous Driving | Code Available | 1
Neural architecture impact on identifying temporally extended Reinforcement Learning tasks | Oct 4, 2023 | Deep Reinforcement Learning, image-classification | Unverified | 0
Revisiting Kernel Temporal Segmentation as an Adaptive Tokenizer for Long-form Video Understanding | Sep 20, 2023 | Action Localization, Form | Unverified | 0
Language as the Medium: Multimodal Video Classification through text only | Sep 19, 2023 | Action Recognition, Video Classification | Unverified | 0
AV-MaskEnhancer: Enhancing Video Representations through Audio-Visual Masked Autoencoder | Sep 15, 2023 | Video Classification | Unverified | 0
Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval | Sep 15, 2023 | Retrieval, Video Classification | Code Available | 0
Text-to-feature diffusion for audio-visual few-shot learning | Sep 7, 2023 | Classification, Few-Shot Learning | Code Available | 0
Identifying Misinformation on YouTube through Transcript Contextual Analysis with Transformer Models | Jul 22, 2023 | Articles, Classification | Code Available | 0
MUVF-YOLOX: A Multi-modal Ultrasound Video Fusion Network for Renal Tumor Diagnosis | Jul 15, 2023 | Video Classification | Code Available | 1
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution | Jul 12, 2023 | Fairness, Image Classification | Code Available | 6
The Staged Knowledge Distillation in Video Classification: Harmonizing Student Progress by a Complementary Weakly Supervised Framework | Jul 11, 2023 | Knowledge Distillation, Pseudo Label | Unverified | 0
Active Learning for Video Classification with Frame Level Queries | Jul 10, 2023 | Active Learning, Classification | Unverified | 0
Learning Unseen Modality Interaction | Jun 22, 2023 | Retrieval, Video Classification | Code Available | 0
Boosting Breast Ultrasound Video Classification by the Guidance of Keyframe Feature Centers | Jun 12, 2023 | Video Classification | Unverified | 0
Inflated 3D Convolution-Transformer for Weakly-supervised Carotid Stenosis Grading with Ultrasound Videos | Jun 5, 2023 | Video Classification | Code Available | 0
Multi-label Video Classification for Underwater Ship Inspection | May 27, 2023 | Classification, Video Classification | Unverified | 0