InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | Mar 22, 2024 | Action Classification; Action Recognition | Code available (7)
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment | Oct 3, 2023 | Audio Classification; Contrastive Learning | Code available (4)
Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics | Apr 25, 2024 | Audio Classification; Transfer Learning | Code available (3)
Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models | Nov 14, 2023 | Acoustic Scene Classification; Audio captioning | Code available (3)
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | May 18, 2023 | 1 Image, 2*2 Stitchi; Action Classification | Code available (3)
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification | Mar 13, 2022 | Audio Classification; Knowledge Distillation | Code available (3)
EAT: Self-Supervised Pre-Training with Efficient Audio Transformer | Jan 7, 2024 | Audio Classification; Self-Supervised Learning | Code available (3)
Global birdsong embeddings enable superior transfer learning for bioacoustic classification | Jul 12, 2023 | Audio Classification; Decision Making | Code available (2)
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection | Feb 2, 2022 | Audio Classification; Event Detection | Code available (2)
Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition | Jan 4, 2024 | Attribute; Audio Classification | Code available (2)
SSAST: Self-Supervised Audio Spectrogram Transformer | Oct 19, 2021 | Audio Classification; Classification | Code available (2)
Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation | Nov 9, 2022 | Audio Classification; Audio Tagging | Code available (2)
Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models | Jan 16, 2023 | Audio Classification; Few-Shot Learning | Code available (2)
Benchmarking Representations for Speech, Music, and Acoustic Events | May 2, 2024 | Audio Classification; Benchmarking | Code available (2)
BirdSet: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics | Mar 15, 2024 | Audio Classification; Classification | Code available (2)
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models | Oct 24, 2023 | Audio Classification; Audio Tagging | Code available (2)
Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio | Feb 14, 2024 | Audio Classification; Decoder | Code available (2)
AST: Audio Spectrogram Transformer | Apr 5, 2021 | Audio Classification; Audio Tagging | Code available (2)
SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model | May 20, 2024 | Audio Classification; GPU | Code available (2)
Contrastive Audio-Visual Masked Autoencoder | Oct 2, 2022 | Audio Classification; Audio Tagging | Code available (2)
Audio Mamba: Bidirectional State Space Model for Audio Representation Learning | Jun 5, 2024 | Audio Classification; Classification | Code available (2)
Federated Self-Training for Semi-Supervised Audio Recognition | Jul 14, 2021 | Audio Classification; Federated Learning | Code available (1)
Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions | Jun 23, 2024 | Audio Classification; Parkinson Detection from Speech | Code available (1)
Few-shot Class-incremental Audio Classification Using Stochastic Classifier | Jun 3, 2023 | Audio Classification; Classification | Code available (1)
Efficient Training of Audio Transformers with Patchout | Oct 11, 2021 | Acoustic Scene Classification; Audio Classification | Code available (1)
Adaptive Differential Denoising for Respiratory Sounds Classification | Jun 3, 2025 | Audio Classification; Classification | Code available (1)
Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices | Mar 5, 2021 | Audio Classification; Environmental Sound Classification | Code available (1)
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning | Mar 14, 2024 | Audio Classification; audio-visual learning | Code available (1)
animal2vec and MeerKAT: A self-supervised transformer for rare-event raw audio input and a large-scale reference dataset for bioacoustics | Jun 3, 2024 | Audio Classification; Benchmarking | Code available (1)
ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions | Jul 11, 2024 | All; Audio Classification | Code available (1)
Effective Audio Classification Network Based on Paired Inverse Pyramid Structure and Dense MLP Block | Nov 5, 2022 | Audio Classification; Classification | Code available (1)
AUCO ResNet: an end-to-end network for Covid-19 pre-screening from cough and breath | Mar 15, 2022 | 8k; Audio Classification | Code available (1)
EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use | Jul 12, 2022 | Audio Classification; Classification | Code available (1)
End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network | Apr 25, 2022 | Audio Classification; Classification | Code available (1)
Fluctuation-driven initialization for spiking neural network training | Jun 21, 2022 | Audio Classification | Code available (1)
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection | Jun 29, 2021 | Audio Classification; Direction of Arrival Estimation | Code available (1)
CRNNs for Urban Sound Tagging with spatiotemporal context | Aug 24, 2020 | Audio Classification; Audio Tagging | Code available (1)
Device-Robust Acoustic Scene Classification via Impulse Response Augmentation | May 12, 2023 | Acoustic Scene Classification; Audio Classification | Code available (1)
A surrogate gradient spiking baseline for speech command recognition | Aug 22, 2022 | Audio Classification; Time Series | Code available (1)
Acoustic Prompt Tuning: Empowering Large Language Models with Audition Capabilities | Nov 30, 2023 | Audio Classification; Few-Shot Audio Classification | Code available (1)
Connecting the Dots between Audio and Text without Parallel Data through Visual Knowledge Transfer | Dec 16, 2021 | Audio Classification; Audio Tagging | Code available (1)
Adversarial Fine-tuning using Generated Respiratory Sound to Address Class Imbalance | Nov 11, 2023 | Audio Classification; Sound Classification | Code available (1)
CNN Architectures for Large-Scale Audio Classification | Sep 29, 2016 | Audio Classification; Event Detection | Code available (1)
ATST: Audio Representation Learning with Teacher-Student Transformer | Apr 26, 2022 | Audio Classification; Instrument Recognition | Code available (1)
CycleGuardian: A Framework for Automatic RespiratorySound classification Based on Improved Deep clustering and Contrastive Learning | Feb 2, 2025 | Audio Classification; Clustering | Code available (1)
DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners | Jul 4, 2024 | Audio Classification; Audio Tagging | Code available (1)
Continual Transformers: Redundancy-Free Attention for Online Inference | Jan 17, 2022 | Action Detection; Audio Classification | Code available (1)
Audio Tagging on an Embedded Hardware Platform | Jun 15, 2023 | Audio Classification; Audio Tagging | Code available (1)
DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification | Mar 24, 2024 | Audio Classification; Information Retrieval | Code available (1)
CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition | Oct 18, 2023 | Audio Classification; Contrastive Learning | Code available (1)