| MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark | Jun 5, 2025 | Rhythm, Spoken Language Understanding | Code Available | 7 |
| OpenVoice: Versatile Instant Voice Cloning | Dec 3, 2023 | Rhythm, Voice Cloning | Code Available | 7 |
| Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play | May 5, 2025 | AI Agent, Automatic Speech Recognition | Code Available | 3 |
| TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control | Sep 24, 2024 | Clustering, Language Modelling | Code Available | 3 |
| SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition | Feb 27, 2024 | Instruction Following, Language Modeling | Code Available | 3 |
| FlashSpeech: Efficient Zero-Shot Speech Synthesis | Apr 23, 2024 | Rhythm, Speech Synthesis | Code Available | 3 |
| EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling | Dec 31, 2023 | 3D Face Animation, Diversity | Code Available | 3 |
| Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis | May 16, 2024 | Language Modelling, Large Language Model | Code Available | 3 |
| MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models | Mar 14, 2024 | 3D Face Animation, Diversity | Code Available | 2 |
| Diff-BGM: A Diffusion Model for Video Background Music Generation | May 20, 2024 | Diversity, Music Generation | Code Available | 2 |
| AMUSE: Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion | Jun 1, 2024 | Gesture Generation, Rhythm | Code Available | 2 |
| MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation | Jul 21, 2024 | Diversity, Music Generation | Code Available | 2 |
| Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation | Aug 5, 2024 | Rhythm, Self-Supervised Learning | Code Available | 2 |
| Rhythmic Gesticulator: Rhythm-Aware Co-Speech Gesture Synthesis with Hierarchical Neural Embeddings | Oct 4, 2022 | Gesture Generation, Rhythm | Code Available | 2 |
| An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains | Oct 5, 2024 | Diagnostic, Event Detection | Code Available | 2 |
| SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems | Jan 8, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| Unsupervised Speech Decomposition via Triple Information Bottleneck | Apr 23, 2020 | Rhythm, Style Transfer | Code Available | 2 |
| Music FaderNets: Controllable Music Generation Based On High-Level Features via Low-Level Feature Modelling | Jul 29, 2020 | Clustering, Disentanglement | Code Available | 1 |
| MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation | Sep 19, 2023 | Rhythm | Code Available | 1 |
| Multimodality Multi-Lead ECG Arrhythmia Classification using Self-Supervised Learning | Sep 30, 2022 | ECG Classification, Knowledge Distillation | Code Available | 1 |
| M-Arg: Multimodal Argument Mining Dataset for Political Debates with Audio and Transcripts | Nov 1, 2021 | Argument Mining, Rhythm | Code Available | 1 |
| Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | Nov 23, 2023 | Automatic Lyrics Transcription, Rhythm | Code Available | 1 |
| IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification | Apr 6, 2022 | ECG Classification, Rhythm | Code Available | 1 |
| LivelySpeaker: Towards Semantic-Aware Co-Speech Gesture Generation | Sep 17, 2023 | Gesture Generation, Rhythm | Code Available | 1 |
| LoopNet: Musical Loop Synthesis Conditioned On Intuitive Musical Parameters | May 21, 2021 | Information Retrieval, Music Information Retrieval | Code Available | 1 |
| Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens | Oct 26, 2019 | Rhythm, Style Transfer | Code Available | 1 |
| Music SketchNet: Controllable Music Generation via Factorized Representations of Pitch and Rhythm | Aug 4, 2020 | Music Generation, Rhythm | Code Available | 1 |
| EmotionGesture: Audio-Driven Diverse Emotional Co-Speech 3D Gesture Generation | May 30, 2023 | Gesture Generation, Rhythm | Code Available | 1 |
| Multi-scale Cross-restoration Framework for Electrocardiogram Anomaly Detection | Aug 3, 2023 | Anomaly Detection, Diagnostic | Code Available | 1 |
| Music ControlNet: A model similar to SD ControlNetD that can accurately control music generation | Nov 7, 2023 | Music Generation, Rhythm | Code Available | 1 |
| Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion | Dec 7, 2023 | Gesture Generation, Rhythm | Code Available | 1 |
| GenéLive! Generating Rhythm Actions in Love Live! | Feb 25, 2022 | Rhythm | Code Available | 1 |
| ECG-Expert-QA: A Benchmark for Evaluating Medical Large Language Models in Heart Disease Diagnosis | Feb 16, 2025 | Diagnostic, Rhythm | Code Available | 1 |
| AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks | Jul 19, 2023 | Rhythm, Semantic correspondence | Code Available | 1 |
| Development of Interpretable Machine Learning Models to Detect Arrhythmia based on ECG Data | May 5, 2022 | BIG-bench Machine Learning, Feature Importance | Code Available | 1 |
| Generalizing electrocardiogram delineation -- Training convolutional neural networks with synthetic data augmentation | Nov 25, 2021 | Data Augmentation, Rhythm | Code Available | 1 |
| DanceFormer: Music Conditioned 3D Dance Generation with Parametric Motion Transformer | Mar 18, 2021 | Rhythm | Code Available | 1 |
| DanceIt: Music-inspired Dancing Video Synthesis | Sep 17, 2020 | cross-modal alignment, Rhythm | Code Available | 1 |
| DEEPCHORUS: A Hybrid Model of Multi-scale Convolution and Self-attention for Chorus Detection | Feb 13, 2022 | Rhythm | Code Available | 1 |
| Detecting beats in the photoplethysmogram: benchmarking open-source algorithms | Jul 19, 2022 | Benchmarking, Photoplethysmography (PPG) beat detection | Code Available | 1 |
| Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks | Jul 6, 2017 | Arrhythmia Detection, Electrocardiography (ECG) | Code Available | 1 |
| ECG Biometric Recognition: Review, System Proposal, and Benchmark Evaluation | Apr 8, 2022 | Rhythm | Code Available | 1 |
| An Empirical Evaluation of End-to-End Polyphonic Optical Music Recognition | Aug 3, 2021 | Binary Classification, Decoder | Code Available | 1 |
| Anomaly Detection in Time Series with Triadic Motif Fields and Application in Atrial Fibrillation ECG Classification | Dec 9, 2020 | Anomaly Detection, Atrial Fibrillation Detection | Code Available | 1 |
| Continuous Melody Generation via Disentangled Short-Term Representations and Structural Conditions | Feb 5, 2020 | Disentanglement, Music Generation | Code Available | 1 |
| Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL | Apr 28, 2020 | All, Benchmarking | Code Available | 1 |
| A holistic approach to polyphonic music transcription with neural networks | Oct 26, 2019 | Beat Tracking, Music Transcription | Code Available | 1 |
| How Does it Sound? | Dec 1, 2021 | Rhythm | Code Available | 1 |
| ImprovNet -- Generating Controllable Musical Improvisations with Iterative Corruption Refinement | Feb 6, 2025 | Music Generation, Rhythm | Code Available | 1 |
| A Multi-Resolution Mutual Learning Network for Multi-Label ECG Classification | Jun 12, 2024 | ECG Classification, Rhythm | Code Available | 1 |