Rhythm

Papers

Showing 1–50 of 515 papers

Title	Date	Tasks	Status	Hype
MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark	Jun 5, 2025	Rhythm, Spoken Language Understanding	Code Available	7
OpenVoice: Versatile Instant Voice Cloning	Dec 3, 2023	Rhythm, Voice Cloning	Code Available	7
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play	May 5, 2025	AI Agent, Automatic Speech Recognition	Code Available	3
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control	Sep 24, 2024	Clustering, Language Modelling	Code Available	3
Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis	May 16, 2024	Language Modelling, Large Language Model	Code Available	3
FlashSpeech: Efficient Zero-Shot Speech Synthesis	Apr 23, 2024	Rhythm, Speech Synthesis	Code Available	3
SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition	Feb 27, 2024	Instruction Following, Language Modeling	Code Available	3
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling	Dec 31, 2023	3D Face Animation, Diversity	Code Available	3
An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains	Oct 5, 2024	Diagnostic, Event Detection	Code Available	2
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation	Aug 5, 2024	Rhythm, Self-Supervised Learning	Code Available	2
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation	Jul 21, 2024	Diversity, Music Generation	Code Available	2
AMUSE: Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion	Jun 1, 2024	Gesture Generation, Rhythm	Code Available	2
Diff-BGM: A Diffusion Model for Video Background Music Generation	May 20, 2024	Diversity, Music Generation	Code Available	2
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models	Mar 14, 2024	3D Face Animation, Diversity	Code Available	2
SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems	Jan 8, 2024	Language Modelling, Large Language Model	Code Available	2
Rhythmic Gesticulator: Rhythm-Aware Co-Speech Gesture Synthesis with Hierarchical Neural Embeddings	Oct 4, 2022	Gesture Generation, Rhythm	Code Available	2
Unsupervised Speech Decomposition via Triple Information Bottleneck	Apr 23, 2020	Rhythm, Style Transfer	Code Available	2
ProtoECGNet: Case-Based Interpretable Deep Learning for Multi-Label ECG Classification with Contrastive Learning	Apr 11, 2025	Contrastive Learning, Deep Learning	Code Available	1
ECG-Expert-QA: A Benchmark for Evaluating Medical Large Language Models in Heart Disease Diagnosis	Feb 16, 2025	Diagnostic, Rhythm	Code Available	1
Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model	Feb 15, 2025	Language Modeling, Language Modelling	Code Available	1
ImprovNet -- Generating Controllable Musical Improvisations with Iterative Corruption Refinement	Feb 6, 2025	Music Generation, Rhythm	Code Available	1
A Multi-Resolution Mutual Learning Network for Multi-Label ECG Classification	Jun 12, 2024	ECG Classification, Rhythm	Code Available	1
Singing Voice Graph Modeling for SingFake Detection	Jun 5, 2024	DeepFake Detection, Face Swapping	Code Available	1
Perception-Inspired Graph Convolution for Music Understanding Tasks	May 15, 2024	Graph Classification, Graph Learning	Code Available	1
SDEMG: Score-based Diffusion Model for Surface Electromyographic Signal Denoising	Feb 6, 2024	Denoising, Rhythm	Code Available	1
TSRNet: Simple Framework for Real-time ECG Anomaly Detection with Multimodal Time and Spectrogram Restoration Network	Dec 15, 2023	Anomaly Detection, Rhythm	Code Available	1
Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion	Dec 7, 2023	Gesture Generation, Rhythm	Code Available	1
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark	Nov 23, 2023	Automatic Lyrics Transcription, Rhythm	Code Available	1
Music ControlNet: A model similar to SD ControlNetD that can accurately control music generation	Nov 7, 2023	Music Generation, Rhythm	Code Available	1
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model	Nov 2, 2023	Music Generation, Rhythm	Code Available	1
MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation	Sep 19, 2023	Rhythm	Code Available	1
LivelySpeaker: Towards Semantic-Aware Co-Speech Gesture Generation	Sep 17, 2023	Gesture Generation, Rhythm	Code Available	1
Multi-scale Cross-restoration Framework for Electrocardiogram Anomaly Detection	Aug 3, 2023	Anomaly Detection, Diagnostic	Code Available	1
AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks	Jul 19, 2023	Rhythm, Semantic correspondence	Code Available	1
Rhythm Modeling for Voice Conversion	Jul 12, 2023	Rhythm, Voice Conversion	Code Available	1
Unsupervised Melody-to-Lyric Generation	May 30, 2023	Disentanglement, Rhythm	Code Available	1
EmotionGesture: Audio-Driven Diverse Emotional Co-Speech 3D Gesture Generation	May 30, 2023	Gesture Generation, Rhythm	Code Available	1
QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation	May 18, 2023	Gesture Generation, Quantization	Code Available	1
scPrisma infers, filters and enhances topological signals in single-cell data using spectral template matching	Feb 27, 2023	Rhythm, Template Matching	Code Available	1
Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units	Dec 19, 2022	Rhythm, Voice Conversion	Code Available	1
Self-Supervised PPG Representation Learning Shows High Inter-Subject Variability	Dec 7, 2022	Activity Recognition, Representation Learning	Code Available	1
A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units	Nov 12, 2022	Rhythm, Voice Conversion	Code Available	1
Multimodality Multi-Lead ECG Arrhythmia Classification using Self-Supervised Learning	Sep 30, 2022	ECG Classification, Knowledge Distillation	Code Available	1
The ReprGesture entry to the GENEA Challenge 2022	Aug 25, 2022	Decoder, Gesture Generation	Code Available	1
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion	Aug 18, 2022	Disentanglement, Rhythm	Code Available	1
Detecting beats in the photoplethysmogram: benchmarking open-source algorithms	Jul 19, 2022	Benchmarking, Photoplethysmography (PPG) beat detection	Code Available	1
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation	May 25, 2022	Representation Learning, Rhythm	Code Available	1
Development of Interpretable Machine Learning Models to Detect Arrhythmia based on ECG Data	May 5, 2022	BIG-bench Machine Learning, Feature Importance	Code Available	1
ECG Biometric Recognition: Review, System Proposal, and Benchmark Evaluation	Apr 8, 2022	Rhythm	Code Available	1
IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification	Apr 6, 2022	ECG Classification, Rhythm	Code Available	1

No leaderboard results yet.