SOTAVerified

Rhythm

Papers

Showing 1-50 of 515 papers

Title | Status | Hype
MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark | Code | 7
OpenVoice: Versatile Instant Voice Cloning | Code | 7
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play | Code | 3
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control | Code | 3
Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis | Code | 3
FlashSpeech: Efficient Zero-Shot Speech Synthesis | Code | 3
SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition | Code | 3
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling | Code | 3
An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains | Code | 2
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation | Code | 2
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation | Code | 2
AMUSE: Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion | Code | 2
Diff-BGM: A Diffusion Model for Video Background Music Generation | Code | 2
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models | Code | 2
SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems | Code | 2
Rhythmic Gesticulator: Rhythm-Aware Co-Speech Gesture Synthesis with Hierarchical Neural Embeddings | Code | 2
Unsupervised Speech Decomposition via Triple Information Bottleneck | Code | 2
ProtoECGNet: Case-Based Interpretable Deep Learning for Multi-Label ECG Classification with Contrastive Learning | Code | 1
ECG-Expert-QA: A Benchmark for Evaluating Medical Large Language Models in Heart Disease Diagnosis | Code | 1
Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model | Code | 1
ImprovNet -- Generating Controllable Musical Improvisations with Iterative Corruption Refinement | Code | 1
A Multi-Resolution Mutual Learning Network for Multi-Label ECG Classification | Code | 1
Singing Voice Graph Modeling for SingFake Detection | Code | 1
Perception-Inspired Graph Convolution for Music Understanding Tasks | Code | 1
SDEMG: Score-based Diffusion Model for Surface Electromyographic Signal Denoising | Code | 1
TSRNet: Simple Framework for Real-time ECG Anomaly Detection with Multimodal Time and Spectrogram Restoration Network | Code | 1
Emotional Speech-driven 3D Body Animation via Disentangled Latent Diffusion | Code | 1
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | Code | 1
Music ControlNet: A model similar to SD ControlNetD that can accurately control music generation | Code | 1
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model | Code | 1
MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation | Code | 1
LivelySpeaker: Towards Semantic-Aware Co-Speech Gesture Generation | Code | 1
Multi-scale Cross-restoration Framework for Electrocardiogram Anomaly Detection | Code | 1
AesPA-Net: Aesthetic Pattern-Aware Style Transfer Networks | Code | 1
Rhythm Modeling for Voice Conversion | Code | 1
Unsupervised Melody-to-Lyric Generation | Code | 1
EmotionGesture: Audio-Driven Diverse Emotional Co-Speech 3D Gesture Generation | Code | 1
QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation | Code | 1
scPrisma infers, filters and enhances topological signals in single-cell data using spectral template matching | Code | 1
Speaking Style Conversion in the Waveform Domain Using Discrete Self-Supervised Units | Code | 1
Self-Supervised PPG Representation Learning Shows High Inter-Subject Variability | Code | 1
A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units | Code | 1
Multimodality Multi-Lead ECG Arrhythmia Classification using Self-Supervised Learning | Code | 1
The ReprGesture entry to the GENEA Challenge 2022 | Code | 1
Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion | Code | 1
Detecting beats in the photoplethysmogram: benchmarking open-source algorithms | Code | 1
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation | Code | 1
Development of Interpretable Machine Learning Models to Detect Arrhythmia based on ECG Data | Code | 1
ECG Biometric Recognition: Review, System Proposal, and Benchmark Evaluation | Code | 1
IMLE-Net: An Interpretable Multi-level Multi-channel Model for ECG Classification | Code | 1

No leaderboard results yet.