SOTAVerified

Spoken Dialogue Systems

Papers

Showing 1-50 of 254 papers

Title | Status | Hype
WavChat: A Survey of Spoken Dialogue Models | Code | 3
WavReward: Spoken Dialogue Models With Generalist Reward Evaluators | Code | 2
Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model | Code | 1
"How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations | Code | 1
Plato Dialogue System: A Flexible Conversational AI Research Platform | Code | 1
Prompt-Guided Turn-Taking Prediction | - | 0
Towards Efficient Speech-Text Jointly Decoding within One Speech Language Model | - | 0
Towards a Japanese Full-duplex Spoken Dialogue System | - | 0
Chain-of-Thought Training for Open E2E Spoken Dialogue Systems | - | 0
Speculative End-Turn Detector for Efficient Speech Chatbot Assistant | - | 0
ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems | - | 0
Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics | - | 0
LLM-Enhanced Dialogue Management for Full-Duplex Spoken Dialogue Systems | - | 0
FlexDuo: A Pluggable System for Enabling Full-Duplex Capabilities in Speech Dialogue Systems | - | 0
Multimodal Transformer Models for Turn-taking Prediction: Effects on Conversational Dynamics of Human-Agent Interaction during Cooperative Gameplay | - | 0
An LLM Benchmark for Addressee Recognition in Multi-modal Multi-party Dialogue | - | 0
Real-Time Textless Dialogue Generation | Code | 0
OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios | - | 0
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training | Code | 0
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation | Code | 0
Large Language Models Know What To Say But Not When To Speak | - | 0
Data Augmentation Integrating Dialogue Flow and Style to Adapt Spoken Dialogue Systems to Low-Resource User Groups | - | 0
SaSLaW: Dialogue Speech Corpus with Audio-visual Egocentric Information Toward Environment-adaptive Dialogue Speech Synthesis | Code | 0
PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems | Code | 0
Evaluation of a semi-autonomous attentive listening system with takeover prompting | - | 0
An Analysis of User Behaviors for Objectively Evaluating Spoken Dialogue Systems | - | 0
An Analysis of Dialogue Repair in Voice Assistants | - | 0
Dialogue Systems Can Generate Appropriate Responses without the Use of Question Marks? -- Investigation of the Effects of Question Marks on Dialogue Systems | - | 0
Unified Conversational Models with System-Initiated Transitions between Chit-Chat and Task-Oriented Dialogues | - | 0
OLISIA: a Cascade System for Spoken Dialogue State Tracking | Code | 0
What Types of Questions Require Conversation to Answer? A Case Study of AskReddit Questions | - | 0
Transformers in Speech Processing: A Survey | - | 0
Analysis and Utilization of Entrainment on Acoustic and Emotion Features in User-agent Dialogue | - | 0
A Transformer-Based User Satisfaction Prediction for Proactive Interaction Mechanism in DuerOS | - | 0
Interactivism in Spoken Dialogue Systems | - | 0
Simultaneous Job Interview System Using Multiple Semi-autonomous Agents | - | 0
Using Transition Duration to Improve Turn-taking in Conversational Agents | - | 0
Symbol and Communicative Grounding through Object Permanence with a Mobile Robot | - | 0
When can I Speak? Predicting initiation points for spoken dialogue agents | Code | 0
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History | - | 0
Towards Speech-only Opinion-level Sentiment Analysis | - | 0
Understanding How People Rate Their Conversations | - | 0
Duplex Conversation: Towards Human-like Interaction in Spoken Dialogue Systems | - | 0
NLU for Game-based Learning in Real: Initial Evaluations | - | 0
Data Augmentation with Paraphrase Generation and Entity Extraction for Multimodal Dialogue System | - | 0
EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification | Code | 0
Gated Multimodal Fusion with Contrastive Learning for Turn-taking Prediction in Human-robot Dialogue | - | 0
Dialogue Strategy Adaptation to New Action Sets Using Multi-dimensional Modelling | - | 0
A Cross-Domain Approach for Continuous Impression Recognition from Dyadic Audio-Visual-Physio Signals | - | 0
EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification | - | 0

No leaderboard results yet.