SOTAVerified

Dialogue Generation

Dialogue generation is the natural language processing task of "understanding" natural language inputs in order to produce a natural language response. Such systems are usually intended for conversing with humans, for instance in back-and-forth dialogue with a conversational agent such as a chatbot. Example benchmarks for this task (see also related tasks such as Natural Language Understanding) include FusedChat and the Ubuntu Dialogue Corpus (UDC). Models can be evaluated with metrics such as BLEU, ROUGE, and METEOR, although these correlate only weakly with human judgement; newer metrics such as UnSupervised and Reference-free (USR) and Metric for automatic Unreferenced dialog evaluation (MaUde) aim to address this.
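
To illustrate why word-overlap metrics can disagree with human judgement, here is a minimal sketch of two simplified overlap scores: a unigram F1 and a single-reference BLEU-1 (clipped unigram precision with a brevity penalty). The function names, the lowercase whitespace tokenizer, and the example sentences are assumptions made for illustration; they are not the exact scoring scripts behind any leaderboard entry on this page.

```python
from collections import Counter
import math


def tokenize(text):
    # Assumption: lowercase whitespace tokenization. Real evaluations
    # usually rely on a dedicated tokenizer.
    return text.lower().split()


def unigram_f1(hypothesis, reference):
    """Harmonic mean of unigram precision and recall against one reference."""
    hyp, ref = tokenize(hypothesis), tokenize(reference)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both sequences.
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def bleu1(hypothesis, reference):
    """Simplified BLEU-1: clipped unigram precision times a brevity penalty."""
    hyp, ref = tokenize(hypothesis), tokenize(reference)
    if not hyp:
        return 0.0
    clipped = sum((Counter(hyp) & Counter(ref)).values())
    precision = clipped / len(hyp)
    # Penalize hypotheses that are shorter than the reference.
    if len(hyp) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))
    return brevity_penalty * precision


reference = "i am doing well thanks"
paraphrase = "pretty good thank you"   # acceptable reply, no shared tokens
partial = "i am well"                  # shares three tokens with the reference

print(unigram_f1(paraphrase, reference), bleu1(paraphrase, reference))
print(unigram_f1(partial, reference), bleu1(partial, reference))
```

In this toy case the paraphrase is a perfectly reasonable reply yet scores zero on both measures because it shares no tokens with the reference, which is the kind of mismatch that reference-free metrics such as USR and MaUde try to avoid.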

Papers

Showing 26-50 of 606 papers

Title | Status | Hype
SafeDialBench: A Fine-Grained Safety Benchmark for Large Language Models in Multi-Turn Dialogues with Diverse Jailbreak Attacks | Code | 1
PsyPlay: Personality-Infused Role-Playing Conversational Agents | - | 0
Conversation AI Dialog for Medicare powered by Finetuning and Retrieval Augmented Generation | - | 0
TV-Dialogue: Crafting Theme-Aware Video Dialogues with Immersive Interaction | - | 0
Advancing Multi-Party Dialogue Framework with Speaker-ware Contrastive Learning | - | 0
Leveraging Chain of Thought towards Empathetic Spoken Dialogue without Corresponding Question-Answering Data | - | 0
Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs | Code | 0
BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation | Code | 0
Real-Time Textless Dialogue Generation | Code | 0
SLIDE: Integrating Speech Language Model with LLM for Spontaneous Spoken Dialogue Generation | Code | 0
STAMPsy: Towards SpatioTemporal-Aware Mixed-Type Dialogues for Psychological Counseling | - | 0
Multi-Party Supervised Fine-tuning of Language Models for Multi-Party Dialogue Generation | - | 0
DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling | Code | 1
Intent-Aware Dialogue Generation and Multi-Task Contrastive Learning for Multi-Turn Intent Classification | - | 0
Unstructured Text Enhanced Open-domain Dialogue System: A Systematic Survey | - | 0
Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Models | Code | 0
A Static and Dynamic Attention Framework for Multi Turn Dialogue Generation | - | 0
A Stack-Propagation Framework for Low-Resource Personalized Dialogue Generation | - | 0
Policy-driven Knowledge Selection and Response Generation for Document-grounded Dialogue | - | 0
An Automatic and Cost-Efficient Peer-Review Framework for Language Generation Evaluation | - | 0
Listening to Patients: A Framework of Detecting and Mitigating Patient Misreport for Medical Dialogue Generation | - | 0
PersoBench: Benchmarking Personalized Response Generation in Large Language Models | Code | 0
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions | Code | 1
A Two-Stage Proactive Dialogue Generator for Efficient Clinical Information Collection Using Large Language Model | - | 0
DiaSynth: Synthetic Dialogue Generation Framework for Low Resource Dialogue Applications | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | LMEDR | Avg F1 | 21.99 | - | Unverified
2 | P^2 Bot | Avg F1 | 19.77 | - | Unverified
3 | TransferTransfo | Avg F1 | 19.09 | - | Unverified
4 | Seq2Seq + Attention | Avg F1 | 16.18 | - | Unverified
5 | Synthesizer (R+V) | BLEU-1 | 14.7 | - | Unverified
6 | KV Profile Memory | Avg F1 | 11.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Classification-based model | Slot Accuracy | 0.97 | - | Unverified
2 | Two-in-one model | Slot Accuracy | 0.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | EVA | mauve | 0.97 | - | Unverified
2 | Per-BOB | mauve | 0.95 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | mm | 1 in 10 R@2 | 5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ∞-former (Sticky memories) | F1 | 9.01 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ∞-former (Sticky memories + initialized GPT-2 Small) | Perplexity | 32.48 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SpaceFusion | interest (human) | 2.53 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MrRNN Act.-Ent. | F1 | 4.63 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MrRNN Act.-Ent. | Accuracy | 34.48 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MrRNN Act.-Ent. | F1 | 11.43 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MrRNN Act.-Ent. | Accuracy | 95.04 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MrRNN Act.-Ent. | F1 | 3.72 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MrRNN Act.-Ent. | Accuracy | 29.01 | - | Unverified