SOTAVerified

Response Generation

A task where an agent should play the $DE$ role and generate a text to respond to a $P$ message.

Papers

Showing 51–75 of 914 papers

| Title | Status | Hype |
|---|---|---|
| LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant | | 0 |
| LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation | | 0 |
| LLM-Safety Evaluations Lack Robustness | | 0 |
| LexRAG: Benchmarking Retrieval-Augmented Generation in Multi-Turn Legal Consultation Conversation | Code | 1 |
| ProAI: Proactive Multi-Agent Conversational AI with Structured Knowledge Base for Psychiatric Diagnosis | | 0 |
| The RAG Paradox: A Black-Box Attack Exploiting Unintentional Vulnerabilities in Retrieval-Augmented Generation Systems | | 0 |
| AAD-LLM: Neural Attention-Driven Auditory Scene Understanding | | 0 |
| SS-MPC: A Sequence-Structured Multi-Party Conversation System | | 0 |
| Cross-Format Retrieval-Augmented Generation in XR with LLMs for Context-Aware Maintenance Assistance | | 0 |
| LLM-Enhanced Dialogue Management for Full-Duplex Spoken Dialogue Systems | | 0 |
| PSCon: Product Search Through Conversations | Code | 0 |
| On-Device LLMs for Home Assistant: Dual Role in Intent Detection and Response Generation | | 0 |
| Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation | | 0 |
| Efficient Response Generation Method Selection for Fine-Tuning Large Language Models | | 0 |
| DiSCo: Device-Server Collaborative LLM-Based Text Streaming Services | | 0 |
| Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception | | 0 |
| MuDoC: An Interactive Multimodal Document-grounded Conversational AI System | | 0 |
| DiMA: An LLM-Powered Ride-Hailing Assistant at DiDi | | 0 |
| Grammar Control in Dialogue Response Generation for Language Learning Chatbots | Code | 0 |
| Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models | Code | 1 |
| On Memory Construction and Retrieval for Personalized Conversational Agents | | 0 |
| MultiQ&A: An Analysis in Measuring Robustness via Automated Crowdsourcing of Question Perturbations and Answers | | 0 |
| Vision-Integrated LLMs for Autonomous Driving Assistance: Human Performance Comparison and Trust Evaluation | | 0 |
| CAMI: A Counselor Agent Supporting Motivational Interviewing through State Inference and Topic Exploration | | 0 |
| A Video-grounded Dialogue Dataset and Metric for Event-driven Activities | Code | 0 |

Benchmark Results

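| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PaCE | BLEU | 34.1 | | Unverified |
| 2 | BART-large | BLEU | 33.1 | | Unverified |
| 3 | BART-base | BLEU | 29.4 | | Unverified |
| 4 | MTN | BLEU | 21.7 | | Unverified |
| 5 | GPT-2 | BLEU | 19.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LED(Q,F) | Message-F1 | 19.54 | | Unverified |
| 2 | LED(Q,P,H) | Message-F1 | 16.14 | | Unverified |
| 3 | LED(Q,P) | Message-F1 | 14.25 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PaCE | BLEU | 22 | | Unverified |
| 2 | SimpleTOD | BLEU | 20.3 | | Unverified |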