SOTAVerified

Chatbot

A chatbot, or conversational AI, is a language model designed to hold conversations with humans.

Papers

Showing 26–50 of 971 papers

Title	Date	Tasks	Status	Hype
AI vs. Human Judgment of Content Moderation: LLM-as-a-Judge and Ethics-Based Response Refusals	May 21, 2025	Benchmarking, Chatbot	Unverified	0
What is Stigma Attributed to? A Theory-Grounded, Expert-Annotated Interview Corpus for Demystifying Mental-Health Stigma	May 19, 2025	Chatbot	Code Available	1
Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models	May 19, 2025	Benchmarking, Chatbot	Code Available	1
Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance	May 18, 2025	Chatbot	Unverified	0
Let's have a chat with the EU AI Act	May 17, 2025	Chatbot, RAG	Unverified	0
GenAI Security: Outsmarting the Bots with a Proactive Testing Framework	May 14, 2025	Chatbot	Unverified	0
WaLLM -- Insights from an LLM-Powered Chatbot deployment via WhatsApp	May 13, 2025	Chatbot, Nutrition	Unverified	0
An empathic GPT-based chatbot to talk about mental disorders with Spanish teenagers	May 9, 2025	Chatbot, Language Modeling	Unverified	0
Large Language Models are often politically extreme, usually ideologically inconsistent, and persuasive even in informational contexts	May 7, 2025	Chatbot, Persuasiveness	Unverified	0
Steerable Chatbots: Personalizing LLMs with Preference-Based Activation Steering	May 7, 2025	Chatbot	Unverified	0
A Proposal for Evaluating the Operational Risk for ChatBots based on Large Language Models	May 7, 2025	Chatbot, Code Generation	Unverified	0
LlamaFirewall: An open source guardrail system for building secure AI agents	May 6, 2025	Chatbot	Unverified	0
Social Biases in Knowledge Representations of Wikidata separates Global North from Global South	May 5, 2025	Attribute, Chatbot	Code Available	0
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis	May 5, 2025	Chatbot, Decoder	Code Available	3
Emotions in the Loop: A Survey of Affective Computing for Emotional Support	May 2, 2025	Chatbot, Emotion Recognition	Unverified	0
Enhancing ML Model Interpretability: Leveraging Fine-Tuned Large Language Models for Better Understanding of AI	May 2, 2025	Chatbot	Unverified	0
The Leaderboard Illusion	Apr 29, 2025	Benchmarking, Chatbot	Unverified	0
Chatbot Arena Meets Nuggets: Towards Explanations and Diagnostics in the Evaluation of LLM Responses	Apr 28, 2025	Chatbot, Diagnostic	Unverified	0
AI Chatbots for Mental Health: Values and Harms from Lived Experiences of Depression	Apr 26, 2025	Chatbot, Management	Unverified	0
Scaling Laws For Scalable Oversight	Apr 25, 2025	Chatbot	Unverified	0
Cancer-Myth: Evaluating AI Chatbot on Patient Questions with False Presuppositions	Apr 15, 2025	Chatbot	Code Available	0
CHARM: Calibrating Reward Models With Chatbot Arena Scores	Apr 14, 2025	Chatbot	Code Available	1
Confirmation Bias in Generative AI Chatbots: Mechanisms, Risks, Mitigation Strategies, and Future Research Directions	Apr 12, 2025	Chatbot	Unverified	0
Learning from Elders: Making an LLM-powered Chatbot for Retirement Communities more Accessible through User-centered Design	Apr 11, 2025	Chatbot, Information Retrieval	Unverified	0
Data Requirement Goal Modeling for Machine Learning Systems	Apr 10, 2025	Chatbot	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Yi 34B Chat	Average win rate	27.2	—	Unverified