SOTAVerified

Chatbot

A chatbot, or conversational AI, is a language model designed to hold conversations with humans.

Source: Open Data Chatbot


Papers

Showing 301-350 of 971 papers

Title | Status | Hype
Annotation alignment: Comparing LLM and human annotations of conversational safety | - | 0
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild | Code | 3
Speech-based Clinical Depression Screening: An Empirical Study | - | 0
The Challenges of Evaluating LLM Applications: An Analysis of Automated, Human, and LLM-Based Approaches | - | 0
MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures | - | 0
Demo: Soccer Information Retrieval via Natural Queries using SoccerRAG | Code | 0
Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study | - | 0
Inverse Constitutional AI: Compressing Preferences into Principles | Code | 1
Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions | Code | 0
Phantom: General Trigger Attacks on Retrieval Augmented Language Generation | - | 0
Designing an Evaluation Framework for Large Language Models in Astronomy Research | Code | 0
Automatic detection of cognitive impairment in elderly people using an entertainment chatbot with Natural Language Processing capabilities | - | 0
ChatGPT as the Marketplace of Ideas: Should Truth-Seeking Be the Goal of AI Content Governance? | - | 0
Coaching Copilot: Blended Form of an LLM-Powered Chatbot and a Human Coach to Effectively Support Self-Reflection for Leadership Growth | - | 0
DuanzAI: Slang-Enhanced LLM with Prompt for Humor Understanding | Code | 0
Evaluation of the Programming Skills of Large Language Models | - | 0
SimPO: Simple Preference Optimization with a Reference-Free Reward | Code | 4
Evaluating Large Language Models with Human Feedback: Establishing a Swedish Benchmark | Code | 0
From Human-to-Human to Human-to-Bot Conversations in Software Engineering | - | 0
Can AI Relate: Testing Large Language Model Response for Mental Health Support | Code | 0
Large Language Models Can Infer Personality from Free-Form User Interactions | - | 0
CPS-LLM: Large Language Model based Safe Usage Plan Generator for Human-in-the-Loop Human-in-the-Plant Cyber-Physical System | - | 0
SynDy: Synthetic Dynamic Dataset Generation Framework for Misinformation Tasks | - | 0
Tailoring Vaccine Messaging with Common-Ground Opinions | Code | 0
From Questions to Insightful Answers: Building an Informed Chatbot for University Resources | - | 0
RLHF Workflow: From Reward Modeling to Online RLHF | Code | 5
Exploring the Potential of Conversational AI Support for Agent-Based Social Simulation Model Design | - | 0
Persona Inconstancy in Multi-Agent LLM Collaboration: Conformity, Confabulation, and Impersonation | Code | 0
MedDoc-Bot: A Chat Tool for Comparative Analysis of Large Language Models in the Context of the Pediatric Hypertension Guideline | Code | 0
MAmmoTH2: Scaling Instructions from the Web | - | 0
WildChat: 1M ChatGPT Interaction Logs in the Wild | - | 0
From Keyboard to Chatbot: An AI-powered Integration Platform with Large-Language Models for Teaching Computational Thinking for Young Children | - | 0
Lessons from the Use of Natural Language Inference (NLI) in Requirements Engineering Tasks | - | 0
Domain-Specific Improvement on Psychotherapy Chatbot Using Assistant | - | 0
Beyond Code Generation: An Observational Study of ChatGPT Usage in Software Engineering Practice | - | 0
Using Adaptive Empathetic Responses for Teaching English | Code | 0
Incorporating Different Verbal Cues to Improve Text-Based Computer-Delivered Health Messaging | - | 0
MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering | - | 0
LuminLab: An AI-Powered Building Retrofit and Energy Modelling Platform | - | 0
Integrating Physiological Data with Large Language Models for Empathic Human-AI Interaction | - | 0
Deceptive Patterns of Intelligent and Interactive Writing Assistants | - | 0
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators | Code | 5
Physics Event Classification Using Large Language Models | Code | 0
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues | - | 0
Token Trails: Navigating Contextual Depths in Conversational AI with ChatLLM | - | 0
Entertainment chatbot for the digital inclusion of elderly people without abstraction capabilities | - | 0
A Survey on Large Language Models from Concept to Implementation | - | 0
LARA: Linguistic-Adaptive Retrieval-Augmentation for Multi-Turn Intent Classification | - | 0
Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review | - | 0
Comprehensive Lipidomic Automation Workflow using Large Language Models | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Yi 34B Chat | Average win rate | 27.2 | - | Unverified