SOTAVerified

Chatbot

A chatbot, or conversational AI, is a language model designed and implemented to hold conversations with humans.

Source: Open Data Chatbot

Papers

Showing 301–325 of 971 papers

| Title | Status | Hype |
| --- | --- | --- |
| Annotation alignment: Comparing LLM and human annotations of conversational safety | | 0 |
| WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild | Code | 3 |
| Speech-based Clinical Depression Screening: An Empirical Study | | 0 |
| The Challenges of Evaluating LLM Applications: An Analysis of Automated, Human, and LLM-Based Approaches | | 0 |
| MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures | | 0 |
| Demo: Soccer Information Retrieval via Natural Queries using SoccerRAG | Code | 0 |
| Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study | | 0 |
| Inverse Constitutional AI: Compressing Preferences into Principles | Code | 1 |
| Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions | Code | 0 |
| Phantom: General Trigger Attacks on Retrieval Augmented Language Generation | | 0 |
| Designing an Evaluation Framework for Large Language Models in Astronomy Research | Code | 0 |
| Automatic detection of cognitive impairment in elderly people using an entertainment chatbot with Natural Language Processing capabilities | | 0 |
| ChatGPT as the Marketplace of Ideas: Should Truth-Seeking Be the Goal of AI Content Governance? | | 0 |
| Coaching Copilot: Blended Form of an LLM-Powered Chatbot and a Human Coach to Effectively Support Self-Reflection for Leadership Growth | | 0 |
| DuanzAI: Slang-Enhanced LLM with Prompt for Humor Understanding | Code | 0 |
| Evaluation of the Programming Skills of Large Language Models | | 0 |
| SimPO: Simple Preference Optimization with a Reference-Free Reward | Code | 4 |
| Evaluating Large Language Models with Human Feedback: Establishing a Swedish Benchmark | Code | 0 |
| From Human-to-Human to Human-to-Bot Conversations in Software Engineering | | 0 |
| Can AI Relate: Testing Large Language Model Response for Mental Health Support | Code | 0 |
| Large Language Models Can Infer Personality from Free-Form User Interactions | | 0 |
| CPS-LLM: Large Language Model based Safe Usage Plan Generator for Human-in-the-Loop Human-in-the-Plant Cyber-Physical System | | 0 |
| SynDy: Synthetic Dynamic Dataset Generation Framework for Misinformation Tasks | | 0 |
| Tailoring Vaccine Messaging with Common-Ground Opinions | Code | 0 |
| From Questions to Insightful Answers: Building an Informed Chatbot for University Resources | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Yi 34B Chat | Average win rate | 27.2 | | Unverified |