
Chatbot

A chatbot, or conversational AI, is a language model designed to hold conversations with humans.

Papers


Showing 1–25 of 971 papers

Title	Date	Tasks	Status	Hype
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference	Mar 7, 2024	Chatbot	Code Available	14
Yi: Open Foundation Models by 01.AI	Mar 7, 2024	Attribute, Chatbot	Code Available	9
GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot	Dec 3, 2024	Automatic Speech Recognition (ASR)	Code Available	7
Scaling Speech-Text Pre-training with Synthetic Interleaved Data	Nov 26, 2024	Automatic Speech Recognition (ASR)	Code Available	7
DeepSeek-VL: Towards Real-World Vision-Language Understanding	Mar 8, 2024	Chatbot, Language Modelling	Code Available	7
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset	Sep 21, 2023	Chatbot, Diversity	Code Available	7
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena	Jun 9, 2023	Chatbot, Language Modelling	Code Available	7
Mistral 7B	Oct 10, 2023	answerability prediction, Arithmetic Reasoning	Code Available	6
h2oGPT: Democratizing Large Language Models	Jun 13, 2023	Chatbot, Fairness	Code Available	6
QLoRA: Efficient Finetuning of Quantized LLMs	May 23, 2023	Chatbot, GPU	Code Available	6
Jamba-1.5: Hybrid Transformer-Mamba Models at Scale	Aug 22, 2024	Chatbot, Instruction Following	Code Available	5
From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline	Jun 17, 2024	Chatbot	Code Available	5
RLHF Workflow: From Reward Modeling to Online RLHF	May 13, 2024	Chatbot, HumanEval	Code Available	5
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators	Apr 6, 2024	Chatbot, counterfactual	Code Available	5
SimPO: Simple Preference Optimization with a Reference-Free Reward	May 23, 2024	Chatbot, Instruction Following	Code Available	4
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data	Apr 3, 2023	Chatbot, Language Modeling	Code Available	4
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis	May 5, 2025	Chatbot, Decoder	Code Available	3
Prompt-to-Leaderboard	Feb 20, 2025	Chatbot, Language Modeling	Code Available	3
ELIZA Reanimated: The world's first chatbot restored on the world's first time sharing system	Jan 12, 2025	Chatbot	Code Available	3
Improving Model Evaluation using SMART Filtering of Benchmark Datasets	Oct 26, 2024	Chatbot, Diversity	Code Available	3
PsyDI: Towards a Personalized and Progressively In-depth Chatbot for Psychological Measurements	Jul 22, 2024	Chatbot	Code Available	3
Language Model Council: Democratically Benchmarking Foundation Models on Highly Subjective Tasks	Jun 12, 2024	Benchmarking, Chatbot	Code Available	3
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild	Jun 7, 2024	Benchmarking, Chatbot	Code Available	3
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia	May 23, 2023	Chatbot, Hallucination	Code Available	3
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development	May 22, 2025	Bug fixing, Chatbot	Code Available	2


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Yi 34B Chat	Average win rate	27.2	—	Unverified