Chatbot

A chatbot, or conversational AI, is a language model designed to hold conversations with humans.

Papers

Showing 1–50 of 971 papers

Title	Date	Tasks	Status	Hype
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference	Mar 7, 2024	Chatbot	Code Available	14
Yi: Open Foundation Models by 01.AI	Mar 7, 2024	Attribute, Chatbot	Code Available	9
GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot	Dec 3, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	7
Scaling Speech-Text Pre-training with Synthetic Interleaved Data	Nov 26, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	7
DeepSeek-VL: Towards Real-World Vision-Language Understanding	Mar 8, 2024	Chatbot, Language Modelling	Code Available	7
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset	Sep 21, 2023	Chatbot, Diversity	Code Available	7
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena	Jun 9, 2023	Chatbot, Language Modelling	Code Available	7
Mistral 7B	Oct 10, 2023	answerability prediction, Arithmetic Reasoning	Code Available	6
h2oGPT: Democratizing Large Language Models	Jun 13, 2023	Chatbot, Fairness	Code Available	6
QLoRA: Efficient Finetuning of Quantized LLMs	May 23, 2023	Chatbot, GPU	Code Available	6
Jamba-1.5: Hybrid Transformer-Mamba Models at Scale	Aug 22, 2024	Chatbot, Instruction Following	Code Available	5
From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline	Jun 17, 2024	Chatbot	Code Available	5
RLHF Workflow: From Reward Modeling to Online RLHF	May 13, 2024	Chatbot, HumanEval	Code Available	5
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators	Apr 6, 2024	Chatbot, counterfactual	Code Available	5
SimPO: Simple Preference Optimization with a Reference-Free Reward	May 23, 2024	Chatbot, Instruction Following	Code Available	4
Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data	Apr 3, 2023	Chatbot, Language Modeling	Code Available	4
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis	May 5, 2025	Chatbot, Decoder	Code Available	3
Prompt-to-Leaderboard	Feb 20, 2025	Chatbot, Language Modeling	Code Available	3
ELIZA Reanimated: The world's first chatbot restored on the world's first time sharing system	Jan 12, 2025	Chatbot	Code Available	3
Improving Model Evaluation using SMART Filtering of Benchmark Datasets	Oct 26, 2024	Chatbot, Diversity	Code Available	3
PsyDI: Towards a Personalized and Progressively In-depth Chatbot for Psychological Measurements	Jul 22, 2024	Chatbot	Code Available	3
Language Model Council: Democratically Benchmarking Foundation Models on Highly Subjective Tasks	Jun 12, 2024	Benchmarking, Chatbot	Code Available	3
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild	Jun 7, 2024	Benchmarking, Chatbot	Code Available	3
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia	May 23, 2023	Chatbot, Hallucination	Code Available	3
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development	May 22, 2025	Bug fixing, Chatbot	Code Available	2
Language Model Powered Digital Biology with BRAD	Sep 4, 2024	Chatbot, Code Generation	Code Available	2
Efficient LLM Scheduling by Learning to Rank	Aug 28, 2024	Blocking, Chatbot	Code Available	2
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models	Jun 26, 2024	Chatbot, Red Teaming	Code Available	2
Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction	Feb 28, 2024	Chatbot, Reconstruction Attack	Code Available	2
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding	Feb 14, 2024	Chatbot, Code Generation	Code Available	2
LLM4EDA: Emerging Progress in Large Language Models for Electronic Design Automation	Dec 28, 2023	Answer Generation, Chatbot	Code Available	2
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts	Oct 3, 2023	Chatbot, Image Captioning	Code Available	2
EduChat: A Large-Scale Language Model-based Chatbot System for Intelligent Education	Aug 5, 2023	Chatbot, Language Modeling	Code Available	2
Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models	May 24, 2023	Chatbot, Natural Language Understanding	Code Available	2
MemoryBank: Enhancing Large Language Models with Long-Term Memory	May 17, 2023	Chatbot	Code Available	2
SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support	Apr 30, 2023	Chatbot	Code Available	2
Ten Quick Tips for Harnessing the Power of ChatGPT/GPT-4 in Computational Biology	Mar 29, 2023	Chatbot, Prompt Engineering	Code Available	2
CHAI: A CHatbot AI for Task-Oriented Dialogue with Offline Reinforcement Learning	Apr 18, 2022	Chatbot, Offline RL	Code Available	2
EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training	Mar 17, 2022	Chatbot	Code Available	2
Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models	May 19, 2025	Benchmarking, Chatbot	Code Available	1
What is Stigma Attributed to? A Theory-Grounded, Expert-Annotated Interview Corpus for Demystifying Mental-Health Stigma	May 19, 2025	Chatbot	Code Available	1
CHARM: Calibrating Reward Models With Chatbot Arena Scores	Apr 14, 2025	Chatbot	Code Available	1
EvalTree: Profiling Language Model Weaknesses via Hierarchical Capability Trees	Mar 11, 2025	Chatbot, Language Modeling	Code Available	1
Knowledge Graph-Driven Retrieval-Augmented Generation: Integrating Deepseek-R1 with Weaviate for Advanced Chatbot Applications	Feb 16, 2025	Chatbot, Language Modeling	Code Available	1
Improving Your Model Ranking on Chatbot Arena by Vote Rigging	Jan 29, 2025	Chatbot	Code Available	1
MiniGPT-Pancreas: Multimodal Large Language Model for Pancreas Cancer Classification and Detection	Dec 20, 2024	Cancer Classification, Chatbot	Code Available	1
TransitGPT: A Generative AI-based framework for interacting with GTFS data using Large Language Models	Dec 7, 2024	Chatbot, Natural Language Queries	Code Available	1
Learning to Assist Humans without Inferring Rewards	Nov 4, 2024	Chatbot, reinforcement-learning	Code Available	1
Refusal-Trained LLMs Are Easily Jailbroken As Browser Agents	Oct 11, 2024	Chatbot, Red Teaming	Code Available	1
A Recipe For Building a Compliant Real Estate Chatbot	Oct 7, 2024	Chatbot, Instruction Following	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Yi 34B Chat	Average win rate	27.2	—	Unverified