SOTAVerified

Large Language Model

Papers

Showing 251–275 of 6097 papers

Title | Status | Hype
SALMONN: Towards Generic Hearing Abilities for Large Language Models | Code | 3
Llemma: An Open Language Model For Mathematics | Code | 3
OceanGPT: A Large Language Model for Ocean Science Tasks | Code | 3
WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences | Code | 3
How Can Recommender Systems Benefit from Large Language Models: A Survey | Code | 3
HuatuoGPT, towards Taming Language Model to Be a Doctor | Code | 3
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia | Code | 3
Hierarchical Prompting Assists Large Language Model on Web Navigation | Code | 3
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | Code | 3
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities | Code | 3
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification | Code | 3
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages | Code | 3
ThoughtSource: A central hub for large language model reasoning data | Code | 3
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering | Code | 2
Seq vs Seq: An Open Suite of Paired Encoders and Decoders | Code | 2
Open Source Planning & Control System with Language Agents for Autonomous Scientific Discovery | Code | 2
HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context | Code | 2
Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning | Code | 2
Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster | Code | 2
video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models | Code | 2
SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning | Code | 2
SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks | Code | 2
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science | Code | 2
Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions | Code | 2
CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale | Code | 2
