SOTAVerified

Large Language Model

Papers

Showing 251–300 of 6097 papers

Title | Status | Hype
SALMONN: Towards Generic Hearing Abilities for Large Language Models | Code | 3
Llemma: An Open Language Model For Mathematics | Code | 3
OceanGPT: A Large Language Model for Ocean Science Tasks | Code | 3
WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences | Code | 3
How Can Recommender Systems Benefit from Large Language Models: A Survey | Code | 3
HuatuoGPT, towards Taming Language Model to Be a Doctor | Code | 3
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia | Code | 3
Hierarchical Prompting Assists Large Language Model on Web Navigation | Code | 3
RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | Code | 3
SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities | Code | 3
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification | Code | 3
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages | Code | 3
ThoughtSource: A central hub for large language model reasoning data | Code | 3
Seq vs Seq: An Open Suite of Paired Encoders and Decoders | Code | 2
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering | Code | 2
Open Source Planning & Control System with Language Agents for Autonomous Scientific Discovery | Code | 2
HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context | Code | 2
Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning | Code | 2
Pre-Trained LLM is a Semantic-Aware and Generalizable Segmentation Booster | Code | 2
SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning | Code | 2
video-SALMONN 2: Captioning-Enhanced Audio-Visual Large Language Models | Code | 2
SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks | Code | 2
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science | Code | 2
Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions | Code | 2
CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale | Code | 2
Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning | Code | 2
Compiler Optimization via LLM Reasoning for Efficient Model Serving | Code | 2
FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion | Code | 2
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models | Code | 2
ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering | Code | 2
cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning | Code | 2
Zero-Shot Vision Encoder Grafting via LLM Surrogates | Code | 2
LLaMEA-BO: A Large Language Model Evolutionary Algorithm for Automatically Generating Bayesian Optimization Algorithms | Code | 2
WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference | Code | 2
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding | Code | 2
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents | Code | 2
CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming | Code | 2
LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners | Code | 2
Demystifying and Enhancing the Efficiency of Large Language Model Based Search Agents | Code | 2
Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement | Code | 2
YuLan-OneSim: Towards the Next Generation of Social Simulator with Large Language Models | Code | 2
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering | Code | 2
DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation | Code | 2
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance | Code | 2
Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation | Code | 2
MemEngine: A Unified and Modular Library for Developing Advanced Memory of LLM-based Agents | Code | 2
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer | Code | 2
ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model | Code | 2
SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model | Code | 2
GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation | Code | 2

No leaderboard results yet.