SOTAVerified

Large Language Model

Papers

Showing 1101–1150 of 6097 papers

Title | Status | Hype
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration | Code | 1
AttributionBench: How Hard is Automatic Attribution Evaluation? | Code | 1
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time | Code | 1
Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit | Code | 1
RARR: Researching and Revising What Language Models Say, Using Language Models | Code | 1
AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge | Code | 1
DualAD: Dual-Layer Planning for Reasoning in Autonomous Driving | Code | 1
A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation | Code | 1
DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design | Code | 1
MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis | Code | 1
Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity | Code | 1
Measuring General Intelligence with Generated Games | Code | 1
DrugAssist: A Large Language Model for Molecule Optimization | Code | 1
Meaning Typed Prompting: A Technique for Efficient, Reliable Structured Output Generation | Code | 1
MechAgents: Large language model multi-agent collaborations can solve mechanics problems, generate new data, and integrate knowledge | Code | 1
DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing | Code | 1
DRG-LLaMA : Tuning LLaMA Model to Predict Diagnosis-related Group for Hospitalized Patients | Code | 1
MedFILIP: Medical Fine-grained Language-Image Pre-training | Code | 1
MELLM: Exploring LLM-Powered Micro-Expression Understanding Enhanced by Subtle Motion Perception | Code | 1
DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer | Code | 1
DOMINO: A Dual-System for Multi-step Visual Language Reasoning | Code | 1
Matching Patients to Clinical Trials with Large Language Models | Code | 1
Do Large Language Model Benchmarks Test Reliability? | Code | 1
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Code | 1
Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-Ranking | Code | 1
Making Language Models Better Tool Learners with Execution Feedback | Code | 1
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems | Code | 1
MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration | Code | 1
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models | Code | 1
MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL | Code | 1
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling | Code | 1
A Survey on Self-Supervised Graph Foundation Models: Knowledge-Based Perspective | Code | 1
An In-Context Learning Agent for Formal Theorem-Proving | Code | 1
Dynamic Updates for Language Adaptation in Visual-Language Tracking | Code | 1
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes | Code | 1
Lshan-1.0 Technical Report | Code | 1
Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections | Code | 1
LUMA: A Benchmark Dataset for Learning from Uncertain and Multimodal Data | Code | 1
Lossless Acceleration of Large Language Model via Adaptive N-gram Parallel Decoding | Code | 1
Dissecting Human and LLM Preferences | Code | 1
Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions | Code | 1
LRSCLIP: A Vision-Language Foundation Model for Aligning Remote Sensing Image with Longer Text | Code | 1
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT | Code | 1
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model | Code | 1
A Cross-Modal Approach to Silent Speech with LLM-Enhanced Recognition | Code | 1
Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing | Code | 1
Aladdin: Zero-Shot Hallucination of Stylized 3D Assets from Abstract Scene Descriptions | Code | 1
Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model | Code | 1
CoditT5: Pretraining for Source Code and Natural Language Editing | Code | 1
Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources | Code | 1

No leaderboard results yet.