SOTAVerified

Large Language Model

Papers

Showing 1301–1350 of 6097 papers

Title | Status | Hype
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | Code | 1
Glinthawk: A Two-Tiered Architecture for Offline LLM Inference | Code | 1
Multi-label Sequential Sentence Classification via Large Language Model | Code | 1
DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling | Code | 1
Global-Local Collaborative Inference with LLM for Lidar-Based Open-Vocabulary Detection | Code | 1
Ranked List Truncation for Large Language Model-based Re-Ranking | Code | 1
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Code | 1
AstroAgents: A Multi-Agent AI for Hypothesis Generation from Mass Spectrometry Data | Code | 1
CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models | Code | 1
CityBench: Evaluating the Capabilities of Large Language Models for Urban Tasks | Code | 1
Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning | Code | 1
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models | Code | 1
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention | Code | 1
On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning | Code | 1
CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference | Code | 1
CoS: Enhancing Personalization and Mitigating Bias with Context Steering | Code | 1
CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing | Code | 1
Citekit: A Modular Toolkit for Large Language Model Citation Generation | Code | 1
Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections | Code | 1
Explaining Relationships Between Scientific Documents | Code | 1
CIPHER: Cybersecurity Intelligent Penetration-testing Helper for Ethical Researcher | Code | 1
Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model | Code | 1
Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients | Code | 1
Dissecting Human and LLM Preferences | Code | 1
A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization | Code | 1
Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models | Code | 1
ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation | Code | 1
MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property | Code | 1
Motif: Intrinsic Motivation from Artificial Intelligence Feedback | Code | 1
Working Memory Capacity of ChatGPT: An Empirical Study | Code | 1
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model | Code | 1
Distributed LLMs and Multimodal Large Language Models: A Survey on Advances, Challenges, and Future Directions | Code | 1
REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction | Code | 1
Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources | Code | 1
MoqaGPT : Zero-Shot Multi-modal Open-domain Question Answering with Large Language Model | Code | 1
CoVR-2: Automatic Data Construction for Composed Video Retrieval | Code | 1
Hallucinations in Large Multilingual Translation Models | Code | 1
CPLLM: Clinical Prediction with Large Language Models | Code | 1
AllSpark: A Multimodal Spatio-Temporal General Intelligence Model with Ten Modalities via Language as a Reference Framework | Code | 1
HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | Code | 1
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences | Code | 1
MSCPT: Few-shot Whole Slide Image Classification with Multi-scale and Context-focused Prompt Tuning | Code | 1
ChemMLLM: Chemical Multimodal Large Language Model | Code | 1
ChemLLM: A Chemical Large Language Model | Code | 1
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning | Code | 1
MolecularGPT: Open Large Language Model (LLM) for Few-Shot Molecular Property Prediction | Code | 1
CRAKEN: Cybersecurity LLM Agent with Knowledge-Based Execution | Code | 1
Automatic Evaluation of Attribution by Large Language Models | Code | 1
DesCo: Learning Object Recognition with Rich Language Descriptions | Code | 1
Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach | Code | 1
