SOTAVerified

Large Language Model

Papers

Showing 601-650 of 6097 papers

Title | Status | Hype
Lion: Adversarial Distillation of Proprietary Large Language Models | Code | 2
Listen, Think, and Understand | Code | 2
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model | Code | 2
StructGPT: A General Framework for Large Language Model to Reason over Structured Data | Code | 2
Large Language Model Guided Tree-of-Thought | Code | 2
TALLRec: An Effective and Efficient Tuning Framework to Align Large Language Model with Recommendation | Code | 2
RRHF: Rank Responses to Align Language Models with Human Feedback without tears | Code | 2
Graph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Augmented by ChatGPT | Code | 2
Graph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Dataset Augmented by ChatGPT | Code | 2
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research | Code | 2
Language Models can Solve Computer Tasks | Code | 2
Large Language Model Instruction Following: A Survey of Progresses and Challenges | Code | 2
PaLM-E: An Embodied Multimodal Language Model | Code | 2
OpenICL: An Open-Source Framework for In-context Learning | Code | 2
Prophet: Prompting Large Language Models with Complementary Answer Heuristics for Knowledge-based Visual Question Answering | Code | 2
Reward Design with Language Models | Code | 2
An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation | Code | 2
Accelerating Large Language Model Decoding with Speculative Sampling | Code | 2
Muse: Text-To-Image Generation via Masked Generative Transformers | Code | 2
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization | Code | 2
Parsel: Algorithmic Reasoning with Language Models by Composing Decompositions | Code | 2
TabLLM: Few-shot Classification of Tabular Data with Large Language Models | Code | 2
Generate rather than Retrieve: Large Language Models are Strong Context Generators | Code | 2
Solving Quantitative Reasoning Problems with Language Models | Code | 2
Learning to Tune Like an Expert: Interpretable and Scene-Aware Navigation via MLLM Reasoning and CVAE-Based Adaptation | Code | 1
LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs | Code | 1
Dataset Distillation via Vision-Language Category Prototype | Code | 1
Where, What, Why: Towards Explainable Driver Attention Prediction | Code | 1
Decoupled Seg Tokens Make Stronger Reasoning Video Segmenter and Grounder | Code | 1
GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching | Code | 1
MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis | Code | 1
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective | Code | 1
DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Code | 1
The Condition Number as a Scale-Invariant Proxy for Information Encoding in Neural Units | Code | 1
Probe before You Talk: Towards Black-box Defense against Backdoor Unalignment for Large Language Models | Code | 1
LMR-BENCH: Evaluating LLM Agent's Ability on Reproducing Language Modeling Research | Code | 1
RMIT-ADM+S at the SIGIR 2025 LiveRAG Challenge | Code | 1
TagRouter: Learning Route to LLMs through Tags for Open-Domain Text Generation Tasks | Code | 1
A Benchmark for Generalizing Across Diverse Team Strategies in Competitive Pokémon | Code | 1
Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning | Code | 1
Adapting Vision-Language Foundation Model for Next Generation Medical Ultrasound Image Analysis | Code | 1
EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements | Code | 1
Eigenspectrum Analysis of Neural Networks without Aspect Ratio Bias | Code | 1
DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration | Code | 1
Agentomics-ML: Autonomous Machine Learning Experimentation Agent for Genomic and Transcriptomic Data | Code | 1
OpenMaskDINO3D: Reasoning 3D Segmentation via Large Language Model | Code | 1
POSS: Position Specialist Generates Better Draft for Speculative Decoding | Code | 1
RewardAnything: Generalizable Principle-Following Reward Models | Code | 1
DefenderBench: A Toolkit for Evaluating Language Agents in Cybersecurity Environments | Code | 1
Period-LLM: Extending the Periodic Capability of Multimodal Large Language Model | Code | 1

No leaderboard results yet.