SOTAVerified

Language Modeling

Papers

Showing 401-450 of 14182 papers

Title | Status | Hype
Compact Language Models via Pruning and Knowledge Distillation | Code | 3
Conformer: Convolution-augmented Transformer for Speech Recognition | Code | 3
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Code | 3
Large Language Model-Brained GUI Agents: A Survey | Code | 3
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | Code | 3
GiT: Towards Generalist Vision Transformer through Universal Language Interface | Code | 3
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Code | 3
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | Code | 3
GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning | Code | 3
Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models | Code | 3
APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts | Code | 3
1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data | Code | 3
Cleaner Pretraining Corpus Curation with Neural Web Scraping | Code | 3
A Comprehensive Survey on Long Context Language Modeling | Code | 3
Language Models are Few-Shot Learners | Code | 3
Discovering Language Model Behaviors with Model-Written Evaluations | Code | 3
From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents | Code | 3
CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents | Code | 3
Language Model Inversion | Code | 3
Large Language Model based Long-tail Query Rewriting in Taobao Search | Code | 3
LaViDa: A Large Diffusion Language Model for Multimodal Understanding | Code | 3
M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models | Code | 3
Ola: Pushing the Frontiers of Omni-Modal Language Model | Code | 3
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks | Code | 3
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation | Code | 3
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Code | 2
A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models | Code | 2
LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement | Code | 2
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference | Code | 2
Asynchronous Large Language Model Enhanced Planner for Autonomous Driving | Code | 2
KV Shifting Attention Enhances Language Modeling | Code | 2
Knowledge Representation Learning: A Quantitative Review | Code | 2
A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model | Code | 2
A Survey of Multimodal Large Language Model from A Data-centric Perspective | Code | 2
ChatterBox: Multi-round Multimodal Referring and Grounding | Code | 2
KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application | Code | 2
Language Model Cascades | Code | 2
ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning | Code | 2
ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation | Code | 2
Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model | Code | 2
Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes | Code | 2
A Survey of Graph Meets Large Language Model: Progress and Future Directions | Code | 2
VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark | Code | 2
KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge | Code | 2
A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval | Code | 2
Kani: A Lightweight and Highly Hackable Framework for Building Language Model Applications | Code | 2
Jailbreaking Attack against Multimodal Large Language Model | Code | 2
Just read twice: closing the recall gap for recurrent language models | Code | 2
KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion | Code | 2
Characterization of Large Language Model Development in the Datacenter | Code | 2

No leaderboard results yet.