SOTAVerified

Language Modeling

Papers

Showing 401-450 of 14182 papers

| Title | Status | Hype |
| --- | --- | --- |
| 8-bit Optimizers via Block-wise Quantization | Code | 3 |
| SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference | Code | 3 |
| LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs | Code | 3 |
| LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model | Code | 3 |
| Compact Language Models via Pruning and Knowledge Distillation | Code | 3 |
| GLM: General Language Model Pretraining with Autoregressive Blank Infilling | Code | 3 |
| Lifelong Learning of Large Language Model based Agents: A Roadmap | Code | 3 |
| Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement | Code | 3 |
| Llemma: An Open Language Model For Mathematics | Code | 3 |
| Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | Code | 3 |
| SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression | Code | 3 |
| Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | Code | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Code | 3 |
| Large Language Model-Brained GUI Agents: A Survey | Code | 3 |
| LaViDa: A Large Diffusion Language Model for Multimodal Understanding | Code | 3 |
| COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Code | 3 |
| 1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data | Code | 3 |
| Cleaner Pretraining Corpus Curation with Neural Web Scraping | Code | 3 |
| The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities | Code | 3 |
| A Comprehensive Survey on Long Context Language Modeling | Code | 3 |
| Large Language Model based Long-tail Query Rewriting in Taobao Search | Code | 3 |
| Language Models are Few-Shot Learners | Code | 3 |
| Language Model Inversion | Code | 3 |
| Data Filtering Networks | Code | 3 |
| On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning | Code | 3 |
| LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Code | 2 |
| LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement | Code | 2 |
| KV Shifting Attention Enhances Language Modeling | Code | 2 |
| Knowledge Representation Learning: A Quantitative Review | Code | 2 |
| ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation | Code | 2 |
| Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes | Code | 2 |
| KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application | Code | 2 |
| KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge | Code | 2 |
| Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model | Code | 2 |
| KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion | Code | 2 |
| Characterization of Large Language Model Development in the Datacenter | Code | 2 |
| VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark | Code | 2 |
| Knowledge Circuits in Pretrained Transformers | Code | 2 |
| Jailbreaking Attack against Multimodal Large Language Model | Code | 2 |
| Just read twice: closing the recall gap for recurrent language models | Code | 2 |
| ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning | Code | 2 |
| ChatterBox: Multi-round Multimodal Referring and Grounding | Code | 2 |
| Kani: A Lightweight and Highly Hackable Framework for Building Language Model Applications | Code | 2 |
| Language Model Cascades | Code | 2 |
| Cedille: A large autoregressive French language model | Code | 2 |
| Introducing Visual Perception Token into Multimodal Large Language Model | Code | 2 |
| Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model | Code | 2 |
| Inference-Time Intervention: Eliciting Truthful Answers from a Language Model | Code | 2 |
| InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | Code | 2 |
| Large Language Model Instruction Following: A Survey of Progresses and Challenges | Code | 2 |

No leaderboard results yet.