SOTAVerified

Language Modeling

Papers

Showing 351-400 of 14182 papers

Title | Status | Hype
MultiModal-GPT: A Vision and Language Model for Dialogue with Humans | Code | 3
Multi-agent Architecture Search via Agentic Supernet | Code | 3
Multimodal Table Understanding | Code | 3
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models | Code | 3
Multi-objective Asynchronous Successive Halving | Code | 3
Ola: Pushing the Frontiers of Omni-Modal Language Model | Code | 3
PGL at TextGraphs 2020 Shared Task: Explanation Regeneration using Language and Graph Learning Methods | Code | 3
Predicting from Strings: Language Model Embeddings for Bayesian Optimization | Code | 3
Datasheet for the Pile | Code | 3
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | Code | 3
Prompt-to-Leaderboard | Code | 3
Pushing the Limits of Large Language Model Quantization via the Linearity Theorem | Code | 3
Data Filtering Networks | Code | 3
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models | Code | 3
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Code | 3
Cramming: Training a Language Model on a Single GPU in One Day | Code | 3
CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents | Code | 3
Audio-Reasoner: Improving Reasoning Capability in Large Audio Language Models | Code | 3
M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models | Code | 3
Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration | Code | 3
ContextCite: Attributing Model Generation to Context | Code | 3
Conformer: Convolution-augmented Transformer for Speech Recognition | Code | 3
AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection | Code | 3
LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model | Code | 3
Llemma: An Open Language Model For Mathematics | Code | 3
Compact Language Models via Pruning and Knowledge Distillation | Code | 3
Evaluating Large Language Models Trained on Code | Code | 3
Evalverse: Unified and Accessible Library for Large Language Model Evaluation | Code | 3
Revisiting Pre-Trained Models for Chinese Natural Language Processing | Code | 3
LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs | Code | 3
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders | Code | 3
Longformer: The Long-Document Transformer | Code | 3
Agent Workflow Memory | Code | 3
Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement | Code | 3
A Systematic Evaluation of Large Language Models of Code | Code | 3
Lifelong Learning of Large Language Model based Agents: A Roadmap | Code | 3
LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Code | 3
LaViDa: A Large Diffusion Language Model for Multimodal Understanding | Code | 3
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuray | Code | 3
MotionGPT: Human Motion as a Foreign Language | Code | 3
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Code | 3
A Survey on the Optimization of Large Language Model-based Agents | Code | 3
AsymLoRA: Harmonizing Data Conflicts and Commonalities in MLLMs | Code | 3
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | Code | 3
Large Language Model based Long-tail Query Rewriting in Taobao Search | Code | 3
A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning | Code | 3
A Survey on the Memory Mechanism of Large Language Model based Agents | Code | 3
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | Code | 3
Language Models are Few-Shot Learners | Code | 3
Cleaner Pretraining Corpus Curation with Neural Web Scraping | Code | 3
Page 8 of 284

No leaderboard results yet.