SOTAVerified

Language Modeling

Papers

Showing 150 of 14182 papers

| Title | Status | Hype |
| --- | --- | --- |
| DeepSeek-V3 Technical Report | Code | 16 |
| SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion | Code | 15 |
| Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs | Code | 14 |
| The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | Code | 11 |
| IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System | Code | 11 |
| Scaling Synthetic Data Creation with 1,000,000,000 Personas | Code | 11 |
| olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models | Code | 11 |
| SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | Code | 11 |
| TinyLlama: An Open-Source Small Language Model | Code | 11 |
| DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | Code | 11 |
| JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | Code | 11 |
| Pixtral 12B | Code | 11 |
| Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality | Code | 11 |
| PowerInfer-2: Fast Large Language Model Inference on a Smartphone | Code | 9 |
| LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Code | 9 |
| Natural language guidance of high-fidelity text-to-speech with synthetic annotations | Code | 9 |
| Kodezi Chronos: A Debugging-First Language Model for Repository-Scale, Memory-Driven Code Understanding | Code | 9 |
| OLMo: Accelerating the Science of Language Models | Code | 9 |
| CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion | Code | 9 |
| Yi: Open Foundation Models by 01.AI | Code | 9 |
| VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild | Code | 9 |
| s1: Simple test-time scaling | Code | 9 |
| Moshi: a speech-text foundation model for real-time dialogue | Code | 9 |
| OpenELM: An Efficient Language Model Family with Open Training and Inference Framework | Code | 9 |
| VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Code | 9 |
| YOLO-World: Real-Time Open-Vocabulary Object Detection | Code | 9 |
| DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | Code | 9 |
| RWKV-7 "Goose" with Expressive Dynamic State Evolution | Code | 9 |
| LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model | Code | 9 |
| DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | Code | 9 |
| Visually Descriptive Language Model for Vector Graphics Reasoning | Code | 9 |
| Language agents achieve superhuman synthesis of scientific knowledge | Code | 9 |
| Arcee's MergeKit: A Toolkit for Merging Large Language Models | Code | 9 |
| Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models | Code | 9 |
| Perception Encoder: The best visual embeddings are not at the output of the network | Code | 8 |
| Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition | Code | 8 |
| Large Language Model Agent: A Survey on Methodology, Applications and Challenges | Code | 7 |
| AutoTrain: No-code training for state-of-the-art models | Code | 7 |
| AudioLM: a Language Modeling Approach to Audio Generation | Code | 7 |
| Chronos: Learning the Language of Time Series | Code | 7 |
| MagicQuill: An Intelligent Interactive Image Editing System | Code | 7 |
| FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving | Code | 7 |
| Large Concept Models: Language Modeling in a Sentence Representation Space | Code | 7 |
| PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation | Code | 7 |
| Dynamic data sampler for cross-language transfer learning in large language models | Code | 7 |
| DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines | Code | 7 |
| EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Code | 7 |
| Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | Code | 7 |
| Chinese-Vicuna: A Chinese Instruction-following Llama-based Model | Code | 7 |
| Labeling supervised fine-tuning data with the scaling law | Code | 7 |

No leaderboard results yet.