SOTAVerified

Language Modelling

A language model is a model of natural language. Language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation (generating more human-like text), optical character recognition, route optimization, handwriting recognition, grammar induction, and information retrieval.

Large language models (LLMs), currently their most advanced form, are predominantly based on transformers trained on larger datasets (frequently using words scraped from the public internet). They have superseded recurrent neural network-based models, which had previously superseded the purely statistical models, such as the word n-gram language model.

Source: Wikipedia
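
The passage above traces a progression from purely statistical n-gram models through RNNs to transformers. As a concrete illustration of the oldest end of that progression, a word-bigram model fits in a few lines; this is a minimal sketch under illustrative assumptions (the toy corpus, function names, and the add-one smoothing choice are not from the source):

```python
from collections import Counter

# Toy corpus; purely illustrative.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count each bigram and how often each word occurs as a left context.
bigram_counts = Counter(zip(corpus, corpus[1:]))
context_counts = Counter(corpus[:-1])
vocab_size = len(set(corpus))

def bigram_prob(prev: str, word: str) -> float:
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)

print(bigram_prob("the", "cat"))  # seen bigram: relatively high probability
print(bigram_prob("the", "sat"))  # unseen bigram: only the smoothing mass
```

Chaining such conditional probabilities over a sentence gives the model's probability for the whole sequence, which is exactly the quantity that the perplexity numbers in the benchmark tables below evaluate.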

Papers

Showing 8551–8600 of 17610 papers

Title | Status | Hype
J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling | – | 0
J-EDI QA: Benchmark for deep-sea organism-specific multimodal LLM | – | 0
Jeff Da at COIN - Shared Task: BIG MOOD: Relating Transformers to Explicit Commonsense Knowledge | – | 0
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition | – | 0
Jellyfish: A Large Language Model for Data Preprocessing | – | 0
JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis | – | 0
JEPA4Rec: Learning Effective Language Representations for Sequential Recommendation via Joint Embedding Predictive Architecture | – | 0
Jet Expansions of Residual Computation | – | 0
JHU System Description for the MADAR Arabic Dialect Identification Shared Task | – | 0
JIANG: Chinese Open Foundation Language Model | – | 0
JingFang: A Traditional Chinese Medicine Large Language Model of Expert-Level Medical Diagnosis and Syndrome Differentiation-Based Treatment | – | 0
Jira: a Kurdish Speech Recognition System Designing and Building Speech Corpus and Pronunciation Lexicon | – | 0
JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving | – | 0
基于预训练语言模型的繁体古文自动句读研究 (Automatic Traditional Ancient Chinese Texts Segmentation and Punctuation Based on Pre-training Language Model) | – | 0
Joint Action Language Modelling for Transparent Policy Execution | – | 0
Joint Audio-Text Model for Expressive Speech-Driven 3D Facial Animation | – | 0
Joint Contextual Modeling for ASR Correction and Language Understanding | – | 0
Joint CTC/attention decoding for end-to-end speech recognition | – | 0
Joint Decoding of Tree Transduction Models for Sentence Compression | – | 0
Joint Encoder-Decoder Self-Supervised Pre-training for ASR | – | 0
Joint Estimation and Prediction of City-wide Delivery Demand: A Large Language Model Empowered Graph-based Learning Approach | – | 0
Joint Extraction of Entity and Relation with Information Redundancy Elimination | – | 0
Joint Language and Translation Modeling with Recurrent Neural Networks | – | 0
Joint Learning of Phonetic Units and Word Pronunciations for ASR | – | 0
Jointly Learning Author and Annotated Character N-gram Embeddings: A Case Study in Literary Text | – | 0
Jointly Learning to Embed and Predict with Multiple Languages | – | 0
Jointly Learning Word Representations and Composition Functions Using Predicate-Argument Structures | – | 0
Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation | – | 0
Jointly Reinforced User Simulator and Task-oriented Dialog System with Simplified Generative Architecture | – | 0
Jointly Trained Transformers models for Spoken Language Translation | – | 0
Joint Online Spoken Language Understanding and Language Modeling with Recurrent Neural Networks | – | 0
Joint Part-of-Speech and Language ID Tagging for Code-Switched Data | – | 0
Joint Recognition of Handwritten Text and Named Entities with a Neural End-to-end Model | – | 0
Joint Semantic and Structural Representation Learning for Enhancing User Preference Modelling | – | 0
Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility | – | 0
Joint Space Neural Probabilistic Language Model for Statistical Machine Translation | – | 0
Joint unsupervised and supervised learning for context-aware language identification | – | 0
Joint Unsupervised and Supervised Training for Multilingual ASR | – | 0
Joint Verification and Refinement of Language Models for Safety-Constrained Planning | – | 0
Joint WMT 2013 Submission of the QUAERO Project | – | 0
JPPO: Joint Power and Prompt Optimization for Accelerated Large Language Model Services | – | 0
Judgment of Thoughts: Courtroom of the Binary Logical Reasoning in Large Language Models | – | 0
JU NITM at IJCNLP-2017 Task 5: A Classification Approach for Answer Selection in Multi-choice Question Answering System | – | 0
Jurassic is (almost) All You Need: Few-Shot Meaning-to-Text Generation for Open-Domain Dialogue | – | 0
Juru: Legal Brazilian Large Language Model from Reputable Sources | – | 0
Just Add Functions: A Neural-Symbolic Language Model | – | 0
JUST at SemEval-2020 Task 11: Detecting Propaganda Techniques Using BERT Pre-trained Model | – | 0
Justifying Recommendations using Distantly-Labeled Reviews and Fine-Grained Aspects | – | 0
JU-USAAR: A Domain Adaptive MT System | – | 0
KACE: Generating Knowledge Aware Contrastive Explanations for Natural Language Inference | – | 0
Page 172 of 353

Benchmark Results
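
The tables below report two metrics: perplexity and bits per character (BPC). Both are monotone transforms of a model's average negative log-likelihood on held-out text, so lower is better; a BPC of b corresponds to a per-character perplexity of 2^b. A minimal sketch of the arithmetic, with invented token probabilities for illustration:

```python
import math

def perplexity(token_log_probs):
    """exp of the average negative natural-log likelihood per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def bits_per_character(char_log_probs):
    """Average negative log2-likelihood per character."""
    return -sum(char_log_probs) / (len(char_log_probs) * math.log(2))

# Hypothetical probabilities a model assigned to four held-out tokens.
lps = [math.log(p) for p in (0.2, 0.1, 0.25, 0.05)]
print(perplexity(lps))          # ≈ 7.95: like a uniform guess over ~8 tokens
print(bits_per_character(lps))  # ≈ 2.99 bits, if those tokens were characters
```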

# | Model | Metric | Claimed | Verified | Status
1 | Decay RNN | Validation perplexity | 76.67 | – | Unverified
2 | GRU | Validation perplexity | 53.78 | – | Unverified
3 | LSTM | Validation perplexity | 52.73 | – | Unverified
4 | LSTM | Test perplexity | 48.7 | – | Unverified
5 | Temporal CNN | Test perplexity | 45.2 | – | Unverified
6 | TCN | Test perplexity | 45.19 | – | Unverified
7 | GCNN-8 | Test perplexity | 44.9 | – | Unverified
8 | Neural cache model (size = 100) | Test perplexity | 44.8 | – | Unverified
9 | Neural cache model (size = 2,000) | Test perplexity | 40.8 | – | Unverified
10 | GPT-2 Small | Test perplexity | 37.5 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TCN | Test perplexity | 108.47 | – | Unverified
2 | Seq-U-Net | Test perplexity | 107.95 | – | Unverified
3 | GRU (Bai et al., 2018) | Test perplexity | 92.48 | – | Unverified
4 | R-Transformer | Test perplexity | 84.38 | – | Unverified
5 | Zaremba et al. (2014) - LSTM (medium) | Test perplexity | 82.7 | – | Unverified
6 | Gal & Ghahramani (2016) - Variational LSTM (medium) | Test perplexity | 79.7 | – | Unverified
7 | LSTM (Bai et al., 2018) | Test perplexity | 78.93 | – | Unverified
8 | Zaremba et al. (2014) - LSTM (large) | Test perplexity | 78.4 | – | Unverified
9 | Gal & Ghahramani (2016) - Variational LSTM (large) | Test perplexity | 75.2 | – | Unverified
10 | Inan et al. (2016) - Variational RHN | Test perplexity | 66 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSTM (7 layers) | Bits per character (BPC) | 1.67 | – | Unverified
2 | Hypernetworks | Bits per character (BPC) | 1.34 | – | Unverified
3 | SHA-LSTM (4 layers, h=1024, no attention head) | Bits per character (BPC) | 1.33 | – | Unverified
4 | LN HM-LSTM | Bits per character (BPC) | 1.32 | – | Unverified
5 | ByteNet | Bits per character (BPC) | 1.31 | – | Unverified
6 | Recurrent Highway Networks | Bits per character (BPC) | 1.27 | – | Unverified
7 | Large FS-LSTM-4 | Bits per character (BPC) | 1.25 | – | Unverified
8 | Large mLSTM | Bits per character (BPC) | 1.24 | – | Unverified
9 | AWD-LSTM (3 layers) | Bits per character (BPC) | 1.23 | – | Unverified
10 | Cluster-Former (#C=512) | Bits per character (BPC) | 1.22 | – | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Smaller Transformer 126M (pre-trained) | Test perplexity | 33 | – | Unverified
2 | OPT 125M | Test perplexity | 32.26 | – | Unverified
3 | Larger Transformer 771M (pre-trained) | Test perplexity | 28.1 | – | Unverified
4 | OPT 1.3B | Test perplexity | 19.55 | – | Unverified
5 | GPT-Neo 125M | Test perplexity | 17.83 | – | Unverified
6 | OPT 2.7B | Test perplexity | 17.81 | – | Unverified
7 | Smaller Transformer 126M (fine-tuned) | Test perplexity | 12 | – | Unverified
8 | GPT-Neo 1.3B | Test perplexity | 11.46 | – | Unverified
9 | Transformer 125M | Test perplexity | 10.7 | – | Unverified
10 | GPT-Neo 2.7B | Test perplexity | 10.44 | – | Unverified