Language Modelling

A language model is a model of natural language. Language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation (generating more human-like text), optical character recognition, route optimization, handwriting recognition, grammar induction, and information retrieval.

Large language models (LLMs), currently their most advanced form, are predominantly based on transformers trained on larger datasets (frequently using words scraped from the public internet). They have superseded recurrent neural network-based models, which had previously superseded purely statistical models such as the word n-gram language model.

Source: Wikipedia
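
As a rough illustration of the word n-gram language model mentioned above, the following is a minimal sketch of a bigram model (n = 2) with add-one smoothing. The function names (`train_bigram`, `bigram_prob`, `perplexity`) and the toy corpus are assumptions made for this example and do not come from any paper listed on this page.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences.

    Each sentence is padded with <s> and </s> boundary markers.
    """
    contexts, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        contexts.update(padded[:-1])                  # every token that has a successor
        bigrams.update(zip(padded[:-1], padded[1:]))  # (previous word, word) pairs
    # Words the model can predict: every training token plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return contexts, bigrams, vocab


def bigram_prob(prev, word, contexts, bigrams, vocab):
    """P(word | prev) with add-one (Laplace) smoothing.

    Assumes `word` is in `vocab`; out-of-vocabulary words are not handled here.
    """
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


def perplexity(sentences, contexts, bigrams, vocab):
    """Exponential of the average negative log-probability per predicted token."""
    log_prob, n = 0.0, 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded[:-1], padded[1:]):
            log_prob += math.log(bigram_prob(prev, word, contexts, bigrams, vocab))
            n += 1
    return math.exp(-log_prob / n)


# Hypothetical toy corpus, for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
contexts, bigrams, vocab = train_bigram(corpus)
print(bigram_prob("the", "cat", contexts, bigrams, vocab))
print(perplexity(corpus, contexts, bigrams, vocab))
```

Add-one smoothing is used here only because it is the simplest choice. Several papers in the list below use Kneser-Ney variants instead.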

Papers


Title	Date	Tasks	Status
CoNLL 2014 Shared Task: Grammatical Error Correction with a Syntactic N-gram Language Model from a Big Corpora	Jun 1, 2014	Grammatical Error Correction, Language Modeling	Unverified
Bilingually-constrained Phrase Embeddings for Machine Translation	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Correcting Preposition Errors in Learner English Using Error Case Frames and Feedback Messages	Jun 1, 2014	Grammatical Error Correction, Language Modelling	Unverified
Edinburgh's Phrase-based Machine Translation Systems for WMT-14	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Grammatical error correction using hybrid systems and type filtering	Jun 1, 2014	Grammatical Error Correction, Language Modelling	Unverified
Cross-lingual Model Transfer Using Feature Representation Projection	Jun 1, 2014	Language Modelling, model	Unverified
Improving Lexical Embeddings with Semantic Knowledge	Jun 1, 2014	Language Modelling, Learning Word Embeddings	Code Available
Automatic Transliteration of Romanized Dialectal Arabic	Jun 1, 2014	Language Modelling, Spelling Correction	Unverified
Building and Evaluating Somali Language Corpora	Jun 1, 2014	Language Modelling	Unverified
Alex: Bootstrapping a Spoken Dialogue System for a New Domain by Real Users	Jun 1, 2014	Dialogue Management, Language Modelling	Unverified
A Hybrid Approach to Skeleton-based Translation	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Free on-line speech recogniser based on Kaldi ASR toolkit producing word posterior lattices	Jun 1, 2014	Acoustic Modelling, Language Modelling	Unverified
Decoder Integration and Expected BLEU Training for Recurrent Neural Network Language Models	Jun 1, 2014	Decoder, Language Modelling	Unverified
A Provably Correct Learning Algorithm for Latent-Variable PCFGs	Jun 1, 2014	Language Modelling, Topic Models	Unverified
FBK-UPV-UEdin participation in the WMT14 Quality Estimation shared-task	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Fast and Robust Neural Network Joint Models for Statistical Machine Translation	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Biases in Predicting the Human Language Model	Jun 1, 2014	Language Modeling, Language Modelling	Unverified
Dependency-Based Word Embeddings	Jun 1, 2014	Language Modelling, Word Embeddings	Unverified
A Generalized Language Model as the Combination of Skipped n-grams and Modified Kneser Ney Smoothing	Jun 1, 2014	Language Modeling, Language Modelling	Code Available
A Unified Framework for Grammar Error Correction	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Faster Phrase-Based Decoding by Refining Feature State	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Effective Selection of Translation Model Training Data	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Domain Adaptation for Medical Text Translation using Web Resources	Jun 1, 2014	Domain Adaptation, Language Modelling	Unverified
A Recursive Recurrent Neural Network for Statistical Machine Translation	Jun 1, 2014	Chunking, Language Modelling	Unverified
Combining Domain Adaptation Approaches for Medical Text Translation	Jun 1, 2014	Domain Adaptation, Language Modelling	Unverified
DCU-Lingo24 Participation in WMT 2014 Hindi-English Translation task	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
New Directions in Vector Space Models of Meaning	Jun 1, 2014	Document Classification, Language Modelling	Unverified
The DCU-ICTCAS MT system at WMT 2014 on German-English Translation Task	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
POSTECH Grammatical Error Correction System in the CoNLL-2014 Shared Task	Jun 1, 2014	Grammatical Error Correction, Language Modelling	Unverified
The Karlsruhe Institute of Technology Translation Systems for the WMT 2014	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
The AMU System in the CoNLL-2014 Shared Task: Grammatical Error Correction by Data-Intensive and Feature-Rich Statistical Machine Translation	Jun 1, 2014	Grammatical Error Correction, Language Modelling	Unverified
Postech's System Description for Medical Text Translation Task	Jun 1, 2014	Information Retrieval, Language Modelling	Unverified
Translation Assistance by Translation of L1 Fragments in an L2 Context	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Predicting Grammaticality on an Ordinal Scale	Jun 1, 2014	Articles, Automated Essay Scoring	Unverified
The KIT-LIMSI Translation System for WMT 2014	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Kneser-Ney Smoothing on Expected Counts	Jun 1, 2014	Domain Adaptation, Language Modelling	Code Available
Target-Centric Features for Translation Quality Estimation	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Syllable and language model based features for detecting non-scorable tests in spoken language proficiency assessment applications	Jun 1, 2014	Language Modeling, Language Modelling	Unverified
Large-scale Exact Decoding: The IMS-TTT submission to WMT14	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Probabilistic Modeling of Joint-context in Distributional Similarity	Jun 1, 2014	Language Modelling, Semantic Textual Similarity	Unverified
Lattice Desegmentation for Statistical Machine Translation	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
LIMSI @ WMT'14 Medical Translation Task	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Normalizing tweets with edit scripts and recurrent neural embeddings	Jun 1, 2014	Boundary Detection, Language Modelling	Unverified
Towards End-To-End Speech Recognition with Recurrent Neural Networks	Jun 1, 2014	Language Modeling, Language Modelling	Unverified
Stanford University's Submissions to the WMT 2014 Translation Task	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Parallel FDA5 for Fast Deployment of Accurate Statistical Machine Translation Systems	Jun 1, 2014	Active Learning, Language Modelling	Unverified
Tuning a Grammar Correction System for Increased Precision	Jun 1, 2014	Grammatical Error Correction, Language Modelling	Unverified
Towards Temporal Scoping of Relational Facts based on Wikipedia Data	Jun 1, 2014	Entity Linking, Knowledge Base Population	Unverified
Phrasal: A Toolkit for New Directions in Statistical Machine Translation	Jun 1, 2014	Language Modelling, Machine Translation	Unverified
Linguistic Structured Sparsity in Text Categorization	Jun 1, 2014	Feature Engineering, Language Modelling	Unverified


Datasets: WikiText-103, Penn Treebank (Word Level), enwik8, The Pile, WikiText-2, LAMBADA, One Billion Word, Text8, Penn Treebank (Character Level), Hutter Prize, OpenWebText, SALMon

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Decay RNN	Validation perplexity	76.67	—	Unverified
2	GRU	Validation perplexity	53.78	—	Unverified
3	LSTM	Validation perplexity	52.73	—	Unverified
4	LSTM	Test perplexity	48.7	—	Unverified
5	Temporal CNN	Test perplexity	45.2	—	Unverified
6	TCN	Test perplexity	45.19	—	Unverified
7	GCNN-8	Test perplexity	44.9	—	Unverified
8	Neural cache model (size = 100)	Test perplexity	44.8	—	Unverified
9	Neural cache model (size = 2,000)	Test perplexity	40.8	—	Unverified
10	GPT-2 Small	Test perplexity	37.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TCN	Test perplexity	108.47	—	Unverified
2	Seq-U-Net	Test perplexity	107.95	—	Unverified
3	GRU (Bai et al., 2018)	Test perplexity	92.48	—	Unverified
4	R-Transformer	Test perplexity	84.38	—	Unverified
5	Zaremba et al. (2014) - LSTM (medium)	Test perplexity	82.7	—	Unverified
6	Gal & Ghahramani (2016) - Variational LSTM (medium)	Test perplexity	79.7	—	Unverified
7	LSTM (Bai et al., 2018)	Test perplexity	78.93	—	Unverified
8	Zaremba et al. (2014) - LSTM (large)	Test perplexity	78.4	—	Unverified
9	Gal & Ghahramani (2016) - Variational LSTM (large)	Test perplexity	75.2	—	Unverified
10	Inan et al. (2016) - Variational RHN	Test perplexity	66	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSTM (7 layers)	Bit per Character (BPC)	1.67	—	Unverified
2	Hypernetworks	Bit per Character (BPC)	1.34	—	Unverified
3	SHA-LSTM (4 layers, h=1024, no attention head)	Bit per Character (BPC)	1.33	—	Unverified
4	LN HM-LSTM	Bit per Character (BPC)	1.32	—	Unverified
5	ByteNet	Bit per Character (BPC)	1.31	—	Unverified
6	Recurrent Highway Networks	Bit per Character (BPC)	1.27	—	Unverified
7	Large FS-LSTM-4	Bit per Character (BPC)	1.25	—	Unverified
8	Large mLSTM	Bit per Character (BPC)	1.24	—	Unverified
9	AWD-LSTM (3 layers)	Bit per Character (BPC)	1.23	—	Unverified
10	Cluster-Former (#C=512)	Bit per Character (BPC)	1.22	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Smaller Transformer 126M (pre-trained)	Test perplexity	33	—	Unverified
2	OPT 125M	Test perplexity	32.26	—	Unverified
3	Larger Transformer 771M (pre-trained)	Test perplexity	28.1	—	Unverified
4	OPT 1.3B	Test perplexity	19.55	—	Unverified
5	GPT-Neo 125M	Test perplexity	17.83	—	Unverified
6	OPT 2.7B	Test perplexity	17.81	—	Unverified
7	Smaller Transformer 126M (fine-tuned)	Test perplexity	12	—	Unverified
8	GPT-Neo 1.3B	Test perplexity	11.46	—	Unverified
9	Transformer 125M	Test perplexity	10.7	—	Unverified
10	GPT-Neo 2.7B	Test perplexity	10.44	—	Unverified
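
The tables above report perplexity and bits per character (BPC). As a minimal sketch of how both are commonly derived from a model's summed negative log-likelihood, assuming that quantity is measured in nats, the helper names below are hypothetical and the snippet is not tied to any entry in the tables.

```python
import math


def perplexity_from_nll(total_nll_nats, num_tokens):
    """Perplexity: exp of the average negative log-likelihood per token (in nats)."""
    return math.exp(total_nll_nats / num_tokens)


def bits_per_character(total_nll_nats, num_chars):
    """BPC: average negative log-likelihood per character, converted from nats to bits."""
    return total_nll_nats / (num_chars * math.log(2))


# Sanity check with a uniform model over 4 symbols scoring 10 symbols:
# each symbol costs ln(4) nats, so perplexity is about 4 and BPC is about 2.
nll = 10 * math.log(4)
print(perplexity_from_nll(nll, 10))
print(bits_per_character(nll, 10))
```

Lower is better for both metrics. Perplexities computed over different tokenizations (for example word level versus subword level) are not directly comparable.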