Language Modelling

A language model is a model of natural language. Language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation (generating more human-like text), optical character recognition, route optimization, handwriting recognition, grammar induction, and information retrieval.

Large language models (LLMs), currently their most advanced form, are predominantly based on transformers trained on larger datasets (frequently using words scraped from the public internet). They have superseded recurrent neural network-based models, which had previously superseded purely statistical models such as the word n-gram language model.
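As a rough illustration of the purely statistical approach mentioned above, the sketch below estimates a word bigram model (an n-gram model with n = 2) from a toy corpus. The corpus, the function names, the `<s>`/`</s>` boundary markers, and the add-one smoothing are all assumptions chosen for brevity, not something taken from the source.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        contexts.update(padded[:-1])                  # every token that has a successor
        bigrams.update(zip(padded[:-1], padded[1:]))  # (previous word, next word) pairs
    # Words that can be predicted: all corpus words plus the end-of-sentence marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return contexts, bigrams, vocab

def bigram_prob(prev, word, contexts, bigrams, vocab):
    """P(word | prev) with add-one (Laplace) smoothing (an illustrative choice)."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))

# Hypothetical toy corpus.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
contexts, bigrams, vocab = train_bigram(corpus)
p_cat = bigram_prob("the", "cat", contexts, bigrams, vocab)
```

With smoothing, the conditional probabilities for a given context still sum to one over the vocabulary, so unseen word pairs receive a small non-zero probability instead of zero.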

Source: Wikipedia

Papers

Showing 17051–17100 of 17610 papers

Title	Date	Tasks	Status
Embedding Word Similarity with Neural Machine Translation	Dec 19, 2014	Language Modeling, Language Modelling	Unverified
A Simple and Efficient Method To Generate Word Sense Representations	Dec 18, 2014	Language Modeling, Language Modelling	Unverified
Deep Structured Output Learning for Unconstrained Text Recognition	Dec 18, 2014	Language Modeling, Language Modelling	Unverified
Skip-gram Language Modeling Using Sparse Non-negative Matrix Probability Estimation	Dec 3, 2014	Language Modeling, Language Modelling	Unverified
使用概念資訊於中文大詞彙連續語音辨識之研究 (Exploring Concept Information for Mandarin Large Vocabulary Continuous Speech Recognition) [In Chinese]	Dec 1, 2014	Language Modelling, speech-recognition	Unverified
A Hierarchical Word Sequence Language Model	Dec 1, 2014	Language Modeling, Language Modelling	Unverified
Incrementally Updating the SMT Reordering Model	Dec 1, 2014	Language Modelling, Machine Translation	Unverified
Modeling Structural Topic Transitions for Automatic Lyrics Generation	Dec 1, 2014	Language Modelling	Unverified
Transition-based Knowledge Graph Embedding with Relational Mapping Properties	Dec 1, 2014	Graph Embedding, Information Retrieval	Unverified
Zero-Shot Learning of Language Models for Describing Human Actions Based on Semantic Compositionality of Actions	Dec 1, 2014	Language Modelling, Machine Translation	Unverified
Extracting and Selecting Relevant Corpora for Domain Adaptation in MT	Dec 1, 2014	Active Learning, Domain Adaptation	Unverified
LMSim: Computing Domain-specific Semantic Word Similarities Using a Language Modeling Approach	Dec 1, 2014	Information Retrieval, Language Modeling	Unverified
Applications of Lexicographic Semirings to Problems in Speech and Language Processing	Dec 1, 2014	Language Modelling, Part-Of-Speech Tagging	Unverified
From Captions to Visual Concepts and Back	Nov 18, 2014	Image Captioning, Language Modeling	Code Available
Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models	Nov 10, 2014	Decoder, Language Modeling	Code Available
The Effect of Dependency Representation Scheme on Syntactic Language Modelling	Nov 1, 2014	Constituency Parsing, Dependency Parsing	Unverified
Leveraging known Semantics for Spelling Correction	Nov 1, 2014	Language Modelling, Spelling Correction	Unverified
A random forest system combination approach for error detection in digital dictionaries	Oct 30, 2014	Language Modeling, Language Modelling	Unverified
Detecting Structural Irregularity in Electronic Dictionaries Using Language Modeling	Oct 29, 2014	Language Modeling, Language Modelling	Unverified
Large Vocabulary Arabic Online Handwriting Recognition System	Oct 17, 2014	Handwriting Recognition, Language Modeling	Unverified
Expanding the Language model in a low-resource hybrid MT system	Oct 1, 2014	Language Modeling, Language Modelling	Unverified
Chinese Spelling Error Detection and Correction Based on Language Model, Pronunciation, and Shape	Oct 1, 2014	Language Modeling, Language Modelling	Unverified
運用概念模型化技術於中文大詞彙連續語音辨識之語言模型調適 (Leveraging Concept Modeling Techniques for Language Model Adaptation in Mandarin Large Vocabulary Continuous Speech Recognition) [In Chinese]	Oct 1, 2014	Language Modelling, speech-recognition	Unverified
探究新穎語句模型化技術於節錄式語音摘要 (Investigating Novel Sentence Modeling Techniques for Extractive Speech Summarization) [In Chinese]	Oct 1, 2014	Language Modelling, Sentence	Unverified
Japanese to English Machine Translation using Preordering and Compositional Distributed Semantics	Oct 1, 2014	Language Modelling, Machine Translation	Unverified
A machine translation system combining rule-based machine translation and statistical post-editing	Oct 1, 2014	Language Modelling, Machine Translation	Unverified
Forest-to-String SMT for Asian Language Translation: NAIST at WAT 2014	Oct 1, 2014	Language Modelling, Machine Translation	Unverified
Weblio Pre-reordering Statistical Machine Translation System	Oct 1, 2014	Language Modelling, Machine Translation	Unverified
Language variety identification in Spanish tweets	Oct 1, 2014	Language Identification, Language Modelling	Unverified
Exploration of the Impact of Maximum Entropy in Recurrent Neural Network Language Models for Code-Switching Speech	Oct 1, 2014	Language Modelling, Speech Recognition	Unverified
A Pipeline Approach to Supervised Error Correction for the QALB-2014 Shared Task	Oct 1, 2014	Grammatical Error Correction, Language Modelling	Unverified
GWU-HASP: Hybrid Arabic Spelling and Punctuation Corrector	Oct 1, 2014	Language Modelling, Spelling Correction	Unverified
Automatic Correction of Arabic Text: a Cascaded Approach	Oct 1, 2014	Language Modelling, Transliteration	Unverified
CMUQ@QALB-2014: An SMT-based System for Automatic Arabic Error Correction	Oct 1, 2014	Language Modelling, Machine Translation	Unverified
Dependency-Based Bilingual Language Models for Reordering in Statistical Machine Translation	Oct 1, 2014	Language Modelling, Machine Translation	Unverified
Improved Decipherment of Homophonic Ciphers	Oct 1, 2014	Decipherment, Language Modelling	Unverified
Comparing Representations of Semantic Roles for String-To-Tree Decoding	Oct 1, 2014	Language Modelling, Machine Translation	Unverified
Word Translation Prediction for Morphologically Rich Languages with Bilingual Neural Networks	Oct 1, 2014	Feature Engineering, Language Modelling	Unverified
Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation	Oct 1, 2014	Decoder, Language Modelling	Unverified
Joint Decoding of Tree Transduction Models for Sentence Compression	Oct 1, 2014	Language Modelling, Sentence	Unverified
Neural Network Based Bilingual Language Model Growing for Statistical Machine Translation	Oct 1, 2014	Language Modeling, Language Modelling	Unverified
Leveraging Effective Query Modeling Techniques for Speech Recognition and Summarization	Oct 1, 2014	Document Ranking, Information Retrieval	Unverified
Language Modeling with Functional Head Constraint for Code Switching Speech Recognition	Oct 1, 2014	Language Identification, Language Modeling	Unverified
Morphological Segmentation for Keyword Spotting	Oct 1, 2014	Keyword Spotting, Language Modelling	Unverified
The Inside-Outside Recursive Neural Network model for Dependency Parsing	Oct 1, 2014	Dependency Parsing, Language Modelling	Unverified
Jointly Learning Word Representations and Composition Functions Using Predicate-Argument Structures	Oct 1, 2014	Language Modelling, Semantic Composition	Unverified
PCFG Induction for Unsupervised Parsing and Language Modelling	Oct 1, 2014	Language Modelling	Unverified
Submodularity for Data Selection in Machine Translation	Oct 1, 2014	Language Modelling, Machine Translation	Unverified
Composition of Word Representations Improves Semantic Role Labelling	Oct 1, 2014	Language Modelling	Unverified
Exact Decoding for Phrase-Based Statistical Machine Translation	Oct 1, 2014	Language Modelling, Machine Translation	Unverified

Datasets: WikiText-103, Penn Treebank (Word Level), enwik8, The Pile, WikiText-2, LAMBADA, One Billion Word, Text8, Penn Treebank (Character Level), Hutter Prize, OpenWebText, SALMon

Benchmark Results

WikiText-103

#	Model	Metric	Claimed	Verified	Status
1	Decay RNN	Validation perplexity	76.67	—	Unverified
2	GRU	Validation perplexity	53.78	—	Unverified
3	LSTM	Validation perplexity	52.73	—	Unverified
4	LSTM	Test perplexity	48.7	—	Unverified
5	Temporal CNN	Test perplexity	45.2	—	Unverified
6	TCN	Test perplexity	45.19	—	Unverified
7	GCNN-8	Test perplexity	44.9	—	Unverified
8	Neural cache model (size = 100)	Test perplexity	44.8	—	Unverified
9	Neural cache model (size = 2,000)	Test perplexity	40.8	—	Unverified
10	GPT-2 Small	Test perplexity	37.5	—	Unverified
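The perplexity figures in these tables follow the usual definition: the exponential of the average negative log-probability a model assigns to each token of the evaluation text, so lower is better. A minimal sketch, assuming the per-token probabilities are already available as a list (the function name and the example values are hypothetical):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative natural-log probability per token)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# A model that spreads its probability uniformly over 4 choices at every
# step has a perplexity of 4 (up to floating-point rounding).
ppl_uniform = perplexity([0.25] * 8)

# A model that is more confident about the observed tokens scores lower.
ppl_confident = perplexity([0.9, 0.8, 0.95, 0.7])
```

Perplexities are only directly comparable when the models share a tokenization and an evaluation split, which is one reason validation and test perplexity are listed as separate metrics.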

Penn Treebank (Word Level)

#	Model	Metric	Claimed	Verified	Status
1	TCN	Test perplexity	108.47	—	Unverified
2	Seq-U-Net	Test perplexity	107.95	—	Unverified
3	GRU (Bai et al., 2018)	Test perplexity	92.48	—	Unverified
4	R-Transformer	Test perplexity	84.38	—	Unverified
5	Zaremba et al. (2014) - LSTM (medium)	Test perplexity	82.7	—	Unverified
6	Gal & Ghahramani (2016) - Variational LSTM (medium)	Test perplexity	79.7	—	Unverified
7	LSTM (Bai et al., 2018)	Test perplexity	78.93	—	Unverified
8	Zaremba et al. (2014) - LSTM (large)	Test perplexity	78.4	—	Unverified
9	Gal & Ghahramani (2016) - Variational LSTM (large)	Test perplexity	75.2	—	Unverified
10	Inan et al. (2016) - Variational RHN	Test perplexity	66	—	Unverified

enwik8

#	Model	Metric	Claimed	Verified	Status
1	LSTM (7 layers)	Bit per Character (BPC)	1.67	—	Unverified
2	Hypernetworks	Bit per Character (BPC)	1.34	—	Unverified
3	SHA-LSTM (4 layers, h=1024, no attention head)	Bit per Character (BPC)	1.33	—	Unverified
4	LN HM-LSTM	Bit per Character (BPC)	1.32	—	Unverified
5	ByteNet	Bit per Character (BPC)	1.31	—	Unverified
6	Recurrent Highway Networks	Bit per Character (BPC)	1.27	—	Unverified
7	Large FS-LSTM-4	Bit per Character (BPC)	1.25	—	Unverified
8	Large mLSTM	Bit per Character (BPC)	1.24	—	Unverified
9	AWD-LSTM (3 layers)	Bit per Character (BPC)	1.23	—	Unverified
10	Cluster-Former (#C=512)	Bit per Character (BPC)	1.22	—	Unverified
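Bits per character (BPC) is the character-level counterpart of perplexity: the average negative base-2 log-probability the model assigns to each character. A minimal sketch, assuming per-character probabilities are given as a list (function names and example values are hypothetical):

```python
import math

def bits_per_character(char_probs):
    """Average number of bits needed to encode each character under the model."""
    if not char_probs:
        raise ValueError("need at least one character probability")
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)

def bpc_to_char_perplexity(bpc):
    """Per-character perplexity is 2 raised to the BPC value."""
    return 2.0 ** bpc

# Assigning probability 0.5 to every observed character costs exactly 1 bit each.
bpc_half = bits_per_character([0.5] * 4)
ppl_half = bpc_to_char_perplexity(bpc_half)
```

Because the conversion is a monotone function, ranking models by BPC gives the same order as ranking them by per-character perplexity.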

The Pile

#	Model	Metric	Claimed	Verified	Status
1	Smaller Transformer 126M (pre-trained)	Test perplexity	33	—	Unverified
2	OPT 125M	Test perplexity	32.26	—	Unverified
3	Larger Transformer 771M (pre-trained)	Test perplexity	28.1	—	Unverified
4	OPT 1.3B	Test perplexity	19.55	—	Unverified
5	GPT-Neo 125M	Test perplexity	17.83	—	Unverified
6	OPT 2.7B	Test perplexity	17.81	—	Unverified
7	Smaller Transformer 126M (fine-tuned)	Test perplexity	12	—	Unverified
8	GPT-Neo 1.3B	Test perplexity	11.46	—	Unverified
9	Transformer 125M	Test perplexity	10.7	—	Unverified
10	GPT-Neo 2.7B	Test perplexity	10.44	—	Unverified