Language Modelling

A language model is a probabilistic model of natural language: it assigns probabilities to sequences of words or other tokens. Language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation (generating more human-like text), optical character recognition, route optimization, handwriting recognition, grammar induction, and information retrieval.

Large language models (LLMs), currently the most advanced form of language model, are predominantly based on transformers trained on larger datasets (frequently using words scraped from the public internet). They have superseded recurrent neural network-based models, which had previously superseded purely statistical models such as the word n-gram language model.

Source: Wikipedia
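
As an illustration of the purely statistical models mentioned above, the following is a minimal sketch of a word bigram language model with add-one smoothing. The function names, the boundary markers `<s>` and `</s>`, and the toy corpus are assumptions made for this example and do not come from any particular paper or library. The sketch also assumes that every word scored at test time appears in the training vocabulary.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences.

    Each sentence is padded with a start marker "<s>" and an end marker "</s>".
    """
    contexts, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + list(tokens) + ["</s>"]
        contexts.update(padded[:-1])                  # every token that has a successor
        bigrams.update(zip(padded[:-1], padded[1:]))  # (previous word, next word) pairs
    # Words the model can predict: all training words plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return contexts, bigrams, vocab


def bigram_prob(prev, word, contexts, bigrams, vocab):
    """P(word | prev) with add-one (Laplace) smoothing."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))


def sentence_perplexity(tokens, contexts, bigrams, vocab):
    """exp of the average negative log-probability per predicted token."""
    padded = ["<s>"] + list(tokens) + ["</s>"]
    pairs = list(zip(padded[:-1], padded[1:]))
    log_prob = sum(math.log(bigram_prob(p, w, contexts, bigrams, vocab)) for p, w in pairs)
    return math.exp(-log_prob / len(pairs))


# Toy corpus, for illustration only.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
contexts, bigrams, vocab = train_bigram(corpus)

seen = sentence_perplexity(["the", "cat", "sat"], contexts, bigrams, vocab)
scrambled = sentence_perplexity(["sat", "cat", "the"], contexts, bigrams, vocab)
print(seen < scrambled)  # the word order seen in training gets the lower perplexity
```

Add-one smoothing is used here only because it is the simplest way to avoid zero probabilities; practical n-gram models typically rely on more refined smoothing schemes.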

Papers

Showing 16951–17000 of 17610 papers

Title	Date	Tasks	Status
Learning Lexical Embeddings with Syntactic and Lexicographic Knowledge	Jul 1, 2015	Chunking, Language Modelling	Unverified
The Fixed-Size Ordinally-Forgetting Encoding Method for Neural Network Language Models	Jul 1, 2015	Information Retrieval, Language Modelling	Unverified
Learning Cross-lingual Word Embeddings via Matrix Co-factorization	Jul 1, 2015	Cross-Lingual Document Classification, Cross-Lingual Word Embeddings	Unverified
Tackling Sparsity, the Achilles Heel of Social Networks: Language Model Smoothing via Social Regularization	Jul 1, 2015	Language Modeling, Language Modelling	Unverified
Generative Incremental Dependency Parsing with Neural Networks	Jul 1, 2015	Dependency Parsing, Language Modelling	Unverified
Automatic Identification of Rhetorical Questions	Jul 1, 2015	Document Summarization, Language Modelling	Unverified
Improving Pivot Translation by Remembering the Pivot	Jul 1, 2015	Language Modelling, Machine Translation	Unverified
Bilingual Word Embeddings from Non-Parallel Document-Aligned Data Applied to Bilingual Lexicon Induction	Jul 1, 2015	Bilingual Lexicon Induction, Language Modelling	Unverified
Entity Retrieval via Entity Factoid Hierarchy	Jul 1, 2015	Entity Retrieval, Language Modelling	Unverified
Inducing Word and Part-of-Speech with Pitman-Yor Hidden Semi-Markov Models	Jul 1, 2015	Chinese Word Segmentation, Language Modelling	Unverified
genCNN: A Convolutional Architecture for Word Sequence Prediction	Jul 1, 2015	Language Modelling, Machine Translation	Unverified
Deep Markov Neural Network for Sequential Data Classification	Jul 1, 2015	Classification, General Classification	Unverified
Language Identification and Modeling in Specialized Hardware	Jul 1, 2015	Language Identification, Language Modelling	Unverified
Learning Semantic Word Embeddings based on Ordinal Knowledge Constraints	Jul 1, 2015	Language Modelling, Machine Translation	Unverified
Non-Linear Text Regression with a Deep Convolutional Neural Network	Jul 1, 2015	Feature Engineering, Language Modelling	Unverified
Trans-dimensional Random Fields for Language Modeling	Jul 1, 2015	Information Retrieval, Language Modeling	Unverified
Vector-space calculation of semantic surprisal for predicting word pronunciation duration	Jul 1, 2015	Language Modelling	Unverified
Unsupervised Prediction of Acceptability Judgements	Jul 1, 2015	Language Modelling, Machine Translation	Unverified
Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals	Jun 22, 2015	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Recognize Foreign Low-Frequency Words with Similar Pairs	Jun 16, 2015	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Author Identification using Multi-headed Recurrent Neural Networks	Jun 16, 2015	Language Modeling, Language Modelling	Code Available
A Bayesian Model for Generative Transition-based Dependency Parsing	Jun 13, 2015	Dependency Parsing, Language Modeling	Unverified
Modeling Order in Neural Word Embeddings at Scale	Jun 8, 2015	Language Modeling, Language Modelling	Unverified
A Hybrid Model for Enhancing Lexical Statistical Machine Translation (SMT)	Jun 3, 2015	Language Modeling, Language Modelling	Unverified
Personalizing Universal Recurrent Neural Network Language Model with User Characteristic Features by Social Network Crowdsouring	Jun 3, 2015	Language Modeling, Language Modelling	Unverified
Modeling fMRI time courses with linguistic structure at various grain sizes	Jun 1, 2015	Language Modelling	Unverified
Predicting Prepositions for SMT	Jun 1, 2015	Language Modelling, Machine Translation	Unverified
Morpho-syntactic Regularities in Continuous Word Representations: A multilingual study.	Jun 1, 2015	Language Modelling	Unverified
Audience size and contextual effects on information density in Twitter conversations	Jun 1, 2015	Language Modelling	Unverified
Dependency Link Embeddings: Continuous Representations of Syntactic Substructures	Jun 1, 2015	Dependency Parsing, Language Modelling	Unverified
Candidate evaluation strategies for improved difficulty prediction of language tests	Jun 1, 2015	Language Modelling	Unverified
Vector Space Models for Scientific Document Summarization	Jun 1, 2015	Dimensionality Reduction, Document Summarization	Unverified
Utility-based evaluation metrics for models of language acquisition: A look at speech segmentation	Jun 1, 2015	Language Acquisition, Language Modelling	Unverified
UNITN: Training Deep Convolutional Neural Network for Twitter Sentiment Classification	Jun 1, 2015	Classification, General Classification	Unverified
Voltron: A Hybrid System For Answer Validation Based On Lexical And Distance Features	Jun 1, 2015	Answer Selection, Community Question Answering	Unverified
Leveraging Preposition Ambiguity to Assess Compositional Distributional Models of Semantics	Jun 1, 2015	Language Modelling	Unverified
Graph-based Coherence Modeling For Assessing Readability	Jun 1, 2015	Language Modelling, Question Answering	Unverified
Cache-Augmented Latent Topic Language Models for Speech Retrieval	Jun 1, 2015	Language Modelling, Retrieval	Unverified
Sequence-to-Sequence Neural Net Models for Grapheme-to-Phoneme Conversion	May 31, 2015	Grapheme-to-Phoneme Conversion, Image Captioning	Unverified
The IBM 2015 English Conversational Telephone Speech Recognition System	May 21, 2015	Language Modeling, Language Modelling	Unverified
Location Prediction of Social Images via Generative Model	May 15, 2015	Language Modeling, Language Modelling	Unverified
Language Models for Image Captioning: The Quirks and What Works	May 7, 2015	Image Captioning, Language Modeling	Unverified
Sequence to Sequence -- Video to Text	May 3, 2015	Caption Generation, Language Modeling	Code Available
Highway Networks	May 3, 2015	Language Modelling	Code Available
Dynamic Terminology Integration Methods in Statistical Machine Translation	May 1, 2015	Domain Adaptation, Language Modelling	Unverified
Document-Level Machine Translation with Word Vector Models	May 1, 2015	Document Level Machine Translation, Language Modelling	Unverified
Unsupervised training of maximum-entropy models for lexical selection in rule-based machine translation	May 1, 2015	Language Modelling, Machine Translation	Unverified
Target-Side Generation of Prepositions for SMT	May 1, 2015	Language Modelling, Machine Translation	Unverified
Smart Computer Aided Translation Environment - SCATE	May 1, 2015	Language Modelling, Speech Recognition	Unverified
Automatic conversion of colloquial Finnish to standard Finnish	May 1, 2015	Language Modelling, Machine Translation	Unverified

Datasets: WikiText-103, Penn Treebank (Word Level), enwik8, The Pile, WikiText-2, LAMBADA, One Billion Word, Text8, Penn Treebank (Character Level), Hutter Prize, OpenWebText, SALMon

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Decay RNN	Validation perplexity	76.67	—	Unverified
2	GRU	Validation perplexity	53.78	—	Unverified
3	LSTM	Validation perplexity	52.73	—	Unverified
4	LSTM	Test perplexity	48.7	—	Unverified
5	Temporal CNN	Test perplexity	45.2	—	Unverified
6	TCN	Test perplexity	45.19	—	Unverified
7	GCNN-8	Test perplexity	44.9	—	Unverified
8	Neural cache model (size = 100)	Test perplexity	44.8	—	Unverified
9	Neural cache model (size = 2,000)	Test perplexity	40.8	—	Unverified
10	GPT-2 Small	Test perplexity	37.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TCN	Test perplexity	108.47	—	Unverified
2	Seq-U-Net	Test perplexity	107.95	—	Unverified
3	GRU (Bai et al., 2018)	Test perplexity	92.48	—	Unverified
4	R-Transformer	Test perplexity	84.38	—	Unverified
5	Zaremba et al. (2014) - LSTM (medium)	Test perplexity	82.7	—	Unverified
6	Gal & Ghahramani (2016) - Variational LSTM (medium)	Test perplexity	79.7	—	Unverified
7	LSTM (Bai et al., 2018)	Test perplexity	78.93	—	Unverified
8	Zaremba et al. (2014) - LSTM (large)	Test perplexity	78.4	—	Unverified
9	Gal & Ghahramani (2016) - Variational LSTM (large)	Test perplexity	75.2	—	Unverified
10	Inan et al. (2016) - Variational RHN	Test perplexity	66	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LSTM (7 layers)	Bit per Character (BPC)	1.67	—	Unverified
2	Hypernetworks	Bit per Character (BPC)	1.34	—	Unverified
3	SHA-LSTM (4 layers, h=1024, no attention head)	Bit per Character (BPC)	1.33	—	Unverified
4	LN HM-LSTM	Bit per Character (BPC)	1.32	—	Unverified
5	ByteNet	Bit per Character (BPC)	1.31	—	Unverified
6	Recurrent Highway Networks	Bit per Character (BPC)	1.27	—	Unverified
7	Large FS-LSTM-4	Bit per Character (BPC)	1.25	—	Unverified
8	Large mLSTM	Bit per Character (BPC)	1.24	—	Unverified
9	AWD-LSTM (3 layers)	Bit per Character (BPC)	1.23	—	Unverified
10	Cluster-Former (#C=512)	Bit per Character (BPC)	1.22	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Smaller Transformer 126M (pre-trained)	Test perplexity	33	—	Unverified
2	OPT 125M	Test perplexity	32.26	—	Unverified
3	Larger Transformer 771M (pre-trained)	Test perplexity	28.1	—	Unverified
4	OPT 1.3B	Test perplexity	19.55	—	Unverified
5	GPT-Neo 125M	Test perplexity	17.83	—	Unverified
6	OPT 2.7B	Test perplexity	17.81	—	Unverified
7	Smaller Transformer 126M (fine-tuned)	Test perplexity	12	—	Unverified
8	GPT-Neo 1.3B	Test perplexity	11.46	—	Unverified
9	Transformer 125M	Test perplexity	10.7	—	Unverified
10	GPT-Neo 2.7B	Test perplexity	10.44	—	Unverified
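
The benchmark tables above report perplexity and bits per character (BPC); for both metrics, lower is better. The following is a minimal sketch of how these two quantities are commonly computed from the probabilities a model assigns to the correct next token or character. The function names and the example probabilities are assumptions made for illustration and are not taken from any of the listed papers.

```python
import math


def perplexity_from_probs(token_probs):
    """Perplexity: exp of the average negative natural-log probability per token."""
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


def bits_per_character(char_probs):
    """BPC: average negative base-2 log probability per character."""
    return -sum(math.log2(p) for p in char_probs) / len(char_probs)


# A model that spreads probability uniformly over 50 candidate tokens
# has a perplexity of about 50 (illustrative numbers only).
uniform_ppl = perplexity_from_probs([1 / 50] * 10)

# A character model that assigns probability 0.5 to every correct character
# needs one bit per character.
half_bpc = bits_per_character([0.5] * 8)

print(round(uniform_ppl, 6), half_bpc)
```

Because BPC is a base-2 average log loss, raising 2 to the BPC gives the per-character perplexity, so the two metrics measure the same kind of quantity at different granularities.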