Language Modelling
A language model is a probabilistic model of natural language. Language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation (producing human-like text), optical character recognition, route optimization, handwriting recognition, grammar induction, and information retrieval.
Large language models (LLMs), currently their most advanced form, are predominantly based on transformers trained on large datasets (frequently using text scraped from the public internet). They have superseded recurrent neural network-based models, which had in turn superseded purely statistical models such as the word n-gram language model.
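To make the contrast concrete, a word n-gram model of the statistical kind mentioned above can be sketched in a few lines. This is a minimal toy illustration, not any particular published system; the corpus and function names are hypothetical, and the model uses simple add-one (Laplace) smoothing.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus; real models are trained on far larger data.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

vocab = set(corpus)
unigram_counts = Counter(corpus)
bigram_counts = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    bigram_counts[prev][word] += 1

def bigram_prob(prev, word):
    # P(word | prev) with add-one smoothing over the vocabulary,
    # so unseen pairs still receive nonzero probability.
    return (bigram_counts[prev][word] + 1) / (unigram_counts[prev] + len(vocab))
```

For example, `bigram_prob("sat", "on")` is higher than `bigram_prob("sat", "the")` because "sat on" occurs twice in the corpus while "sat the" never does, and the smoothed probabilities over the vocabulary still sum to 1 for any context word.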
Source: Wikipedia
Papers
Showing 1–10 of 17610 papers
Datasets: All datasets, WikiText-103, Penn Treebank (Word Level), enwik8, The Pile, WikiText-2, LAMBADA, One Billion Word, Text8, Penn Treebank (Character Level), Hutter Prize, OpenWebText, SALMon
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OPT-175B (50% Sparsity) | Test perplexity | 234.77 | — | Unverified |
| 2 | Grave et al. (2016) - LSTM | Test perplexity | 99.3 | — | Unverified |
| 3 | Inan et al. (2016) - Variational LSTM (tied) (h=650) | Test perplexity | 87.7 | — | Unverified |
| 4 | Inan et al. (2016) - Variational LSTM (tied) (h=650) + augmented loss | Test perplexity | 87 | — | Unverified |
| 5 | Grave et al. (2016) - LSTM + continuous cache pointer | Test perplexity | 68.9 | — | Unverified |
| 6 | EGRU | Test perplexity | 68.9 | — | Unverified |
| 7 | Melis et al. (2017) - 1-layer LSTM (tied) | Test perplexity | 65.9 | — | Unverified |
| 8 | AWD-LSTM | Test perplexity | 65.8 | — | Unverified |
| 9 | AWD-LSTM + ATOI | Test perplexity | 64.73 | — | Unverified |
| 10 | AWD-LSTM 3-layer with Fraternal dropout | Test perplexity | 64.1 | — | Unverified |
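The test perplexity reported in the table above is the exponentiated average negative log-likelihood per token, so lower values mean the model assigns higher probability to the held-out text. A minimal sketch of the computation, assuming per-token probabilities from some model are already available:

```python
import math

def perplexity(token_probs):
    # Perplexity = exp of the mean negative log-probability per token.
    n = len(token_probs)
    return math.exp(-sum(math.log(p) for p in token_probs) / n)

# A model that assigns each token probability 1/100 (e.g. a uniform
# distribution over a 100-word vocabulary) has perplexity ~100.
print(perplexity([0.01] * 50))
```

Under this metric, the gap between 234.77 and 64.1 in the table corresponds to a large difference in average per-token probability.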