SOTAVerified

Language Modelling

A language model is a model of natural language. Language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation (generating more human-like text), optical character recognition, route optimization, handwriting recognition, grammar induction, and information retrieval.

Large language models (LLMs), currently their most advanced form, are predominantly based on transformers trained on larger datasets (frequently using words scraped from the public internet). They have superseded recurrent neural network-based models, which had previously superseded purely statistical models such as the word n-gram language model.

Source: Wikipedia
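
To make the "purely statistical" family mentioned above concrete, here is a minimal sketch of a word bigram (2-gram) model. It is illustrative only: the toy corpus, the `<s>`/`</s>` boundary markers, and the function names `train_bigram` and `bigram_prob` are assumptions made for this example. The sketch uses plain maximum-likelihood counts with no smoothing, which a practical model would normally need.

```python
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count word bigrams; returns {previous_word: Counter(next_word)}."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # Hypothetical boundary markers so sentence starts and ends are modelled too.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def bigram_prob(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 for unseen contexts."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return following[nxt] / total if total else 0.0

# Toy corpus, made up for illustration.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram(corpus)
print(bigram_prob(model, "the", "cat"))  # "cat" follows "the" in 2 of 3 sentences
```

Because unseen bigrams get probability zero here, such a model cannot score text containing any word pair missing from its training data. Smoothing, and later the neural models described above, address that limitation.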

Papers

Showing 6451-6500 of 17610 papers

Title | Status | Hype
Automated Story Generation as Question-Answering | | 0
Domain Adaptation for Code Model-based Unit Test Case Generation | | 0
Automated Testing of COBOL to Java Transformation | | 0
Automated Text Mining of Experimental Methodologies from Biomedical Literature | | 0
Automated Theorem Provers Help Improve Large Language Model Reasoning | | 0
Automated User Story Generation with Test Case Specification Using Large Language Model | | 0
Automated Word Prediction in Bangla Language Using Stochastic Language Models | | 0
Automatically Detecting Online Deceptive Patterns in Real-time | | 0
Automatically Generating Rhythmic Verse with Neural Networks | | 0
Automatically Generating Rules of Malicious Software Packages via Large Language Model | | 0
Automatic Argument Quality Assessment -- New Datasets and Methods | | 0
Automatic Argument Quality Assessment - New Datasets and Methods | | 0
Automatic Assessment of Divergent Thinking in Chinese Language with TransDis: A Transformer-Based Language Model Approach | | 0
Automatic Assessment of Oral Reading Accuracy for Reading Diagnostics | | 0
Automatic Assistance for Academic Word Usage | | 0
Automatic Business Process Structure Discovery using Ordered Neurons LSTM: A Preliminary Study | | 0
Pareto Optimal Learning for Estimating Large Language Model Errors | | 0
Automatic Chord Recognition with Higher-Order Harmonic Language Modelling | | 0
Automatic coding of students' writing via Contrastive Representation Learning in the Wasserstein space | | 0
Automatic Conditional Generation of Personalized Social Media Short Texts | | 0
Automatic Construction of Discourse Corpora for Dialogue Translation | | 0
Automatic Control With Human-Like Reasoning: Exploring Language Model Embodied Air Traffic Agents | | 0
Automatic conversion of colloquial Finnish to standard Finnish | | 0
Automatic Correction of Arabic Text: a Cascaded Approach | | 0
Automatic Data Expansion for Customer-care Spoken Language Understanding | | 0
Automatic Demonstration Selection for LLM-based Tabular Data Classification | | 0
Automatic Detection of Borrowings in Low-Resource Languages of the Caucasus: Andic branch | | 0
Automatic detection of diseases in Spanish clinical notes combining medical language models and ontologies | | 0
Automatic Dialect Density Estimation for African American English | | 0
Automatic Documentation of ICD Codes with Far-Field Speech Recognition | | 0
Automatic Extraction of Personality from Text: Challenges and Opportunities | | 0
Automatic Extraction of Synonyms for German Particle Verbs from Parallel Data with Distributional Similarity as a Re-Ranking Feature | | 0
Automatic Feature Learning for Essence: a Case Study on Car Sequencing | | 0
Automatic Generation of Programming Exercises and Code Explanations using Large Language Models | | 0
ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks | | 0
Automatic High-quality Verilog Assertion Generation through Subtask-Focused Fine-Tuned LLMs and Iterative Prompting | | 0
Automatic Identification of Arabic Language Varieties and Dialects in Social Media | | 0
Automatic Identification of Rhetorical Questions | | 0
Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models | | 0
Automatic Item Generation for Personality Situational Judgment Tests with Large Language Models | | 0
Automatic Knowledge Augmentation for Generative Commonsense Reasoning | | 0
Automatic language identity tagging on word and sentence-level in multilingual text sources: a case-study on Luxembourgish | | 0
Automatic Learning of Subword Dependent Model Scales | | 0
Automatic Long Audio Alignment and Confidence Scoring for Conversational Arabic Speech | | 0
Automatic Machine Translation Evaluation using Source Language Inputs and Cross-lingual Language Model | | 0
Automatic Multi-Label Prompting: Simple and Interpretable Few-Shot Classification | | 0
Automatic Myanmar Image Captioning using CNN and LSTM-Based Language Model | | 0
Automatic Nominalization of Clauses | | 0
Automatic, Personalized, and Flexible Playlist Generation using Reinforcement Learning | | 0
Automatic Poetry Generation from Prosaic Text | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Decay RNN | Validation perplexity | 76.67 | | Unverified
2 | GRU | Validation perplexity | 53.78 | | Unverified
3 | LSTM | Validation perplexity | 52.73 | | Unverified
4 | LSTM | Test perplexity | 48.7 | | Unverified
5 | Temporal CNN | Test perplexity | 45.2 | | Unverified
6 | TCN | Test perplexity | 45.19 | | Unverified
7 | GCNN-8 | Test perplexity | 44.9 | | Unverified
8 | Neural cache model (size = 100) | Test perplexity | 44.8 | | Unverified
9 | Neural cache model (size = 2,000) | Test perplexity | 40.8 | | Unverified
10 | GPT-2 Small | Test perplexity | 37.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TCN | Test perplexity | 108.47 | | Unverified
2 | Seq-U-Net | Test perplexity | 107.95 | | Unverified
3 | GRU (Bai et al., 2018) | Test perplexity | 92.48 | | Unverified
4 | R-Transformer | Test perplexity | 84.38 | | Unverified
5 | Zaremba et al. (2014) - LSTM (medium) | Test perplexity | 82.7 | | Unverified
6 | Gal & Ghahramani (2016) - Variational LSTM (medium) | Test perplexity | 79.7 | | Unverified
7 | LSTM (Bai et al., 2018) | Test perplexity | 78.93 | | Unverified
8 | Zaremba et al. (2014) - LSTM (large) | Test perplexity | 78.4 | | Unverified
9 | Gal & Ghahramani (2016) - Variational LSTM (large) | Test perplexity | 75.2 | | Unverified
10 | Inan et al. (2016) - Variational RHN | Test perplexity | 66 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LSTM (7 layers) | Bit per Character (BPC) | 1.67 | | Unverified
2 | Hypernetworks | Bit per Character (BPC) | 1.34 | | Unverified
3 | SHA-LSTM (4 layers, h=1024, no attention head) | Bit per Character (BPC) | 1.33 | | Unverified
4 | LN HM-LSTM | Bit per Character (BPC) | 1.32 | | Unverified
5 | ByteNet | Bit per Character (BPC) | 1.31 | | Unverified
6 | Recurrent Highway Networks | Bit per Character (BPC) | 1.27 | | Unverified
7 | Large FS-LSTM-4 | Bit per Character (BPC) | 1.25 | | Unverified
8 | Large mLSTM | Bit per Character (BPC) | 1.24 | | Unverified
9 | AWD-LSTM (3 layers) | Bit per Character (BPC) | 1.23 | | Unverified
10 | Cluster-Former (#C=512) | Bit per Character (BPC) | 1.22 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Smaller Transformer 126M (pre-trained) | Test perplexity | 33 | | Unverified
2 | OPT 125M | Test perplexity | 32.26 | | Unverified
3 | Larger Transformer 771M (pre-trained) | Test perplexity | 28.1 | | Unverified
4 | OPT 1.3B | Test perplexity | 19.55 | | Unverified
5 | GPT-Neo 125M | Test perplexity | 17.83 | | Unverified
6 | OPT 2.7B | Test perplexity | 17.81 | | Unverified
7 | Smaller Transformer 126M (fine-tuned) | Test perplexity | 12 | | Unverified
8 | GPT-Neo 1.3B | Test perplexity | 11.46 | | Unverified
9 | Transformer 125M | Test perplexity | 10.7 | | Unverified
10 | GPT-Neo 2.7B | Test perplexity | 10.44 | | Unverified
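
The tables above report perplexity and bits per character (BPC). For both metrics, lower values mean the model assigns higher probability to the evaluation text. The following is a minimal sketch of how the two quantities are conventionally computed from per-token or per-character log-probabilities. The function names and the example probabilities are assumptions for illustration; the listed results may use different tokenization or evaluation details not shown on this page.

```python
import math

def perplexity(token_log_probs):
    """exp of the mean negative natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def bits_per_character(char_log_probs):
    """Mean negative log2 probability per character (inputs are natural logs)."""
    return -sum(char_log_probs) / (len(char_log_probs) * math.log(2))

# Hypothetical case: a model assigns probability 0.25 to each of four symbols.
log_probs = [math.log(0.25)] * 4
print(perplexity(log_probs))          # approximately 4: as uncertain as a uniform 4-way choice
print(bits_per_character(log_probs))  # approximately 2: two bits encode one of four options
```

The two metrics are related by a change of base and unit: a per-character perplexity equals 2 raised to the BPC. Word-level perplexity and character-level BPC are still not directly comparable, because they normalize by different unit counts.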