SOTAVerified

Language Modelling

A language model is a probabilistic model of natural language. Language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation (generating more human-like text), optical character recognition, route optimization, handwriting recognition, grammar induction, and information retrieval.

Large language models (LLMs), currently their most advanced form, are predominantly based on transformers trained on large datasets (frequently using text scraped from the public internet). They have superseded recurrent neural network-based models, which had in turn superseded purely statistical models such as word n-gram language models.

Source: Wikipedia
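The word n-gram models mentioned above estimate the probability of each word from counts of how often it follows the preceding context. A minimal bigram sketch (toy corpus and function names are illustrative, not from any particular library):

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count unigram and bigram frequencies from tokenized sentences."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]  # sentence boundary markers
        unigrams.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            bigrams[prev][cur] += 1
    return unigrams, bigrams

def prob(unigrams, bigrams, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0 for unseen context."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[prev][word] / sum(bigrams[prev].values())

corpus = [["the", "cat", "sat"], ["the", "cat", "ran"]]
uni, bi = train_bigram(corpus)
print(prob(uni, bi, "the", "cat"))  # 1.0: "cat" always follows "the"
print(prob(uni, bi, "cat", "sat"))  # 0.5: "cat" is followed by "sat" or "ran"
```

Real n-gram models add smoothing (e.g. Kneser-Ney) so unseen continuations do not get probability zero; this sketch omits that.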

Papers

Showing 10501-10550 of 17610 papers

Title (every entry below has an empty Status and a Hype score of 0):

Progressive Alignment with VLM-LLM Feature to Augment Defect Classification for the ASE Dataset
Progressive Class Semantic Matching for Semi-supervised Text Classification
Progressively Label Enhancement for Large Language Model Alignment
Projection of Turn Completion in Incremental Spoken Dialogue Systems
Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines
Promises, Outlooks and Challenges of Diffusion Language Modeling
Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection
Prompt-Augmented Linear Probing: Scaling beyond the Limit of Few-shot In-Context Learners
Prompt-Based Bias Calibration for Better Zero/Few-Shot Learning of Language Models
Prompt-based Depth Pruning of Large Language Models
Prompt-based System for Personality and Interpersonal Reactivity Prediction
Prompt-based Visual Alignment for Zero-shot Policy Transfer
Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models
PromptCrafter: Crafting Text-to-Image Prompt through Mixed-Initiative Dialogue with LLM
Prompt Distribution Learning
Prompt, Divide, and Conquer: Bypassing Large Language Model Safety Filters via Segmented and Distributed Prompt Processing
Prompt-driven Universal Model for View-Agnostic Echocardiography Analysis
Prompter: Utilizing Large Language Model Prompting for a Data Efficient Embodied Instruction Following
PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines
Prompt-free and Efficient Language Model Fine-Tuning
Prompt-Guided Generation of Structured Chest X-Ray Report Using a Pre-trained LLM
Prompt-Guided Injection of Conformation to Pre-trained Protein Model
Prompt-Guided Turn-Taking Prediction
PromptInfuser: How Tightly Coupling AI and UI Design Impacts Designers' Workflows
Prompting a Large Language Model to Generate Diverse Motivational Messages: A Comparison with Human-Written Messages
Prompting as Multimodal Fusing
Prompting for a conversation: How to control a dialog model?
Prompting for Multimodal Hateful Meme Classification
Prompting Large Language Model for Machine Translation: A Case Study
Prompting Large Language Models for Supporting the Differential Diagnosis of Anemia
Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition
Prompting Large Language Models With the Socratic Method
Prompting PaLM for Translation: Assessing Strategies and Performance
PromptIntern: Saving Inference Costs by Internalizing Recurrent Prompt during Large Language Model Fine-tuning
Prompt Learning for Domain Adaptation in Task-Oriented Dialogue
A Dual Prompt Learning Framework for Few-Shot Dialogue State Tracking
Prompt-Learning for Fine-Grained Entity Typing
Promptly Yours? A Human Subject Study on Prompt Inference in AI-Generated Art
Prompt Optimization with Logged Bandit Data
Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding
PromptOptMe: Error-Aware Prompt Compression for LLM-based MT Evaluation Metrics
Promptor: A Conversational and Autonomous Prompt Generation Agent for Intelligent Text Entry Techniques
Prompt-oriented Output of Culture-Specific Items in Translated African Poetry by Large Language Model: An Initial Multi-layered Tabular Review
Prompt-Reverse Inconsistency: LLM Self-Inconsistency Beyond Generative Randomness and Prompt Paraphrasing
Prompts as Auto-Optimized Training Hyperparameters: Training Best-in-Class IR Models from Scratch with 10 Gold Labels
Prompt Selection and Augmentation for Few Examples Code Generation in Large Language Model and its Application in Robotics Control
PromptSum: Parameter-Efficient Controllable Abstractive Summarization
Prompt-to-OS (P2OS): Revolutionizing Operating Systems and Human-Computer Interaction with Integrated AI Generative Models
PromptTTS 2: Describing and Generating Voices with Text Prompt
Page 211 of 353

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Decay RNN | Validation perplexity | 76.67 | - | Unverified
2 | GRU | Validation perplexity | 53.78 | - | Unverified
3 | LSTM | Validation perplexity | 52.73 | - | Unverified
4 | LSTM | Test perplexity | 48.7 | - | Unverified
5 | Temporal CNN | Test perplexity | 45.2 | - | Unverified
6 | TCN | Test perplexity | 45.19 | - | Unverified
7 | GCNN-8 | Test perplexity | 44.9 | - | Unverified
8 | Neural cache model (size = 100) | Test perplexity | 44.8 | - | Unverified
9 | Neural cache model (size = 2,000) | Test perplexity | 40.8 | - | Unverified
10 | GPT-2 Small | Test perplexity | 37.5 | - | Unverified
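The perplexity figures in these tables are the exponential of the model's average negative log-likelihood per token on the evaluation set (lower is better). A minimal sketch of the computation, assuming per-token log-probabilities in nats:

```python
import math

def perplexity(log_probs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    nll = -sum(log_probs) / len(log_probs)
    return math.exp(nll)

# Sanity check: a model that is uniform over a 50-word vocabulary
# assigns each token probability 1/50, giving perplexity 50.
lp = [math.log(1 / 50)] * 100
print(round(perplexity(lp), 6))  # 50.0
```

Intuitively, a test perplexity of 37.5 means the model is, on average, as uncertain as if it were choosing uniformly among 37.5 equally likely next words.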
# | Model | Metric | Claimed | Verified | Status
1 | TCN | Test perplexity | 108.47 | - | Unverified
2 | Seq-U-Net | Test perplexity | 107.95 | - | Unverified
3 | GRU (Bai et al., 2018) | Test perplexity | 92.48 | - | Unverified
4 | R-Transformer | Test perplexity | 84.38 | - | Unverified
5 | Zaremba et al. (2014) - LSTM (medium) | Test perplexity | 82.7 | - | Unverified
6 | Gal & Ghahramani (2016) - Variational LSTM (medium) | Test perplexity | 79.7 | - | Unverified
7 | LSTM (Bai et al., 2018) | Test perplexity | 78.93 | - | Unverified
8 | Zaremba et al. (2014) - LSTM (large) | Test perplexity | 78.4 | - | Unverified
9 | Gal & Ghahramani (2016) - Variational LSTM (large) | Test perplexity | 75.2 | - | Unverified
10 | Inan et al. (2016) - Variational RHN | Test perplexity | 66 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | LSTM (7 layers) | Bits per character (BPC) | 1.67 | - | Unverified
2 | Hypernetworks | Bits per character (BPC) | 1.34 | - | Unverified
3 | SHA-LSTM (4 layers, h=1024, no attention head) | Bits per character (BPC) | 1.33 | - | Unverified
4 | LN HM-LSTM | Bits per character (BPC) | 1.32 | - | Unverified
5 | ByteNet | Bits per character (BPC) | 1.31 | - | Unverified
6 | Recurrent Highway Networks | Bits per character (BPC) | 1.27 | - | Unverified
7 | Large FS-LSTM-4 | Bits per character (BPC) | 1.25 | - | Unverified
8 | Large mLSTM | Bits per character (BPC) | 1.24 | - | Unverified
9 | AWD-LSTM (3 layers) | Bits per character (BPC) | 1.23 | - | Unverified
10 | Cluster-Former (#C=512) | Bits per character (BPC) | 1.22 | - | Unverified
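Bits per character (BPC) is the base-2 cross-entropy per character, the standard metric for character-level language modeling; per-character perplexity is 2 to the power of BPC. A small conversion sketch (function names are illustrative):

```python
import math

def bpc_to_char_perplexity(bpc):
    """BPC is base-2 cross-entropy per character, so perplexity = 2 ** BPC."""
    return 2 ** bpc

def nats_to_bpc(nats_per_char):
    """Convert per-character cross-entropy from nats (natural log) to bits."""
    return nats_per_char / math.log(2)

# The best entry in the table above, 1.22 BPC, corresponds to a
# per-character perplexity of roughly 2.33.
print(bpc_to_char_perplexity(1.22))
```

BPC numbers are much smaller than word-level perplexities because a character alphabet is far smaller than a word vocabulary, so the two metrics are not directly comparable.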
# | Model | Metric | Claimed | Verified | Status
1 | Smaller Transformer 126M (pre-trained) | Test perplexity | 33 | - | Unverified
2 | OPT 125M | Test perplexity | 32.26 | - | Unverified
3 | Larger Transformer 771M (pre-trained) | Test perplexity | 28.1 | - | Unverified
4 | OPT 1.3B | Test perplexity | 19.55 | - | Unverified
5 | GPT-Neo 125M | Test perplexity | 17.83 | - | Unverified
6 | OPT 2.7B | Test perplexity | 17.81 | - | Unverified
7 | Smaller Transformer 126M (fine-tuned) | Test perplexity | 12 | - | Unverified
8 | GPT-Neo 1.3B | Test perplexity | 11.46 | - | Unverified
9 | Transformer 125M | Test perplexity | 10.7 | - | Unverified
10 | GPT-Neo 2.7B | Test perplexity | 10.44 | - | Unverified