RWKV: Reinventing RNNs for the Transformer Era May 22, 2023 Computational Efficiency Natural Language Inference Code Available 65
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale Aug 15, 2022 GPU Language Modelling Code Available 55
Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation Feb 28, 2024 Attribute Extractive Question-Answering Code Available 45
TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models May 18, 2023 Natural Language Inference Synthetic Data Generation Code Available 45
AlignScore: Evaluating Factual Consistency with a Unified Alignment Function May 26, 2023 Fact Verification Information Retrieval Code Available 45
N-Grammer: Augmenting Transformers with latent n-grams Jul 13, 2022 Common Sense Reasoning Coreference Resolution Code Available 45
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective Oct 16, 2022 Coreference Resolution Multiple-choice Code Available 45
ERNIE 2.0: A Continual Pre-training Framework for Language Understanding Jul 29, 2019 Chinese Named Entity Recognition Chinese Reading Comprehension Code Available 35
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Oct 11, 2018 Citation Intent Classification Common Sense Reasoning Code Available 35
Pre-Training with Whole Word Masking for Chinese BERT Jun 19, 2019 Document Classification General Classification Code Available 35
Finetuned Language Models Are Zero-Shot Learners Sep 3, 2021 ARC Common Sense Reasoning Code Available 35
ERNIE: Enhanced Representation through Knowledge Integration Apr 19, 2019 Chinese Named Entity Recognition Chinese Sentence Pair Classification Code Available 35
ST-MoE: Designing Stable and Transferable Sparse Expert Models Feb 17, 2022 ARC Common Sense Reasoning Code Available 35
Language Models are Few-Shot Learners May 28, 2020 answerability prediction Articles Code Available 35
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions Apr 27, 2023 Common Sense Reasoning Coreference Resolution Code Available 25
Order Constraints in Optimal Transport Oct 14, 2021 Natural Language Inference Code Available 25
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model Aug 2, 2022 Causal Language Modeling Common Sense Reasoning Code Available 25
ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models Jul 5, 2024 Hallucination Long Form Question Answering Code Available 25
BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks May 26, 2023 Image Captioning Medical Visual Question Answering Code Available 25
PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain Oct 22, 2023 Dialogue Generation Dialogue Understanding Code Available 25
Benchmarking Zero-shot Text Classification: Datasets, Evaluation and Entailment Approach Aug 31, 2019 Articles Benchmarking Code Available 25
Scientific QA System with Verifiable Answers Jul 16, 2024 Articles Information Retrieval Code Available 25
PaLM: Scaling Language Modeling with Pathways Apr 5, 2022 Auto Debugging Code Generation Code Available 25
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations Sep 26, 2019 Common Sense Reasoning GPU Code Available 25
ModuLoRA: Finetuning 2-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers Sep 28, 2023 GPU Instruction Following Code Available 25
SimCSE: Simple Contrastive Learning of Sentence Embeddings Apr 18, 2021 Contrastive Learning Data Augmentation Code Available 25
I-BERT: Integer-only BERT Quantization Jan 5, 2021 GPU Natural Language Inference Code Available 25
Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale Mar 13, 2024 Constituency Grammar Induction Language Modeling Code Available 25
mGPT: Few-Shot Learners Go Multilingual Apr 15, 2022 Cross-Lingual Natural Language Inference Cross-Lingual Paraphrase Identification Code Available 25
Hungry Hungry Hippos: Towards Language Modeling with State Space Models Dec 28, 2022 8k Coreference Resolution Code Available 25
Ask Me Anything: A simple strategy for prompting language models Oct 5, 2022 Coreference Resolution Natural Language Inference Code Available 25
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing Nov 18, 2021 Language Modeling Language Modelling Code Available 25
DeBERTa: Decoding-enhanced BERT with Disentangled Attention Jun 5, 2020 Common Sense Reasoning Coreference Resolution Code Available 25
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer Oct 23, 2019 Answer Generation Common Sense Reasoning Code Available 25
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning May 23, 2023 Common Sense Reasoning Common Sense Reasoning (Zero-Shot) Code Available 25
Chain of Natural Language Inference for Reducing Large Language Model Ungrounded Hallucinations Oct 6, 2023 Hallucination Language Modeling Code Available 15
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models Jul 14, 2020 Diversity Multi-Task Learning Code Available 15
CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters Oct 20, 2020 Clinical Concept Extraction Drug–drug Interaction Extraction Code Available 15
Can NLI Provide Proper Indirect Supervision for Low-resource Biomedical Relation Extraction? Dec 21, 2022 Multi-class Classification Natural Language Inference Code Available 15
Can NLI Models Verify QA Systems’ Predictions? Nov 1, 2021 Natural Language Inference Question Answering Code Available 15
CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark Jun 15, 2021 Intent Classification Medical Concept Normalization Code Available 15
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization Jun 23, 2021 Inductive Bias Linguistic Acceptability Code Available 15
Can Explanations Be Useful for Calibrating Black Box Models? Oct 14, 2021 Extractive Question-Answering Few-Shot Learning Code Available 15
Are self-explanations from Large Language Models faithful? Jan 15, 2024 counterfactual Faithfulness Critic Code Available 15
Calibration of Pre-trained Transformers Mar 17, 2020 Natural Language Inference Code Available 15
Analyzing Multi-Task Learning for Abstractive Text Summarization Oct 26, 2022 Abstractive Text Summarization Multi-Task Learning Code Available 15
A Comparative Study of Pretrained Language Models for Long Clinical Text Jan 27, 2023 Clinical Knowledge Document Classification Code Available 15
CALM : A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias Aug 24, 2023 Diversity Language Modeling Code Available 15
Building Efficient Universal Classifiers with Natural Language Inference Dec 29, 2023 Classification Natural Language Inference Code Available 15
Can NLI Models Verify QA Systems' Predictions? Apr 18, 2021 Natural Language Inference Question Answering Code Available 15