Natural Language Inference

Natural language inference (NLI) is the task of determining whether a "hypothesis" is true (entailment), false (contradiction), or undetermined (neutral) given a "premise".

Example:

| Premise | Label | Hypothesis |
| --- | --- | --- |
| A man inspects the uniform of a figure in some East Asian country. | contradiction | The man is sleeping. |
| An older and younger man smiling. | neutral | Two men are smiling and laughing at the cats playing on the floor. |
| A soccer game with multiple males playing. | entailment | Some men are playing a sport. |
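
The three-way labelling in the example table can be sketched as a small data structure. This is only an illustration: the names `Label`, `NLIExample`, `EXAMPLES`, and `describe` are hypothetical and do not come from any particular dataset loader or library.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    """The three NLI classes described above."""
    ENTAILMENT = "entailment"
    CONTRADICTION = "contradiction"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class NLIExample:
    """One premise/hypothesis pair with its gold label (hypothetical type)."""
    premise: str
    hypothesis: str
    label: Label


# The rows of the example table, copied verbatim.
EXAMPLES = [
    NLIExample(
        "A man inspects the uniform of a figure in some East Asian country.",
        "The man is sleeping.",
        Label.CONTRADICTION,
    ),
    NLIExample(
        "An older and younger man smiling.",
        "Two men are smiling and laughing at the cats playing on the floor.",
        Label.NEUTRAL,
    ),
    NLIExample(
        "A soccer game with multiple males playing.",
        "Some men are playing a sport.",
        Label.ENTAILMENT,
    ),
]


def describe(example: NLIExample) -> str:
    """Render one example as a one-line summary: label, premise, hypothesis."""
    return f"{example.label.value}: {example.premise!r} -> {example.hypothesis!r}"


if __name__ == "__main__":
    for ex in EXAMPLES:
        print(describe(ex))
```

A model for the task would then be any function that maps a premise and a hypothesis to one of the three `Label` values.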

Approaches to NLI range from earlier symbolic and statistical methods to more recent deep learning models. Benchmark datasets for NLI include SNLI, MultiNLI, and SciTail, among others. You can get hands-on practice on the SNLI task by following this d2l.ai chapter.

Papers



Title	Date	Tasks	Status
Emulating the Human Mind: A Neural-symbolic Link Prediction Model with Fast and Slow Reasoning and Filtered Rules	Oct 21, 2023	Knowledge Graphs, Link Prediction	Unverified
Explaining Interactions Between Text Spans	Oct 20, 2023	Community Detection, Decision Making	Code Available
Ecologically Valid Explanations for Label Variation in NLI	Oct 20, 2023	Natural Language Inference, valid	Code Available
Eliminating Reasoning via Inferring with Planning: A New Framework to Guide LLMs' Non-linear Thinking	Oct 18, 2023	Natural Language Inference	Unverified
Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing	Oct 18, 2023	Decoder, Natural Language Inference	Code Available
Large Language Models can Contrastively Refine their Generation for Better Sentence Representation Learning	Oct 17, 2023	Contrastive Learning, Natural Language Inference	Code Available
Experimenting AI Technologies for Disinformation Combat: the IDMO Project	Oct 17, 2023	Natural Language Inference	Unverified
Calibrating Likelihoods towards Consistency in Summarization Models	Oct 12, 2023	Abstractive Text Summarization, Natural Language Inference	Unverified
Unlock the Potential of Counterfactually-Augmented Data in Out-Of-Distribution Generalization	Oct 10, 2023	Attribute, Natural Language Inference	Unverified
A Formalism and Approach for Improving Robustness of Large Language Models Using Risk-Adjusted Confidence Scores	Oct 5, 2023	Natural Language Inference	Unverified
Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification	Sep 24, 2023	Language Modeling, Language Modelling	Code Available
BenLLMEval: A Comprehensive Evaluation into the Potentials and Pitfalls of Large Language Models on Bengali NLP	Sep 22, 2023	Abstractive Text Summarization, Natural Language Inference	Unverified
Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels	Sep 18, 2023	All, Natural Language Inference	Code Available
SplitEE: Early Exit in Deep Neural Networks with Split Computing	Sep 17, 2023	Natural Language Inference, Paraphrase Identification	Code Available
X-PARADE: Cross-Lingual Textual Entailment and Information Divergence across Paragraphs	Sep 16, 2023	Fact Checking, Machine Translation	Code Available
Rethinking STS and NLI in Large Language Models	Sep 16, 2023	Natural Language Inference, Semantic Textual Similarity	Unverified
Self-Consistent Narrative Prompts on Abductive Natural Language Inference	Sep 15, 2023	Language Modeling, Language Modelling	Code Available
Comparative Analysis of Contextual Relation Extraction based on Deep Learning Models	Sep 13, 2023	Deep Learning, Natural Language Inference	Unverified
OYXOY: A Modern NLP Test Suite for Modern Greek	Sep 13, 2023	Natural Language Inference, Word Sense Disambiguation	Code Available
Black-Box Analysis: GPTs Across Time in Legal Textual Entailment Task	Sep 11, 2023	Natural Language Inference	Unverified
EPA: Easy Prompt Augmentation on Large Language Models via Multiple Sources and Multiple Targets	Sep 9, 2023	In-Context Learning, Machine Translation	Unverified
LanSER: Language-Model Supported Speech Emotion Recognition	Sep 7, 2023	Automatic Speech Recognition, Emotion Recognition	Unverified
A deep Natural Language Inference predictor without language-specific training data	Sep 6, 2023	Aspect-Based Sentiment Analysis, Knowledge Distillation	Unverified
Exploiting Language Models as a Source of Knowledge for Cognitive Agents	Sep 5, 2023	Natural Language Inference, Question Answering	Unverified
BatchPrompt: Accomplish more with less	Sep 1, 2023	8k, Language Modelling	Code Available
Link Prediction for Wikipedia Articles as a Natural Language Inference Task	Aug 31, 2023	Articles, Link Prediction	Code Available
Lightweight Adaptation of Neural Language Models via Subspace Embedding	Aug 16, 2023	Diversity, Natural Language Inference	Code Available
Leveraging Codebook Knowledge with NLI and ChatGPT for Zero-Shot Political Relation Classification	Aug 15, 2023	Classification, Natural Language Inference	Code Available
Towards Controllable Natural Language Inference through Lexical Inference Types	Aug 7, 2023	Abstract Meaning Representation, Natural Language Inference	Unverified
Improving Domain-Specific Retrieval by NLI Fine-Tuning	Aug 6, 2023	Information Retrieval, Natural Language Inference	Unverified
An Overview Of Temporal Commonsense Reasoning and Acquisition	Jul 28, 2023	Common Sense Reasoning, Language Modelling	Unverified
ARC-NLP at PAN 2023: Transition-Focused Natural Language Inference for Writing Style Detection	Jul 27, 2023	ARC, Natural Language Inference	Unverified
Improving Natural Language Inference in Arabic using Transformer Models and Linguistically Informed Pre-Training	Jul 27, 2023	named-entity-recognition, Named Entity Recognition	Code Available
Is Prompt-Based Finetuning Always Better than Vanilla Finetuning? Insights from Cross-Lingual Language Understanding	Jul 15, 2023	Cross-Lingual Transfer, Natural Language Inference	Code Available
Improving Zero-shot Relation Classification via Automatically-acquired Entailment Templates	Jul 13, 2023	Natural Language Inference, Relation	Unverified
Synthetic Dataset for Evaluating Complex Compositional Knowledge for Natural Language Inference	Jul 11, 2023	Natural Language Inference, Negation	Code Available
LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias	Jul 6, 2023	Data Augmentation, Natural Language Inference	Code Available
NatLogAttack: A Framework for Attacking Natural Language Inference Models with Natural Logic	Jul 6, 2023	Natural Language Inference	Unverified
SpaceNLI: Evaluating the Consistency of Predicting Inferences in Space	Jul 5, 2023	Natural Language Inference, Negation	Code Available
Evaluating Paraphrastic Robustness in Textual Entailment Models	Jun 29, 2023	Natural Language Inference, RTE	Unverified
Jamp: Controlled Japanese Temporal Inference Dataset for Evaluating Generalization Capacity of Language Models	Jun 19, 2023	Natural Language Inference	Code Available
No Strong Feelings One Way or Another: Re-operationalizing Neutrality in Natural Language Inference	Jun 16, 2023	Natural Language Inference	Unverified
Pushing the Limits of ChatGPT on NLP Tasks	Jun 16, 2023	Dependency Parsing, Event Extraction	Unverified
Neural models for Factual Inconsistency Classification with Explanations	Jun 15, 2023	8k, Classification	Code Available
FLamE: Few-shot Learning from Natural Language Explanations	Jun 13, 2023	Classification, Few-Shot Learning	Unverified
NOWJ at COLIEE 2023 -- Multi-Task and Ensemble Approaches in Legal Information Processing	Jun 8, 2023	Multi-Task Learning, Natural Language Inference	Unverified
Analysis of the Fed's communication by using textual entailment model of Zero-Shot classification	Jun 7, 2023	Natural Language Inference, Sentiment Analysis	Unverified
PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts	Jun 7, 2023	Cross-Lingual Paraphrase Identification, Machine Translation	Unverified
Can current NLI systems handle German word order? Investigating language model performance on a new German challenge set of minimal pairs	Jun 7, 2023	Data Augmentation, Language Modeling	Code Available
Evaluating the Effectiveness of Natural Language Inference for Hate Speech Detection in Languages with Limited Labeled Data	Jun 6, 2023	Hate Speech Detection, Natural Language Inference	Code Available


Datasets: SNLI, RTE, MultiNLI, QNLI, ANLI test, WNLI, LiDiRus, RCB, TERRa, CommitmentBank, SciTail, FarsTail

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	UnitedSynT5 (3B)	% Test Accuracy	94.7	—	Unverified
2	UnitedSynT5 (335M)	% Test Accuracy	93.5	—	Unverified
3	EFL (Entailment as Few-shot Learner) + RoBERTa-large	% Test Accuracy	93.1	—	Unverified
4	Neural Tree Indexers for Text Understanding	% Test Accuracy	93.1	—	Unverified
5	RoBERTa-large + self-explaining layer	% Test Accuracy	92.3	—	Unverified
6	RoBERTa-large+Self-Explaining	% Test Accuracy	92.3	—	Unverified
7	CA-MTL	% Test Accuracy	92.1	—	Unverified
8	SemBERT	% Test Accuracy	91.9	—	Unverified
9	MT-DNN-SMARTLARGEv0	% Test Accuracy	91.7	—	Unverified
10	MT-DNN-SMART_100%ofTrainingData	Dev Accuracy	91.6	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Vega v2 6B (KD-based prompt transfer)	Accuracy	96	—	Unverified
2	PaLM 540B (fine-tuned)	Accuracy	95.7	—	Unverified
3	Turing NLR v5 XXL 5.4B (fine-tuned)	Accuracy	94.1	—	Unverified
4	ST-MoE-32B 269B (fine-tuned)	Accuracy	93.5	—	Unverified
5	DeBERTa-1.5B	Accuracy	93.2	—	Unverified
6	MUPPET Roberta Large	Accuracy	92.8	—	Unverified
7	DeBERTaV3large	Accuracy	92.7	—	Unverified
8	T5-XXL 11B (fine-tuned)	Accuracy	92.5	—	Unverified
9	T5-XXL 11B	Accuracy	92.5	—	Unverified
10	UL2 20B (fine-tuned)	Accuracy	92.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	UnitedSynT5 (3B)	Matched	92.6	—	Unverified
2	Turing NLR v5 XXL 5.4B (fine-tuned)	Matched	92.6	—	Unverified
3	T5-XXL 11B (fine-tuned)	Matched	92	—	Unverified
4	T5	Matched	92	—	Unverified
5	T5-11B	Mismatched	91.7	—	Unverified
6	T5-3B	Matched	91.4	—	Unverified
7	ALBERT	Matched	91.3	—	Unverified
8	Adv-RoBERTa ensemble	Matched	91.1	—	Unverified
9	DeBERTa (large)	Matched	91.1	—	Unverified
10	SMARTRoBERTa	Dev Matched	91.1	—	Unverified
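
The accuracy figures reported in the tables are percentages of examples whose predicted label matches the gold label. A minimal sketch of that computation follows; the function name `accuracy_percent` and the toy label lists are hypothetical and are not taken from any leaderboard submission.

```python
from typing import Sequence


def accuracy_percent(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Return the percentage of positions where predicted equals gold."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct / len(gold)


# Toy data (made up for illustration): three of four predictions are correct.
gold = ["contradiction", "neutral", "entailment", "entailment"]
predicted = ["contradiction", "entailment", "entailment", "entailment"]

if __name__ == "__main__":
    print(accuracy_percent(gold, predicted))  # 75.0
```

Scores are only comparable when they are computed on the same split, so rows whose metric names a different split (for example "Dev Accuracy" next to "% Test Accuracy") should be read with that in mind.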