Natural Language Inference
Natural language inference (NLI) is the task of determining whether a "hypothesis" is true (entailment), false (contradiction), or undetermined (neutral) given a "premise".
Example:
| Premise | Label | Hypothesis |
| --- | --- | --- |
| A man inspects the uniform of a figure in some East Asian country. | contradiction | The man is sleeping. |
| An older and younger man smiling. | neutral | Two men are smiling and laughing at the cats playing on the floor. |
| A soccer game with multiple males playing. | entailment | Some men are playing a sport. |
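To make the three-way label scheme concrete, here is a deliberately naive lexical-overlap baseline. The function name and thresholds are illustrative inventions, not any published method; the example also shows why surface word overlap is insufficient for NLI:

```python
def overlap_baseline(premise, hypothesis, entail_t=0.8, neutral_t=0.4):
    """Toy NLI classifier: label by the fraction of hypothesis words
    that also appear in the premise. Thresholds are arbitrary."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    overlap = len(p & h) / len(h)
    if overlap >= entail_t:
        return "entailment"
    if overlap >= neutral_t:
        return "neutral"
    return "contradiction"

# On the first example above, the true label is "contradiction",
# but the two sentences share words ("the", "man"), so this toy
# heuristic mislabels it — real systems need semantic understanding.
print(overlap_baseline("A man inspects the uniform.", "The man is sleeping."))
```

This prints `neutral` for a pair a human labels `contradiction`, which is exactly the failure mode that motivates the neural approaches below.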
Approaches to NLI range from earlier symbolic and statistical methods to more recent deep learning models. Benchmark datasets include SNLI, MultiNLI, and SciTail, among others. You can get hands-on practice on the SNLI task by following this d2l.ai chapter.
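Most modern deep learning approaches feed the premise and hypothesis jointly to a transformer as a sentence pair. A minimal sketch of the conventional BERT-style input layout (the special-token names follow BERT; other model families use different tokens, and in practice a tokenizer builds this for you):

```python
def format_nli_pair(premise, hypothesis, cls="[CLS]", sep="[SEP]"):
    """BERT-style sentence-pair input: the encoder attends across both
    segments, and a classifier head over the [CLS] position predicts
    entailment / neutral / contradiction."""
    return f"{cls} {premise} {sep} {hypothesis} {sep}"

print(format_nli_pair("Some men are playing.", "Men play a sport."))
```

This cross-encoding of both sentences in one forward pass is what distinguishes pair-classification setups from encoding each sentence independently.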
Benchmark Results
SNLI

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UnitedSynT5 (3B) | % Test Accuracy | 94.7 | — | Unverified |
| 2 | UnitedSynT5 (335M) | % Test Accuracy | 93.5 | — | Unverified |
| 3 | EFL (Entailment as Few-shot Learner) + RoBERTa-large | % Test Accuracy | 93.1 | — | Unverified |
| 4 | Neural Tree Indexers for Text Understanding | % Test Accuracy | 93.1 | — | Unverified |
| 5 | RoBERTa-large+Self-Explaining | % Test Accuracy | 92.3 | — | Unverified |
| 6 | RoBERTa-large + self-explaining layer | % Test Accuracy | 92.3 | — | Unverified |
| 7 | CA-MTL | % Test Accuracy | 92.1 | — | Unverified |
| 8 | SemBERT | % Test Accuracy | 91.9 | — | Unverified |
| 9 | MT-DNN-SMARTLARGEv0 | % Test Accuracy | 91.7 | — | Unverified |
| 10 | MT-DNN-SMART_100%ofTrainingData | Dev Accuracy | 91.6 | — | Unverified |
RTE

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Vega v2 6B (KD-based prompt transfer) | Accuracy | 96 | — | Unverified |
| 2 | PaLM 540B (fine-tuned) | Accuracy | 95.7 | — | Unverified |
| 3 | Turing NLR v5 XXL 5.4B (fine-tuned) | Accuracy | 94.1 | — | Unverified |
| 4 | ST-MoE-32B 269B (fine-tuned) | Accuracy | 93.5 | — | Unverified |
| 5 | DeBERTa-1.5B | Accuracy | 93.2 | — | Unverified |
| 6 | MUPPET Roberta Large | Accuracy | 92.8 | — | Unverified |
| 7 | DeBERTaV3large | Accuracy | 92.7 | — | Unverified |
| 8 | T5-XXL 11B | Accuracy | 92.5 | — | Unverified |
| 9 | T5-XXL 11B (fine-tuned) | Accuracy | 92.5 | — | Unverified |
| 10 | ST-MoE-L 4.1B (fine-tuned) | Accuracy | 92.1 | — | Unverified |
MultiNLI

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UnitedSynT5 (3B) | Matched | 92.6 | — | Unverified |
| 2 | Turing NLR v5 XXL 5.4B (fine-tuned) | Matched | 92.6 | — | Unverified |
| 3 | T5-XXL 11B (fine-tuned) | Matched | 92 | — | Unverified |
| 4 | T5 | Matched | 92 | — | Unverified |
| 5 | T5-11B | Mismatched | 91.7 | — | Unverified |
| 6 | T5-3B | Matched | 91.4 | — | Unverified |
| 7 | ALBERT | Matched | 91.3 | — | Unverified |
| 8 | DeBERTa (large) | Matched | 91.1 | — | Unverified |
| 9 | Adv-RoBERTa ensemble | Matched | 91.1 | — | Unverified |
| 10 | SMARTRoBERTa | Dev Matched | 91.1 | — | Unverified |