Natural Language Inference
Natural language inference (NLI) is the task of determining whether a "hypothesis" is true (entailment), false (contradiction), or undetermined (neutral) given a "premise".
Example:
| Premise | Label | Hypothesis |
| --- | --- | --- |
| A man inspects the uniform of a figure in some East Asian country. | contradiction | The man is sleeping. |
| An older and younger man smiling. | neutral | Two men are smiling and laughing at the cats playing on the floor. |
| A soccer game with multiple males playing. | entailment | Some men are playing a sport. |
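The three-way label scheme above can be illustrated with a deliberately crude lexical baseline. This is a toy sketch for intuition only (the `toy_nli` function, negation list, and overlap threshold are all invented here): real NLI systems are trained models, and heuristics like this are easily fooled.

```python
# Toy NLI heuristic (illustrative only; not how benchmark systems work).
# Labels follow the standard three-way scheme:
# entailment / contradiction / neutral.

NEGATIONS = {"not", "no", "never", "nobody", "nothing"}  # tiny, incomplete list

def toy_nli(premise: str, hypothesis: str) -> str:
    """Classify a premise/hypothesis pair with a crude lexical heuristic."""
    p = {w.strip(".,") for w in premise.lower().split()}
    h = {w.strip(".,") for w in hypothesis.lower().split()}
    # A negation word in exactly one of the two sentences hints at contradiction.
    if bool(p & NEGATIONS) != bool(h & NEGATIONS):
        return "contradiction"
    # High word overlap with the premise hints at entailment; otherwise neutral.
    overlap = len(p & h) / max(len(h), 1)
    return "entailment" if overlap >= 0.7 else "neutral"
```

A heuristic of this kind fails on the first example in the table (it sees no negation and low overlap, so it predicts neutral rather than contradiction), which is exactly why NLI requires semantic understanding rather than surface matching.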
Approaches to NLI range from earlier symbolic and statistical methods to more recent deep learning models. Benchmark datasets include SNLI, MultiNLI, and SciTail, among others. You can get hands-on practice with the SNLI task by following this d2l.ai chapter.
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Human Benchmark | Average F1 | 0.68 | — | Unverified |
| 2 | RuBERT conversational | Average F1 | 0.45 | — | Unverified |
| 3 | RuGPT3Large | Average F1 | 0.42 | — | Unverified |
| 4 | YaLM 1.0B few-shot | Average F1 | 0.41 | — | Unverified |
| 5 | Golden Transformer | Average F1 | 0.41 | — | Unverified |
| 6 | heuristic majority | Average F1 | 0.40 | — | Unverified |
| 7 | RuGPT3Medium | Average F1 | 0.37 | — | Unverified |
| 8 | SBERT_Large | Average F1 | 0.37 | — | Unverified |
| 9 | RuBERT plain | Average F1 | 0.37 | — | Unverified |
| 10 | Multilingual BERT | Average F1 | 0.37 | — | Unverified |
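The metric in the table can be computed as an unweighted mean of per-class F1 scores over the three NLI labels. The sketch below assumes "Average F1" means macro-averaged F1 (the source does not define it precisely); the function name and label set are illustrative.

```python
# Macro-averaged F1 over the three NLI classes, written out explicitly.
# Assumption: "Average F1" in the leaderboard is an unweighted (macro) mean
# of per-class F1 scores.

def macro_f1(gold, pred, labels=("entailment", "contradiction", "neutral")):
    """Return the unweighted mean of per-class F1 over `labels`."""
    scores = []
    for lab in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == lab and p == lab)
        fp = sum(1 for g, p in zip(gold, pred) if g != lab and p == lab)
        fn = sum(1 for g, p in zip(gold, pred) if g == lab and p != lab)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return sum(scores) / len(scores)
```

Macro averaging weights all three classes equally regardless of their frequency, which is why a majority-class heuristic (row 6) still scores well below a perfect 1.0 even when one label dominates the data.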