Word Sense Disambiguation
The task of Word Sense Disambiguation (WSD) is to associate a word in context with its most suitable entry in a predefined sense inventory. The de facto sense inventory for English WSD is WordNet. For example, given the word “mouse” and the following sentence:
“A mouse consists of an object held in one's hand, with one or more buttons.”
we would assign “mouse” its electronic-device sense (the 4th sense in the WordNet sense inventory).
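In the simplest setting, this mapping can be made by comparing the context against each candidate sense's dictionary gloss. The sketch below implements the classic Lesk gloss-overlap heuristic over an invented two-sense toy inventory (the sense keys and glosses are illustrative assumptions; a real system would use WordNet's glosses):

```python
import re

# Toy sense inventory for "mouse" (invented keys and glosses, loosely in
# WordNet style); a real implementation would read glosses from WordNet.
SENSES = {
    "mouse.n.animal": "any of numerous small rodents with a pointed snout and long tail",
    "mouse.n.device": "a hand-operated electronic device with buttons, held in the hand, that controls a cursor",
}

def tokens(text):
    """Lowercase and split on non-letters so punctuation doesn't block matches."""
    return set(re.findall(r"[a-z']+", text.lower()))

def lesk(senses, context):
    """Simplified Lesk: pick the sense whose gloss shares the most words with the context."""
    ctx = tokens(context)
    return max(senses, key=lambda s: len(ctx & tokens(senses[s])))

sentence = "A mouse consists of an object held in one's hand, with one or more buttons."
print(lesk(SENSES, sentence))  # the device gloss wins on word overlap
```

Here the device gloss shares “held”, “hand”, “buttons”, etc. with the context, so the electronic-device sense is selected, matching the assignment described above.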
Datasets: Words in Context (WiC); supervised: RUSSE, SemEval 2013 Task 12, Senseval-2, Senseval-3 Task 1, SemEval 2007 Task 7, SemEval 2007 Task 17, FEWS, WiC-TSV, BIG-bench (Anachronisms); knowledge-based:
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | COSINE + Transductive Learning | Accuracy | 85.3 | — | Unverified |
| 2 | PaLM 540B (finetuned) | Accuracy | 78.8 | — | Unverified |
| 3 | ST-MoE-32B 269B (fine-tuned) | Accuracy | 77.7 | — | Unverified |
| 4 | DeBERTa-Ensemble | Accuracy | 77.5 | — | Unverified |
| 5 | Vega v2 6B (fine-tuned) | Accuracy | 77.4 | — | Unverified |
| 6 | UL2 20B (fine-tuned) | Accuracy | 77.3 | — | Unverified |
| 7 | Turing NLR v5 XXL 5.4B (fine-tuned) | Accuracy | 77.1 | — | Unverified |
| 8 | T5-XXL 11B | Accuracy | 76.9 | — | Unverified |
| 9 | DeBERTa-1.5B | Accuracy | 76.4 | — | Unverified |
| 10 | ST-MoE-L 4.1B (fine-tuned) | Accuracy | 74.0 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SANDWiCH | F1 (Senseval-2) | 87.8 | — | Unverified |
| 2 | GlossGPT | F1 (Senseval-2) | 86.1 | — | Unverified |
| 3 | ConSeC+WNGC | F1 (Senseval-2) | 82.7 | — | Unverified |
| 4 | ESR+WNGC | F1 (Senseval-2) | 82.5 | — | Unverified |
| 5 | ConSeC | F1 (Senseval-2) | 82.3 | — | Unverified |
| 6 | ESCHER SemCor | F1 (Senseval-2) | 81.7 | — | Unverified |
| 7 | ESR | F1 (Senseval-2) | 81.3 | — | Unverified |
| 8 | EWISER+WNGC | F1 (Senseval-2) | 80.8 | — | Unverified |
| 9 | SemCor+WNGC, hypernyms | F1 (Senseval-2) | 79.7 | — | Unverified |
| 10 | SparseLMMS+WNGC | F1 (Senseval-2) | 79.6 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Human Benchmark | Accuracy | 0.81 | — | Unverified |
| 2 | ruT5-large-finetune | Accuracy | 0.74 | — | Unverified |
| 3 | RuBERT conversational | Accuracy | 0.73 | — | Unverified |
| 4 | RuBERT plain | Accuracy | 0.73 | — | Unverified |
| 5 | ruRoberta-large finetune | Accuracy | 0.72 | — | Unverified |
| 6 | ruBert-base finetune | Accuracy | 0.71 | — | Unverified |
| 7 | Multilingual BERT | Accuracy | 0.69 | — | Unverified |
| 8 | ruT5-base-finetune | Accuracy | 0.68 | — | Unverified |
| 9 | ruBert-large finetune | Accuracy | 0.68 | — | Unverified |
| 10 | SBERT_Large_mt_ru_finetuning | Accuracy | 0.66 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SemCor+WNGC, hypernyms | F1 | 78.7 | — | Unverified |
| 2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 72.63 | — | Unverified |
| 3 | LSTMLP (T:SemCor, U:1K) | F1 | 69.5 | — | Unverified |
| 4 | LSTMLP (T:OMSTI, U:1K) | F1 | 68.1 | — | Unverified |
| 5 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 67.9 | — | Unverified |
| 6 | LSTM (T:OMSTI) | F1 | 67.3 | — | Unverified |
| 7 | GASext (Concatenation) | F1 | 67.2 | — | Unverified |
| 8 | GASext (Linear) | F1 | 67.1 | — | Unverified |
| 9 | GAS (Concatenation) | F1 | 67.0 | — | Unverified |
| 10 | LSTM (T:SemCor) | F1 | 67.0 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SemCor+WNGC, hypernyms | F1 | 79.7 | — | Unverified |
| 2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 75.15 | — | Unverified |
| 3 | LSTMLP (T:OMSTI, U:1K) | F1 | 74.4 | — | Unverified |
| 4 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 73.9 | — | Unverified |
| 5 | LSTMLP (T:SemCor, U:1K) | F1 | 73.8 | — | Unverified |
| 6 | LSTM (T:SemCor) | F1 | 73.6 | — | Unverified |
| 7 | GASext (Linear) | F1 | 72.4 | — | Unverified |
| 8 | LSTM (T:OMSTI) | F1 | 72.4 | — | Unverified |
| 9 | GASext (Concatenation) | F1 | 72.2 | — | Unverified |
| 10 | GAS (Concatenation) | F1 | 72.1 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SemCor+WNGC, hypernyms | F1 | 77.8 | — | Unverified |
| 2 | LSTMLP (T:SemCor, U:1K) | F1 | 71.8 | — | Unverified |
| 3 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 71.1 | — | Unverified |
| 4 | LSTMLP (T:OMSTI, U:1K) | F1 | 71.0 | — | Unverified |
| 5 | GASext (Concatenation) | F1 | 70.5 | — | Unverified |
| 6 | GAS (Concatenation) | F1 | 70.2 | — | Unverified |
| 7 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 70.11 | — | Unverified |
| 8 | GASext (Linear) | F1 | 70.1 | — | Unverified |
| 9 | GAS (Linear) | F1 | 70.0 | — | Unverified |
| 10 | LSTM (T:SemCor) | F1 | 69.2 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SemCor+WNGC, hypernyms | F1 | 90.4 | — | Unverified |
| 2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 86.02 | — | Unverified |
| 3 | kNN-BERT + POS (training corpus: WNGT) | F1 | 85.32 | — | Unverified |
| 4 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 84.3 | — | Unverified |
| 5 | LSTMLP (T:SemCor, U:1K) | F1 | 83.6 | — | Unverified |
| 6 | LSTMLP (T:OMSTI, U:1K) | F1 | 83.3 | — | Unverified |
| 7 | LSTM (T:SemCor) | F1 | 82.8 | — | Unverified |
| 8 | ShotgunWSD 2.0 | F1 | 81.22 | — | Unverified |
| 9 | kNN-BERT | F1 | 81.2 | — | Unverified |
| 10 | LSTM (T:OMSTI) | F1 | 81.1 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SemCor+WNGC, hypernyms | F1 | 73.4 | — | Unverified |
| 2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 66.81 | — | Unverified |
| 3 | LSTM (T:SemCor) | F1 | 64.2 | — | Unverified |
| 4 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 63.7 | — | Unverified |
| 5 | LSTMLP (T:SemCor, U:1K) | F1 | 63.5 | — | Unverified |
| 6 | LSTMLP (T:OMSTI, U:1K) | F1 | 63.3 | — | Unverified |
| 7 | kNN-BERT + POS (training corpus: SemCor) | F1 | 63.17 | — | Unverified |
| 8 | kNN-BERT | F1 | 60.94 | — | Unverified |
| 9 | LSTM (T:OMSTI) | F1 | 60.7 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GlossGPT | F1 (zero-shot dev) | 81.8 | — | Unverified |
| 2 | ESR large | F1 (zero-shot dev) | 77.4 | — | Unverified |
| 3 | ESR base | F1 (zero-shot dev) | 73.9 | — | Unverified |
| 4 | SEMEq large | F1 (zero-shot dev) | 73.7 | — | Unverified |
| 5 | SEMEq base | F1 (zero-shot dev) | 71.5 | — | Unverified |
| 6 | RTWE large | F1 (zero-shot test) | 69.9 | — | Unverified |
| 7 | Lesk | F1 (zero-shot dev) | 40.1 | — | Unverified |
| 8 | MFS | F1 (zero-shot dev) | 0 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Human | Task 3 Accuracy: all | 85.3 | — | Unverified |
| 2 | transformers | Task 1 Accuracy: all | 77.8 | — | Unverified |
| 3 | CTLR | Task 1 Accuracy: all | 76.8 | — | Unverified |
| 4 | GlossBert-ws | Task 1 Accuracy: all | 75.9 | — | Unverified |
| 5 | Bert-base | Task 1 Accuracy: all | 75.3 | — | Unverified |
| 6 | Unsupervised Bert | Task 1 Accuracy: all | 54.4 | — | Unverified |
| 7 | FastText | Task 1 Accuracy: all | 53.7 | — | Unverified |
| 8 | All true | Task 1 Accuracy: all | 50.8 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 69.1 | — | Unverified |
| 2 | Gopher-280B (few-shot, k=5) | Accuracy | 56.4 | — | Unverified |
| 3 | OPT 175B | Accuracy | 49.1 | — | Unverified |
| 4 | GAL 120B (few-shot, k=5) | Accuracy | 48.7 | — | Unverified |
| 5 | GAL 30B (few-shot, k=5) | Accuracy | 47.0 | — | Unverified |
| 6 | BLOOM 176B | Accuracy | 1.3 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UKBppr_w2w | F1 (Senseval-2) | 68.8 | — | Unverified |
| 2 | KEF | F1 (All) | 68.0 | — | Unverified |
| 3 | WSD-TM | F1 (All) | 66.9 | — | Unverified |
| 4 | Babelfy | F1 (All) | 65.5 | — | Unverified |
| 5 | WN 1st sense baseline | F1 (All) | 65.2 | — | Unverified |
| 6 | UKBppr_w2w-nf | F1 (All) | 57.5 | — | Unverified |
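The “WN 1st sense baseline” above is the standard most-frequent-sense (MFS) heuristic: ignore the context entirely and always return the first sense WordNet lists for the word, since WordNet orders senses roughly by corpus frequency. A minimal sketch, with an invented toy inventory standing in for WordNet:

```python
# Most-frequent-sense (MFS) baseline: always return the first-listed sense.
# Toy inventory with invented sense keys; WordNet lists a word's senses in
# roughly descending frequency order, so index 0 is the most frequent sense.
INVENTORY = {
    "mouse": ["mouse.n.01 (rodent)", "mouse.n.04 (pointing device)"],
    "bank":  ["bank.n.01 (financial institution)", "bank.n.02 (river bank)"],
}

def mfs_baseline(word, context=None):
    """Context is accepted but deliberately ignored."""
    return INVENTORY[word][0]

print(mfs_baseline("mouse", "click the left mouse button"))
```

Despite ignoring context entirely, this heuristic is a strong baseline, as the table above shows: it scores close to Babelfy and above some knowledge-based systems.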
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SemCor+WNGC, hypernyms | F1 | 82.6 | — | Unverified |
| 2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 74.46 | — | Unverified |
| 3 | GASext (Concatenation) | F1 | 72.6 | — | Unverified |
| 4 | GASext (Linear) | F1 | 72.1 | — | Unverified |
| 5 | GAS (Concatenation) | F1 | 71.8 | — | Unverified |
| 6 | GAS (Linear) | F1 | 71.6 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | kNN-BERT | F1 | 80.12 | — | Unverified |
| 2 | IMS + adapted CW | F1 | 73.4 | — | Unverified |
| 3 | BiLSTM with GloVe | F1 | 73.4 | — | Unverified |
| 4 | Single BiLSTM | F1 | 72.5 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | kNN-BERT | F1 | 76.52 | — | Unverified |
| 2 | BiLSTM with GloVe | F1 | 66.9 | — | Unverified |
| 3 | IMS + adapted CW | F1 | 66.2 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SPIN | Sequence Recovery % (All) | 30.3 | — | Unverified |