SOTAVerified

Word Sense Disambiguation

The task of Word Sense Disambiguation (WSD) consists of associating words in context with their most suitable entry in a pre-defined sense inventory. The de facto sense inventory for English in WSD is WordNet. For example, given the word “mouse” and the following sentence:

“A mouse consists of an object held in one's hand, with one or more buttons.”

we would assign “mouse” its electronic device sense (the 4th sense in the WordNet sense inventory).
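The mapping from a word in context to an inventory entry can be sketched with a simplified Lesk-style gloss-overlap heuristic, a classic baseline rather than the method of any system listed on this page. The sketch assumes a hypothetical two-sense toy inventory: `TOY_INVENTORY`, its sense ids, and its glosses are illustrative and not copied from WordNet, and the stopword list is likewise an assumption.

```python
import re

# Hypothetical toy inventory: sense ids and glosses are illustrative only,
# they are NOT taken from WordNet.
TOY_INVENTORY = {
    "mouse": {
        "mouse.animal": "small rodent with a pointed snout and a long thin tail",
        "mouse.device": "hand operated electronic device with buttons "
                        "that controls a cursor on a computer screen",
    }
}

# Small assumed stopword list so that function words do not count as overlap.
STOPWORDS = {"a", "an", "and", "in", "of", "on", "one", "or", "that", "the", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def disambiguate(word, sentence, inventory=TOY_INVENTORY):
    """Return the sense id whose gloss shares the most tokens with the sentence."""
    context = tokenize(sentence) - {word}
    scores = {
        sense: len(context & tokenize(gloss))
        for sense, gloss in inventory[word].items()
    }
    # On a tie, max() keeps the first sense in inventory order.
    return max(scores, key=scores.get)


sentence = "A mouse consists of an object held in one's hand, with one or more buttons."
print(disambiguate("mouse", sentence))  # prints "mouse.device"
```

Here the device gloss wins because it shares "hand" and "buttons" with the sentence, while the animal gloss shares nothing. Real systems replace the toy inventory with WordNet and the overlap count with learned representations.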

Papers

Showing 1-10 of 1035 papers

Title | Status | Hype
Semantic similarity estimation for domain specific data using BERT and other techniques | | 0
On Self-improving Token Embeddings | | 0
SANDWiCH: Semantical Analysis of Neighbours for Disambiguating Words in Context ad Hoc | Code | 0
GlossGPT: GPT for Word Sense Disambiguation using Few-shot Chain-of-Thought Prompting | Code | 0
Probing Semantic Routing in Large Mixture-of-Expert Models | | 0
TreeMatch: A Fully Unsupervised WSD System Using Dependency Knowledge on a Specific Domain | | 0
Fietje: An open, efficient LLM for Dutch | Code | 2
Word Sense Linking: Disambiguating Outside the Sandbox | | 0
Can LLMs assist with Ambiguity? A Quantitative Evaluation of various Large Language Models on Word Sense Disambiguation | | 0
Astro-HEP-BERT: A bidirectional language model for studying the meanings of concepts in astrophysics and high energy physics | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | COSINE + Transductive Learning | Accuracy | 85.3 | | Unverified
2 | PaLM 540B (finetuned) | Accuracy | 78.8 | | Unverified
3 | ST-MoE-32B 269B (fine-tuned) | Accuracy | 77.7 | | Unverified
4 | DeBERTa-Ensemble | Accuracy | 77.5 | | Unverified
5 | Vega v2 6B (fine-tuned) | Accuracy | 77.4 | | Unverified
6 | UL2 20B (fine-tuned) | Accuracy | 77.3 | | Unverified
7 | Turing NLR v5 XXL 5.4B (fine-tuned) | Accuracy | 77.1 | | Unverified
8 | T5-XXL 11B | Accuracy | 76.9 | | Unverified
9 | DeBERTa-1.5B | Accuracy | 76.4 | | Unverified
10 | ST-MoE-L 4.1B (fine-tuned) | Accuracy | 74 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SANDWiCH | Senseval 2 | 87.8 | | Unverified
2 | GlossGPT | Senseval 2 | 86.1 | | Unverified
3 | ConSeC+WNGC | Senseval 2 | 82.7 | | Unverified
4 | ESR+WNGC | Senseval 2 | 82.5 | | Unverified
5 | ConSeC | Senseval 2 | 82.3 | | Unverified
6 | ESCHER SemCor | Senseval 2 | 81.7 | | Unverified
7 | ESR | Senseval 2 | 81.3 | | Unverified
8 | EWISER+WNGC | Senseval 2 | 80.8 | | Unverified
9 | SemCor+WNGC, hypernyms | Senseval 2 | 79.7 | | Unverified
10 | SparseLMMS+WNGC | Senseval 2 | 79.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Human Benchmark | Accuracy | 0.81 | | Unverified
2 | ruT5-large-finetune | Accuracy | 0.74 | | Unverified
3 | RuBERT conversational | Accuracy | 0.73 | | Unverified
4 | RuBERT plain | Accuracy | 0.73 | | Unverified
5 | ruRoberta-large finetune | Accuracy | 0.72 | | Unverified
6 | ruBert-base finetune | Accuracy | 0.71 | | Unverified
7 | Multilingual Bert | Accuracy | 0.69 | | Unverified
8 | ruT5-base-finetune | Accuracy | 0.68 | | Unverified
9 | ruBert-large finetune | Accuracy | 0.68 | | Unverified
10 | SBERT_Large_mt_ru_finetuning | Accuracy | 0.66 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SemCor+WNGC, hypernyms | F1 | 78.7 | | Unverified
2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 72.63 | | Unverified
3 | LSTMLP (T:SemCor, U:1K) | F1 | 69.5 | | Unverified
4 | LSTMLP (T:OMSTI, U:1K) | F1 | 68.1 | | Unverified
5 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 67.9 | | Unverified
6 | LSTM (T:OMSTI) | F1 | 67.3 | | Unverified
7 | GASext (Concatenation) | F1 | 67.2 | | Unverified
8 | GASext (Linear) | F1 | 67.1 | | Unverified
9 | GAS (Concatenation) | F1 | 67 | | Unverified
10 | LSTM (T:SemCor) | F1 | 67 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SemCor+WNGC, hypernyms | F1 | 79.7 | | Unverified
2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 75.15 | | Unverified
3 | LSTMLP (T:OMSTI, U:1K) | F1 | 74.4 | | Unverified
4 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 73.9 | | Unverified
5 | LSTMLP (T:SemCor, U:1K) | F1 | 73.8 | | Unverified
6 | LSTM (T:SemCor) | F1 | 73.6 | | Unverified
7 | GASext (Linear) | F1 | 72.4 | | Unverified
8 | LSTM (T:OMSTI) | F1 | 72.4 | | Unverified
9 | GASext (Concatenation) | F1 | 72.2 | | Unverified
10 | GAS (Concatenation) | F1 | 72.1 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SemCor+WNGC, hypernyms | F1 | 77.8 | | Unverified
2 | LSTMLP (T:SemCor, U:1K) | F1 | 71.8 | | Unverified
3 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 71.1 | | Unverified
4 | LSTMLP (T:OMSTI, U:1K) | F1 | 71 | | Unverified
5 | GASext (Concatenation) | F1 | 70.5 | | Unverified
6 | GAS (Concatenation) | F1 | 70.2 | | Unverified
7 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 70.11 | | Unverified
8 | GASext (Linear) | F1 | 70.1 | | Unverified
9 | GAS (Linear) | F1 | 70 | | Unverified
10 | LSTM (T:SemCor) | F1 | 69.2 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SemCor+WNGC, hypernyms | F1 | 90.4 | | Unverified
2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 86.02 | | Unverified
3 | kNN-BERT + POS (training corpus: WNGT) | F1 | 85.32 | | Unverified
4 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 84.3 | | Unverified
5 | LSTMLP (T:SemCor, U:1K) | F1 | 83.6 | | Unverified
6 | LSTMLP (T:OMSTI, U:1K) | F1 | 83.3 | | Unverified
7 | LSTM (T:SemCor) | F1 | 82.8 | | Unverified
8 | ShotgunWSD 2.0 | F1 | 81.22 | | Unverified
9 | kNN-BERT | F1 | 81.2 | | Unverified
10 | LSTM (T:OMSTI) | F1 | 81.1 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SemCor+WNGC, hypernyms | F1 | 73.4 | | Unverified
2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 66.81 | | Unverified
3 | LSTM (T:SemCor) | F1 | 64.2 | | Unverified
4 | LSTMLP (T:SemCor, U:OMSTI) | F1 | 63.7 | | Unverified
5 | LSTMLP (T:SemCor, U:1K) | F1 | 63.5 | | Unverified
6 | LSTMLP (T:OMSTI, U:1K) | F1 | 63.3 | | Unverified
7 | kNN-BERT + POS (training corpus: SemCor) | F1 | 63.17 | | Unverified
8 | kNN-BERT | F1 | 60.94 | | Unverified
9 | LSTM (T:OMSTI) | F1 | 60.7 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GlossGPT | F1 (Zeroshot Dev) | 81.8 | | Unverified
2 | ESR Large | F1 (Zeroshot Dev) | 77.4 | | Unverified
3 | ESR base | F1 (Zeroshot Dev) | 73.9 | | Unverified
4 | SEMEq Large | F1 (Zeroshot Dev) | 73.7 | | Unverified
5 | SEMeq base | F1 (Zeroshot Dev) | 71.5 | | Unverified
6 | RTWE large | F1 (Zero shot test) | 69.9 | | Unverified
7 | Lesk | F1 (Zeroshot Dev) | 40.1 | | Unverified
8 | MFS | F1 (Zeroshot Dev) | 0 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Human | Task 3 Accuracy: all | 85.3 | | Unverified
2 | transformers | Task 1 Accuracy: all | 77.8 | | Unverified
3 | CTLR | Task 1 Accuracy: all | 76.8 | | Unverified
4 | GlossBert-ws | Task 1 Accuracy: all | 75.9 | | Unverified
5 | Bert-base | Task 1 Accuracy: all | 75.3 | | Unverified
6 | Unsupervised Bert | Task 1 Accuracy: all | 54.4 | | Unverified
7 | FastText | Task 1 Accuracy: all | 53.7 | | Unverified
8 | All true | Task 1 Accuracy: all | 50.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Chinchilla-70B (few-shot, k=5) | Accuracy | 69.1 | | Unverified
2 | Gopher-280B (few-shot, k=5) | Accuracy | 56.4 | | Unverified
3 | OPT 175B | Accuracy | 49.1 | | Unverified
4 | GAL 120B (few-shot, k=5) | Accuracy | 48.7 | | Unverified
5 | GAL 30B (few-shot, k=5) | Accuracy | 47 | | Unverified
6 | BLOOM 176B | Accuracy | 1.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | UKBppr_w2w | Senseval 2 | 68.8 | | Unverified
2 | KEF | All | 68 | | Unverified
3 | WSD-TM | All | 66.9 | | Unverified
4 | Babelfy | All | 65.5 | | Unverified
5 | WN 1st sense baseline | All | 65.2 | | Unverified
6 | UKBppr_w2w-nf | All | 57.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SemCor+WNGC, hypernyms | F1 | 82.6 | | Unverified
2 | SemCor+WNGT, vocabulary reduced, ensemble | F1 | 74.46 | | Unverified
3 | GASext (Concatenation) | F1 | 72.6 | | Unverified
4 | GASext (Linear) | F1 | 72.1 | | Unverified
5 | GAS (Concatenation) | F1 | 71.8 | | Unverified
6 | GAS (Linear) | F1 | 71.6 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | kNN-BERT | F1 | 80.12 | | Unverified
2 | IMS + adapted CW | F1 | 73.4 | | Unverified
3 | BiLSTM with GloVe | F1 | 73.4 | | Unverified
4 | Single BiLSTM | F1 | 72.5 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | kNN-BERT | F1 | 76.52 | | Unverified
2 | BiLSTM with GloVe | F1 | 66.9 | | Unverified
3 | IMS + adapted CW | F1 | 66.2 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SPIN | Sequence Recovery %(All) | 30.3 | | Unverified