Word Embeddings

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) in which words or phrases from a vocabulary are mapped to vectors of real numbers.

Techniques for learning word embeddings include Word2Vec and GloVe, as well as neural network-based approaches that train on an NLP task such as language modeling or document classification.
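As a rough illustration of the mapping described above, the sketch below stores a vector of real numbers per word and compares words by cosine similarity. The three-dimensional vectors and the names `EMBEDDINGS`, `embed`, and `cosine_similarity` are hypothetical and hand-picked for this example. Embeddings produced by methods such as Word2Vec or GloVe are learned from a corpus and typically have far more dimensions.

```python
import math

# Hypothetical, hand-picked vectors for illustration only.
# Real embeddings are learned from data, not written by hand.
EMBEDDINGS = {
    "king":  [0.80, 0.65, 0.10],
    "queen": [0.75, 0.70, 0.15],
    "apple": [0.10, 0.05, 0.90],
}


def embed(word):
    """Return the vector for a word; raises KeyError for unknown words."""
    return EMBEDDINGS[word]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Words used in similar contexts should end up with similar vectors.
    print(cosine_similarity(embed("king"), embed("queen")))
    print(cosine_similarity(embed("king"), embed("apple")))
```

With these toy values, "king" and "queen" score much closer to 1 than "king" and "apple". That is the property a learned embedding aims for: words with related meanings get nearby vectors.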

(Image credit: Dynamic Word Embedding for Evolving Semantic Discovery)

Papers


Showing 1–50 of 4002 papers

Title	Date	Tasks	Status	Hype
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models	Oct 18, 2022	Language Modelling, Sentence	Code Available	8
CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models	Apr 24, 2024	Consistent Character Generation, Word Embeddings	Code Available	3
WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering	Jul 8, 2024	Diagnostic, Generative Visual Question Answering	Code Available	2
FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model	May 28, 2024	Relation, Topic Models	Code Available	2
VNLP: Turkish NLP Package	Mar 2, 2024	Morphological Analysis, named-entity-recognition	Code Available	2
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling	Oct 14, 2023	Speech Synthesis, text-to-speech	Code Available	2
RETVec: Resilient and Efficient Text Vectorizer	Feb 18, 2023	Adversarial Text, Metric Learning	Code Available	2
Contextual Semantic Embeddings for Ontology Subsumption Prediction	Feb 20, 2022	Knowledge Graph Embeddings, Language Modeling	Code Available	2
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation	Aug 27, 2021	Inductive Bias, Playing the Game of 2048	Code Available	2
A Pilot Study for Chinese SQL Semantic Parsing	Sep 29, 2019	Cross-Lingual Word Embeddings, Question Answering	Code Available	2
ConceptNet at SemEval-2017 Task 2: Extending Word Embeddings with Multilingual Relational Knowledge	Apr 11, 2017	General Knowledge, Multilingual Word Embeddings	Code Available	2
ConceptNet 5.5: An Open Multilingual Graph of General Knowledge	Dec 12, 2016	General Knowledge, Word Embeddings	Code Available	2
An Ensemble Method to Produce High-Quality Word Embeddings (2016)	Apr 6, 2016	Vocal Bursts Intensity Prediction, Word Embeddings	Code Available	2
Leveraging MLLM Embeddings and Attribute Smoothing for Compositional Zero-Shot Learning	Nov 18, 2024	Attribute, Compositional Zero-Shot Learning	Code Available	1
HJ-Ky-0.1: an Evaluation Dataset for Kyrgyz Word Embeddings	Nov 16, 2024	Sentiment Analysis, Word Embeddings	Code Available	1
Fine-Tuning CLIP's Last Visual Projector: A Few-Shot Cornucopia	Oct 7, 2024	Domain Generalization, Language Modeling	Code Available	1
Enhancing High-order Interaction Awareness in LLM-based Recommender Model	Sep 30, 2024	Knowledge Graphs, Reranking	Code Available	1
GrEmLIn: A Repository of Green Baseline Embeddings for 87 Low-Resource Languages Injected with Multilingual Graph Knowledge	Sep 26, 2024	Natural Language Inference, Sentiment Analysis	Code Available	1
DiffEditor: Enhancing Speech Editing with Semantic Enrichment and Acoustic Consistency	Sep 19, 2024	Language Modeling, Language Modelling	Code Available	1
Spiking Convolutional Neural Networks for Text Classification	Jun 27, 2024	Classification, text-classification	Code Available	1
MT2ST: Adaptive Multi-Task to Single-Task Learning	Jun 26, 2024	Multi-Task Learning, Word Embeddings	Code Available	1
Statistical Uncertainty in Word Embeddings: GloVe-V	Jun 18, 2024	Decision Making, Model Selection	Code Available	1
A Comprehensive Analysis of Static Word Embeddings for Turkish	May 13, 2024	Word Embeddings	Code Available	1
AnomalyLLM: Few-shot Anomaly Edge Detection for Dynamic Graphs using Large Language Models	May 13, 2024	Anomaly Detection, Edge Detection	Code Available	1
DiLM: Distilling Dataset into Language Model for Text-level Dataset Distillation	Mar 30, 2024	Dataset Distillation, In-Context Learning	Code Available	1
Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings	Feb 9, 2024	Machine Translation, Quantization	Code Available	1
Deep Semantic-Visual Alignment for Zero-Shot Remote Sensing Image Scene Classification	Feb 3, 2024	Attribute, image-classification	Code Available	1
Pre-training and Diagnosing Knowledge Base Completion Models	Jan 27, 2024	General Knowledge, Knowledge Base Completion	Code Available	1
Decoupled Textual Embeddings for Customized Image Generation	Dec 19, 2023	Attribute, Disentanglement	Code Available	1
Quantifying the redundancy between prosody and text	Nov 28, 2023	Word Embeddings	Code Available	1
OFA: A Framework of Initializing Unseen Subword Embeddings for Efficient Large-scale Multilingual Continued Pretraining	Nov 15, 2023	Language Modelling, Multilingual Word Embeddings	Code Available	1
MLFMF: Data Sets for Machine Learning for Mathematical Formalization	Oct 24, 2023	Benchmarking, Recommendation Systems	Code Available	1
Circumventing Concept Erasure Methods For Text-to-Image Generative Models	Aug 3, 2023	Face Swapping, Word Embeddings	Code Available	1
The Looming Threat of Fake and LLM-generated LinkedIn Profiles: Challenges and Opportunities for Detection and Prevention	Jul 21, 2023	Language Modelling, Large Language Model	Code Available	1
Meta-Personalizing Vision-Language Models to Find Named Instances in Video	Jun 16, 2023	Retrieval, Word Embeddings	Code Available	1
Towards Fair and Explainable AI using a Human-Centered AI Approach	Jun 12, 2023	Fairness, Word Embeddings	Code Available	1
Backpack Language Models	May 26, 2023	Language Modeling, Language Modelling	Code Available	1
Language Models Implement Simple Word2Vec-style Vector Arithmetic	May 25, 2023	In-Context Learning, Language Modeling	Code Available	1
MIANet: Aggregating Unbiased Instance and General Information for Few-Shot Semantic Segmentation	May 23, 2023	Few-Shot Semantic Segmentation, General Knowledge	Code Available	1
Word Embeddings Are Steers for Language Models	May 22, 2023	Language Modeling, Language Modelling	Code Available	1
PWESuite: Phonetic Word Embeddings and Tasks They Facilitate	Apr 5, 2023	Retrieval, Word Embeddings	Code Available	1
CTRAN: CNN-Transformer-based Network for Natural Language Understanding	Mar 19, 2023	Decoder, Intent Detection	Code Available	1
LANDMARK: Language-guided Representation Enhancement Framework for Scene Graph Generation	Mar 2, 2023	Graph Generation, Object	Code Available	1
SanskritShala: A Neural Sanskrit NLP Toolkit with Web-Based Interface for Pedagogical and Annotation Purposes	Feb 19, 2023	Dependency Parsing, Morphological Tagging	Code Available	1
Efficient and Flexible Topic Modeling using Pretrained Embeddings and Bag of Sentences	Feb 6, 2023	Sentence, Sentence Embeddings	Code Available	1
Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts	Dec 12, 2022	Language Modelling, Word Embeddings	Code Available	1
Learning Object-Language Alignments for Open-Vocabulary Object Detection	Nov 27, 2022	Object, object-detection	Code Available	1
ALIGN-MLM: Word Embedding Alignment is Crucial for Multilingual Pre-training	Nov 15, 2022	Cross-Lingual Transfer, POS	Code Available	1
Improving word mover's distance by leveraging self-attention matrix	Nov 11, 2022	Paraphrase Identification, Semantic Similarity	Code Available	1
ADEPT: A DEbiasing PrompT Framework	Nov 10, 2022	Attribute, Language Modelling	Code Available	1



No leaderboard results yet.