AnglE-optimized Text Embeddings

2023-09-22 · Code Available

Xianming Li, Jing Li

Abstract

High-quality text embeddings are pivotal for semantic textual similarity (STS) tasks, which are crucial components of Large Language Model (LLM) applications. However, a common challenge for existing text embedding models is vanishing gradients, primarily due to their reliance on the cosine function in the optimization objective, which has saturation zones. To address this issue, this paper proposes a novel angle-optimized text embedding model called AnglE. The core idea of AnglE is to introduce angle optimization in a complex space. This approach mitigates the adverse effects of the saturation zones of the cosine function, which can impede gradients and hinder optimization. To set up a comprehensive STS evaluation, we experimented on existing short-text STS datasets and a newly collected long-text STS dataset from GitHub Issues. Furthermore, we examine domain-specific STS scenarios with limited labeled data and explore how AnglE works with LLM-annotated data. Extensive experiments were conducted on various tasks, including short-text STS, long-text STS, and domain-specific STS. The results show that AnglE outperforms state-of-the-art (SOTA) STS models that ignore the cosine saturation zone. These findings demonstrate the ability of AnglE to generate high-quality text embeddings and the usefulness of angle optimization in STS.
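
The saturation problem described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation or loss: it only shows that the derivative of cos(theta) is close to zero when theta is near 0 or pi, and that an angle difference can be read off after pairing vector components into complex numbers. The function names and the layout (first half of a vector as real parts, second half as imaginary parts) are assumptions made for illustration.

```python
import cmath
import math


def cosine_grad_wrt_angle(theta: float) -> float:
    """Derivative of cos(theta) with respect to theta, i.e. -sin(theta)."""
    return -math.sin(theta)


def angle_difference(u: list[float], v: list[float]) -> list[float]:
    """Per-dimension angle difference (radians) between two vectors.

    Assumed layout (hypothetical, for illustration only): the first half of
    each vector holds real parts, the second half holds imaginary parts.
    """
    half = len(u) // 2
    diffs = []
    for k in range(half):
        z = complex(u[k], u[k + half])
        w = complex(v[k], v[k + half])
        # arg(z * conj(w)) equals arg(z) - arg(w), wrapped to [-pi, pi].
        diffs.append(cmath.phase(z * w.conjugate()))
    return diffs


if __name__ == "__main__":
    # Near theta = 0 and theta = pi the cosine is saturated: its slope is tiny.
    for theta in (0.01, math.pi / 2, math.pi - 0.01):
        grad = cosine_grad_wrt_angle(theta)
        print(f"theta={theta:.2f}  cos={math.cos(theta):+.4f}  grad={grad:+.4f}")

    # Angle differences for a toy 4-dimensional pair (two complex dimensions).
    print(angle_difference([1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]))
```

In this toy setting the angle difference changes at a constant rate as the angle moves, whereas the cosine barely changes inside its saturation zones, which is the intuition the abstract appeals to.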

Benchmark Results

| Dataset       | Model             | Metric               | Claimed | Verified | Status     |
|---------------|-------------------|----------------------|---------|----------|------------|
| MTEB          | AnglE-UAE         | Spearman Correlation | 84.54   |          | Unverified |
| SICK-R        | AnglE-LLaMA-7B    | Spearman Correlation | 0.81    |          | Unverified |
| STS12         | AnglE-LLaMA-13B   | Spearman Correlation | 0.79    |          | Unverified |
| STS12         | AnglE-LLaMA-7B    | Spearman Correlation | 0.79    |          | Unverified |
| STS13         | AnglE-LLaMA-7B-v2 | Spearman Correlation | 0.91    |          | Unverified |
| STS13         | AnglE-LLaMA-7B    | Spearman Correlation | 0.91    |          | Unverified |
| STS14         | AnglE-LLaMA-7B-v2 | Spearman Correlation | 0.86    |          | Unverified |
| STS14         | AnglE-LLaMA-7B    | Spearman Correlation | 0.85    |          | Unverified |
| STS14         | AnglE-LLaMA-13B   | Spearman Correlation | 0.87    |          | Unverified |
| STS15         | AnglE-LLaMA-7B-v2 | Spearman Correlation | 0.89    |          | Unverified |
| STS15         | AnglE-LLaMA-13B   | Spearman Correlation | 0.9     |          | Unverified |
| STS16         | AnglE-LLaMA-13B   | Spearman Correlation | 0.87    |          | Unverified |
| STS16         | AnglE-LLaMA-7B-v2 | Spearman Correlation | 0.87    |          | Unverified |
| STS16         | AnglE-LLaMA-7B    | Spearman Correlation | 0.87    |          | Unverified |
| STS Benchmark | AnglE-LLaMA-7B-v2 | Spearman Correlation | 0.89    |          | Unverified |
| STS Benchmark | AnglE-LLaMA-13B   | Spearman Correlation | 0.9     |          | Unverified |
| STS Benchmark | AnglE-LLaMA-7B    | Spearman Correlation | 0.89    |          | Unverified |
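
Every row above reports Spearman correlation, which compares the ranking of model similarity scores with the ranking of human similarity labels. The sketch below shows one way such a score could be computed with the Python standard library alone; the helper names and the toy numbers are hypothetical and are not taken from the paper or its evaluation code.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Toy data: predicted similarities vs. gold labels for four sentence pairs.
    predicted = [0.92, 0.15, 0.60, 0.33]
    gold = [4.8, 0.5, 3.9, 2.0]
    print(f"Spearman correlation: {spearman(predicted, gold):.2f}")
```

Because only ranks matter, the metric is insensitive to the scale of the similarity scores. The MTEB value in the table appears to be reported on a 0-100 scale, while the other rows use 0-1.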

Reproductions