Emblaze: Illuminating Machine Learning Representations through Interactive Comparison of Embedding Spaces

2022-02-05

Venkatesh Sivaraman, Yiwei Wu, Adam Perer

Code

github.com/cmudig/emblaze
Official · In paper · tf · ★ 117

Abstract

Modern machine learning techniques commonly rely on complex, high-dimensional embedding representations to capture underlying structure in the data and improve performance. In order to characterize model flaws and choose a desirable representation, model builders often need to compare across multiple embedding spaces, a challenging analytical task supported by few existing tools. We first interviewed nine embedding experts in a variety of fields to characterize the diverse challenges they face and techniques they use when analyzing embedding spaces. Informed by these perspectives, we developed a novel system called Emblaze that integrates embedding space comparison within a computational notebook environment. Emblaze uses an animated, interactive scatter plot with a novel Star Trail augmentation to enable visual comparison. It also employs novel neighborhood analysis and clustering procedures to dynamically suggest groups of points with interesting changes between spaces. Through a series of case studies with ML experts, we demonstrate how interactive comparison with Emblaze can help gain new insights into embedding space structure.
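
The idea of suggesting points whose neighborhoods change between spaces can be illustrated with a minimal sketch. This is not Emblaze's actual procedure, which the abstract does not specify. It substitutes a simple stand-in: for each point, compute the Jaccard distance between its k-nearest-neighbor sets in two embedding spaces, then rank points by that score. The function names (`knn`, `neighborhood_change`) and the toy coordinates are hypothetical and chosen only for illustration.

```python
import math


def knn(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Sort all other points by distance to p; ties are broken by index.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def neighborhood_change(space_a, space_b, k=2):
    """Jaccard distance between each point's k-NN sets in two spaces (0 = same, 1 = disjoint)."""
    changes = []
    for a, b in zip(knn(space_a, k), knn(space_b, k)):
        union = a | b
        changes.append(1.0 - len(a & b) / len(union) if union else 0.0)
    return changes


# Toy data (assumed): two clusters of three points; point 2 switches clusters in space_b.
space_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.5), (10.0, 10.0), (11.0, 10.0), (10.0, 11.5)]
space_b = [(0.0, 0.0), (1.0, 0.0), (10.4, 10.3), (10.0, 10.0), (11.0, 10.0), (10.0, 11.5)]

changes = neighborhood_change(space_a, space_b, k=2)
# Indices ordered from most to least changed neighborhood.
ranked = sorted(range(len(changes)), key=lambda i: changes[i], reverse=True)
print(ranked[0], changes[ranked[0]])
```

In this toy example the point that moves between clusters gets the highest score, since none of its original neighbors remain. A brute-force search like this scales quadratically with the number of points, so a real embedding comparison would likely rely on an approximate nearest-neighbor index instead.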

Tasks

BIG-bench Machine Learning
