Learning on Large-scale Text-attributed Graphs via Variational Inference

2022-10-26

Jianan Zhao, Meng Qu, Chaozhuo Li, Hao Yan, Qian Liu, Rui Li, Xing Xie, Jian Tang

Code

github.com/andyjzhao/glem
Official · In paper · PyTorch · ★ 134

Abstract

This paper studies learning on text-attributed graphs (TAGs), where each node is associated with a text description. An ideal solution for such a problem would be integrating both the text and graph structure information with large language models and graph neural networks (GNNs). However, the problem becomes very challenging when graphs are large, due to the high computational complexity of training large language models and GNNs together. In this paper, we propose an efficient and effective solution to learning on large text-attributed graphs by fusing graph structure and language learning with a variational Expectation-Maximization (EM) framework, called GLEM. Instead of simultaneously training large language models and GNNs on big graphs, GLEM alternately updates the two modules in the E-step and M-step. Such a procedure allows the two modules to be trained separately while still letting them interact and mutually enhance each other. Extensive experiments on multiple data sets demonstrate the efficiency and effectiveness of the proposed approach.
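The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "language model" is replaced by a nearest-centroid classifier on a single made-up text feature, the "GNN" by a neighbor majority vote, and the exchange of pseudo-labels on unlabeled nodes is an assumption about how the two modules interact (the abstract does not spell this out). Which module is updated in the E-step and which in the M-step is likewise assumed here. All names (`fit_text_model`, `fit_graph_model`, `alternate_training`) and the data are hypothetical.

```python
from collections import Counter

def fit_text_model(features, labels):
    """Stand-in for the language model: nearest-centroid on a 1-D text feature."""
    sums, counts = {}, Counter()
    for node, y in labels.items():
        sums[y] = sums.get(y, 0.0) + features[node]
        counts[y] += 1
    centroids = {y: sums[y] / counts[y] for y in sums}
    return lambda node: min(centroids, key=lambda y: abs(features[node] - centroids[y]))

def fit_graph_model(neighbors, labels):
    """Stand-in for the GNN: majority vote over the labels of a node's neighbors."""
    def predict(node):
        votes = Counter(labels[n] for n in neighbors[node] if n in labels)
        return votes.most_common(1)[0][0] if votes else labels.get(node)
    return predict

def alternate_training(features, neighbors, gold, rounds=3):
    """Train the two stand-in modules in turn; they are never fitted jointly."""
    unlabeled = [n for n in features if n not in gold]
    text_model = fit_text_model(features, gold)  # warm start on gold labels only
    graph_pseudo = {}
    for _ in range(rounds):
        # Assumed M-step: fit the graph module on gold labels plus
        # pseudo-labels predicted by the text module.
        text_pseudo = {n: text_model(n) for n in unlabeled}
        graph_model = fit_graph_model(neighbors, {**text_pseudo, **gold})
        # Assumed E-step: fit the text module on gold labels plus
        # pseudo-labels predicted by the graph module.
        graph_pseudo = {n: graph_model(n) for n in unlabeled}
        text_model = fit_text_model(features, {**graph_pseudo, **gold})
    return {n: gold.get(n, graph_pseudo.get(n)) for n in features}

# Toy data: two 4-node cliques; nodes 3 and 7 have misleading text features.
features = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.6, 4: 0.9, 5: 0.8, 6: 0.7, 7: 0.4}
neighbors = {n: [m for m in range(8) if m != n and m // 4 == n // 4] for n in range(8)}
gold = {0: "A", 4: "B"}
predictions = alternate_training(features, neighbors, gold)
```

In this toy setting the text-only classifier mislabels nodes 3 and 7, while the alternating procedure assigns every node the label of its clique, because the graph module sees the text module's pseudo-labels for the neighbors.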

Tasks

Variational Inference

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
ogbn-arxiv	GLEM+RevGAT	Number of params	140,469,624	—	Unverified
ogbn-papers100M	GLEM+GIANT+GAMLP	Number of params	154,775,375	—	Unverified
ogbn-products	GLEM+EnGCN	Number of params	139,633,805	—	Unverified
ogbn-products	GLEM+GIANT+SAGN+SCR	Number of params	139,792,525	—	Unverified
