GraphGPT: Graph Learning with Generative Pre-trained Transformers
Qifang Zhao, Weidong Ren, Tianyu Li, Xiaoxiao Xu, Hong Liu
Code: https://github.com/alibaba/graph-gpt (official implementation, PyTorch, ★ 102)
Abstract
We introduce GraphGPT, a novel model for graph learning based on self-supervised generative pre-trained transformers. Our model first transforms each graph or sampled subgraph reversibly into a sequence of tokens representing its nodes, edges, and attributes via an Eulerian path. We then feed the tokens into a standard transformer decoder and pre-train it with the next-token-prediction (NTP) task. Finally, we fine-tune GraphGPT on supervised tasks. This intuitive yet effective model achieves results superior or close to the state of the art on graph-, edge-, and node-level tasks: the large-scale molecular dataset PCQM4Mv2, the protein-protein association dataset ogbl-ppa, and the ogbn-proteins dataset from the Open Graph Benchmark (OGB). Furthermore, generative pre-training enables us to scale GraphGPT up to 400M+ parameters with consistently increasing performance, which is beyond the capability of GNNs and previous graph transformers. The source code and pre-trained checkpoints will be released at https://github.com/alibaba/graph-gpt to pave the way for graph foundation model research and to assist scientific discovery in the pharmaceutical, chemistry, materials, and bio-informatics domains.
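To make the serialization step concrete, here is a minimal sketch of how a small undirected graph can be flattened into a token sequence along an Eulerian path using Hierholzer's algorithm. This is an illustration of the idea only, not the authors' implementation: the token names (`n0`, `e`, ...) are hypothetical, the paper's actual vocabulary covers node/edge attributes as well, and the real pipeline first Eulerizes graphs that lack an Eulerian path.

```python
from collections import defaultdict

def eulerian_path(edges):
    """Hierholzer's algorithm: return a node sequence that traverses every
    edge exactly once. Assumes the undirected graph is connected and has
    zero or two odd-degree nodes (i.e., an Eulerian path exists)."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Start at an odd-degree node if one exists, otherwise at any node.
    start = next((n for n in adj if len(adj[n]) % 2 == 1), next(iter(adj)))
    stack, path = [start], []
    while stack:
        v = stack[-1]
        if adj[v]:
            u = adj[v].pop()      # follow an unused edge v--u
            adj[u].remove(v)      # consume it from the other side too
            stack.append(u)
        else:
            path.append(stack.pop())  # dead end: commit node to the path
    return path[::-1]

def serialize(edges):
    """Flatten the Eulerian path into a token sequence: node, edge, node, ...
    This ordering is reversible: the original edge set can be recovered
    from consecutive node tokens."""
    path = eulerian_path(edges)
    tokens = [f"n{path[0]}"]
    for _, b in zip(path, path[1:]):
        tokens += ["e", f"n{b}"]  # hypothetical edge marker, then next node
    return tokens

# A triangle: all degrees are even, so an Eulerian circuit exists.
print(serialize([(0, 1), (1, 2), (2, 0)]))
# → ['n0', 'e', 'n2', 'e', 'n1', 'e', 'n0']
```

The resulting flat sequence is what a standard decoder-only transformer can consume for next-token-prediction pre-training; reversibility of the serialization is what lets the model reason about the full graph structure from the sequence alone.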
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| ogbl-citation2 | GraphGPT(d1n30) | Number of params | 133,096,832 | — | Unverified |
| ogbl-citation2 | GraphGPT(SMTP) | Number of params | 46,784,128 | — | Unverified |
| ogbl-ppa | GraphGPT(SMTP) | Number of params | 145,263,360 | — | Unverified |