
Open Graph Benchmark: Datasets for Machine Learning on Graphs

2020-05-02 · NeurIPS 2020 · Code Available

Weihua Hu, Matthias Fey, Marinka Zitnik, Yuxiao Dong, Hongyu Ren, Bowen Liu, Michele Catasta, Jure Leskovec


Abstract

We present the Open Graph Benchmark (OGB), a diverse set of challenging and realistic benchmark datasets to facilitate scalable, robust, and reproducible graph machine learning (ML) research. OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains, ranging from social and information networks to biological networks, molecular graphs, source code ASTs, and knowledge graphs. For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics. In addition to building the datasets, we also perform extensive benchmark experiments for each dataset. Our experiments suggest that OGB datasets present significant challenges of scalability to large-scale graphs and out-of-distribution generalization under realistic data splits, indicating fruitful opportunities for future research. Finally, OGB provides an automated end-to-end graph ML pipeline that simplifies and standardizes the process of graph data loading, experimental setup, and model evaluation. OGB will be regularly updated and welcomes inputs from the community. OGB datasets as well as data loaders, evaluation scripts, baseline code, and leaderboards are publicly available at https://ogb.stanford.edu .
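The abstract's "unified evaluation protocol" refers to dataset-specific evaluators with fixed metrics; the ogbl link-prediction datasets benchmarked below are scored with ranking metrics such as Hits@K. A minimal sketch of that metric in plain Python, assuming OGB's definition (the fraction of positive edges scored above the K-th highest-scoring negative edge); the function name is illustrative, not the `ogb.linkproppred.Evaluator` API itself:

```python
# Hits@K as used by OGB's link-prediction evaluators (illustrative sketch):
# a positive edge counts as a "hit" if its score exceeds the score of the
# k-th best negative edge.

def hits_at_k(pos_scores, neg_scores, k):
    """Fraction of positive edges ranked above the k-th highest negative."""
    if len(neg_scores) < k:
        return 1.0  # fewer than k negatives: every positive trivially ranks above
    threshold = sorted(neg_scores, reverse=True)[k - 1]
    return sum(s > threshold for s in pos_scores) / len(pos_scores)

# Toy example: three of four positives outrank the 2nd-highest negative.
pos = [0.9, 0.8, 0.7, 0.2]
neg = [0.6, 0.5, 0.4, 0.3]
print(hits_at_k(pos, neg, 2))  # → 0.75
```

In the actual library, the same computation is wrapped behind `Evaluator(name=...)`, which pins both the metric and K per dataset so that leaderboard numbers are comparable.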

Benchmark Results

Dataset         Model                 Metric            Claimed      Verified  Status
ogbl-citation2  Matrix Factorization  Number of params  281,113,505  —         Unverified
ogbl-collab     Matrix Factorization  Number of params   60,514,049  —         Unverified
ogbl-ddi        Matrix Factorization  Number of params    1,224,193  —         Unverified
ogbl-ppa        Matrix Factorization  Number of params  147,662,849  —         Unverified
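Two of the claimed parameter counts in the table can be reproduced by hand, assuming the Matrix Factorization setup used in OGB's example code (one 256-dimensional embedding per node plus a three-layer MLP link predictor, 256 → 256 → 256 → 1 with biases) and the published OGB node counts (ogbl-collab: 235,868 nodes; ogbl-ddi: 4,267 nodes):

```python
# Sanity-check claimed parameter counts for the Matrix Factorization baseline,
# assuming OGB's example setup: a d=256 embedding table over all nodes plus a
# 3-layer MLP predictor (256 -> 256 -> 256 -> 1, with biases).

def mf_params(num_nodes, d=256):
    embedding = num_nodes * d                              # one embedding per node
    mlp = (d * d + d) + (d * d + d) + (d * 1 + 1)          # two hidden layers + output
    return embedding + mlp

print(mf_params(235_868))  # ogbl-collab → 60,514,049 (matches the table)
print(mf_params(4_267))    # ogbl-ddi   → 1,224,193 (matches the table)
```

The ogbl-ppa and ogbl-citation2 counts do not factor the same way, suggesting those runs used a different embedding dimension or predictor configuration.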
