Inductive Representation Learning on Large Graphs
William L. Hamilton, Rex Ying, Jure Leskovec
Code
- github.com/williamleif/GraphSAGE (official, TensorFlow, ★ 0)
- github.com/massquantity/LibRecommender (TensorFlow, ★ 474)
- github.com/IllinoisGraphBenchmark/IGB-Datasets (PyTorch, ★ 86)
- github.com/zxhhh97/ABot (PyTorch, ★ 18)
- github.com/isotlaboratory/ml4vrp (PyTorch, ★ 16)
- github.com/weiyinwei/huign (PyTorch, ★ 10)
- github.com/chatterjeeayan/upna (PyTorch, ★ 1)
- github.com/erfmah/answering_graph_queries (PyTorch, ★ 1)
- github.com/weiyinwei/mmgcn (PyTorch, ★ 0)
- github.com/qema/orca-py (PyTorch, ★ 0)
Abstract
Low-dimensional embeddings of nodes in large graphs have proved extremely useful in a variety of prediction tasks, from content recommendation to identifying protein functions. However, most existing approaches require that all nodes in the graph are present during training of the embeddings; these previous approaches are inherently transductive and do not naturally generalize to unseen nodes. Here we present GraphSAGE, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data. Instead of training individual embeddings for each node, we learn a function that generates embeddings by sampling and aggregating features from a node's local neighborhood. Our algorithm outperforms strong baselines on three inductive node-classification benchmarks: we classify the category of unseen nodes in evolving information graphs based on citation and Reddit post data, and we show that our algorithm generalizes to completely unseen graphs using a multi-graph dataset of protein-protein interactions.
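The core idea in the abstract, learning a function that embeds a node by sampling and aggregating features from its local neighborhood, can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the mean aggregator, the fan-out `k=5`, the toy graph, and the function names are all assumptions made here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_neighbors(adj, node, k):
    """Sample a fixed-size neighbor set (with replacement when the
    neighborhood is smaller than k); fall back to a self-loop for
    isolated nodes."""
    nbrs = adj[node]
    if len(nbrs) == 0:
        return [node]
    return list(rng.choice(nbrs, size=k, replace=len(nbrs) < k))

def graphsage_layer(features, adj, W, k=5):
    """One sample-and-aggregate step with a mean aggregator."""
    out = []
    for v in range(features.shape[0]):
        nbr_feats = features[sample_neighbors(adj, v, k)]
        agg = nbr_feats.mean(axis=0)                 # aggregate sampled neighborhood
        h = np.concatenate([features[v], agg])       # combine self and neighborhood
        h = np.maximum(W @ h, 0.0)                   # linear map + ReLU
        out.append(h / (np.linalg.norm(h) + 1e-12))  # normalize the embedding
    return np.stack(out)

# Toy graph: 4 nodes, 3-dim input features, 2-dim output embeddings.
features = rng.normal(size=(4, 3))
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
W = rng.normal(size=(2, 6))  # maps concat(self, agg) -> embedding
emb = graphsage_layer(features, adj, W)
print(emb.shape)  # (4, 2)
```

Because the layer is a function of features rather than a lookup table of per-node embeddings, it applies unchanged to nodes (or whole graphs) never seen during training, which is what makes the approach inductive.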
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| CIFAR10 100k | GraphSAGE | Accuracy (%) | 66.08 | — | Unverified |