Efficient Distributed Graph Neural Network Training with Source Chunking and Moving Aggregation
Published in TKDE, 2025
Graph neural networks (GNNs) are effective models for analyzing graph-structured data, but they face challenges when trained on large graphs distributed across machines. Existing GNN training frameworks use sampling parallelism and historical-embedding methods to support distributed training and improve efficiency. However, these methods suffer from stale historical embeddings, imbalanced communication, and redundant storage and computation costs. In this paper, we present Emma, a distributed GNN training framework that combines source-node-centric chunking, which enables frequent embedding updates and balanced communication, with a moving message aggregation technique that boosts training efficiency and reduces storage costs. Experimental results show that Emma substantially reduces computation and communication overhead, achieving a notable speedup over state-of-the-art distributed GNN training methods while maintaining convergence accuracy.
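To make the two ideas in the abstract concrete, here is a minimal PyTorch sketch of what source-node chunking combined with a running ("moving") aggregation might look like. This is an illustrative assumption, not Emma's actual implementation or API: the function `chunked_aggregate`, its parameters, the contiguous chunking scheme, and the mean-aggregation choice are all hypothetical.

```python
import torch

def chunked_aggregate(edge_index, src_feats, num_dst, num_chunks=4):
    """Mean-aggregate messages from source nodes to destination nodes,
    processing source nodes one chunk at a time.

    Hypothetical sketch (names and scheme are illustrative, not Emma's):
      edge_index: LongTensor [2, E] with rows (src, dst)
      src_feats:  FloatTensor [num_src, F] source-node features
    Returns a [num_dst, F] aggregated feature matrix.
    """
    num_src, feat_dim = src_feats.shape
    agg = torch.zeros(num_dst, feat_dim)  # running sum of incoming messages
    deg = torch.zeros(num_dst, 1)         # running count of incoming messages
    # Split source nodes into contiguous, roughly equal-sized chunks.
    bounds = torch.linspace(0, num_src, num_chunks + 1).long()
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # Select only the edges whose source node falls in this chunk.
        mask = (edge_index[0] >= lo) & (edge_index[0] < hi)
        src, dst = edge_index[0, mask], edge_index[1, mask]
        # Fold this chunk's messages into the running aggregate, so the
        # full message tensor is never materialized at once.
        agg.index_add_(0, dst, src_feats[src])
        deg.index_add_(0, dst, torch.ones(dst.numel(), 1))
    return agg / deg.clamp(min=1)

if __name__ == "__main__":
    # Toy graph: 4 source nodes sending to 2 destination nodes.
    edge_index = torch.tensor([[0, 1, 2, 3],
                               [0, 0, 1, 1]])
    x = torch.randn(4, 8)
    out = chunked_aggregate(edge_index, x, num_dst=2, num_chunks=2)
    print(out.shape)  # torch.Size([2, 8])
```

Folding each chunk's messages into a running sum means no per-chunk message tensor has to be retained after it is processed, which is, in spirit, the storage saving that a moving-aggregation scheme targets; the actual chunking and aggregation in Emma are described in the paper.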
Recommended citation: Wenjie Huang, Tongya Zheng, Rui Wang*, Tongtian Zhu, Bingde Hu, Shuibing He, Mingli Song, Xinyu Wang, Sai Wu, Chun Chen. "Efficient Distributed Graph Neural Network Training with Source Chunking and Moving Aggregation." IEEE Transactions on Knowledge and Data Engineering (TKDE), 2025.