Understanding Oversmoothing in Graph Neural Networks (GNNs): Insights from Two Theoretical Studies

Yu Rong

Discover the intricacies of oversmoothing in graph neural networks, including theoretical studies on APVMP and Page Rank propagation, attention mechanisms, and the role of non-linearities and activation functions in mitigating this key challenge.

Key takeaways
  • Oversmoothing is a fundamental problem in graph neural networks (GNNs), where increasing depth leads to nodes becoming indistinguishable.
  • APVMP (Angle-based Pointwise Message Passing) and Page Rank propagation can solve Oversmoothing to some extent, but only for a fixed number of layers.
  • The optimal depth of GNNs is usually shallow (2-5 layers), and optimal performance is reached at a finite number of layers, which is not necessarily linearly scalable.
  • Attention mechanisms cannot prevent Oversmoothing, but can instead slow down its convergence rate.
  • The joint spectral radius of a family of row stochastic matrices determines the convergence rate of GNNs, which is closely related to Oversmoothing.
  • In the asymptotic region, the convergence rate of GNNs is upper bounded by a constant.
  • Different nonlinearities and activation functions, such as ReLU and Leaky Relu, can affect Oversmoothing differently.
  • The optimal performance of GNNs is often attained at a shallower depth, not necessarily correlated with the number of neurons or nodes in the graph.
  • Stochastic block models can be used to test the effect of a finite number of graph convolutions.
  • The variance of GNN outputs decreases exponentially for a wide range of nonlinearities.