Understanding Oversmoothing in Graph Neural Networks (GNNs): Insights from Two Theoretical Studies

Discover the intricacies of oversmoothing in graph neural networks, including theoretical studies on personalized PageRank propagation (as in APPNP), attention mechanisms, and how nonlinearities and activation functions shape this key challenge.

Key takeaways
  • Oversmoothing is a fundamental problem in graph neural networks (GNNs): as depth increases, node representations become more and more similar until nodes are effectively indistinguishable.
  • Personalized PageRank propagation, as used in APPNP, can mitigate oversmoothing to some extent, but only for a fixed, finite number of propagation steps (see the first sketch after this list).
  • The optimal depth of GNNs is usually shallow (2-5 layers); performance peaks at a finite number of layers and does not keep improving as more layers are stacked.
  • Attention mechanisms cannot prevent oversmoothing; at best they slow down the rate at which node representations converge (see the second sketch below).
  • The joint spectral radius of the family of row-stochastic aggregation (attention) matrices applied by the network determines its convergence rate, which is what ties attention-based GNNs to oversmoothing.
  • In the asymptotic regime, this convergence rate is upper bounded by a constant strictly below one, so the collapse is exponential in depth.
  • Different nonlinearities and activation functions, such as ReLU and LeakyReLU, affect the rate of oversmoothing differently (see the third sketch below).
  • Optimal GNN performance is often attained at a shallow depth, and that depth is not necessarily tied to the width of the network or the number of nodes in the graph.
  • Stochastic block models provide a controlled setting for testing the effect of a finite number of graph convolutions (see the fourth sketch below).
  • The variance of GNN node representations decreases exponentially with depth for a wide range of nonlinearities.
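
Illustrative sketches

To make the collapse concrete, here is a minimal NumPy sketch comparing plain symmetric-normalized propagation with a personalized-PageRank-style update of the form H = (1 - alpha) * A_hat @ H + alpha * X, the propagation scheme behind APPNP. The graph size, edge density, teleport probability alpha, and random seed are all assumed for illustration and are not taken from the studies; the plain scheme is expected to drive the feature variance toward zero, while the PageRank-style update stays anchored to the input features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small random undirected graph with self-loops (sizes chosen arbitrarily).
n = 30
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.triu(A, 1)
A = A + A.T + np.eye(n)

# Symmetric normalization A_hat = D^{-1/2} A D^{-1/2} (A already has self-loops).
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_hat = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

X = rng.standard_normal((n, 4))   # initial node features
H_plain, H_ppr = X.copy(), X.copy()
alpha = 0.1                       # teleport probability (an assumed value)

def feature_variance(H):
    """Mean per-feature variance across nodes: a crude oversmoothing measure."""
    return H.var(axis=0).mean()

for layer in range(1, 31):
    H_plain = A_hat @ H_plain                          # plain propagation
    H_ppr = (1 - alpha) * (A_hat @ H_ppr) + alpha * X  # PageRank-style propagation
    if layer % 10 == 0:
        print(f"layer {layer:2d}  plain var = {feature_variance(H_plain):.2e}  "
              f"pagerank var = {feature_variance(H_ppr):.2e}")
```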
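
The link between row-stochastic matrices and oversmoothing can also be checked numerically. The sketch below (random attention mask, fresh random softmax weights at every layer, all sizes and seeds assumed) repeatedly multiplies the node states by a different row-stochastic matrix and tracks the distance to the consensus state where every node is identical; with self-attention kept on a well-connected graph, that distance is expected to shrink roughly geometrically, which is the behavior the joint-spectral-radius analysis bounds.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 30, 8
H = rng.standard_normal((n, d))          # initial node states

# Fixed random attention mask; the diagonal is kept so every node attends to itself.
adj = rng.random((n, n)) < 0.3
np.fill_diagonal(adj, True)

def distance_to_consensus(H):
    """Frobenius distance between H and the matrix whose rows all equal the
    per-feature mean (the 'all nodes identical' state)."""
    return np.linalg.norm(H - H.mean(axis=0, keepdims=True))

for step in range(1, 41):
    # Fresh random attention scores each layer, masked to the graph and
    # row-softmaxed: every layer applies a *different* row-stochastic matrix.
    scores = rng.standard_normal((n, n))
    scores[~adj] = -np.inf
    P = np.exp(scores - scores.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    H = P @ H
    if step % 10 == 0:
        print(f"layer {step:2d}  distance to consensus = "
              f"{distance_to_consensus(H):.3e}")
```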
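
To illustrate that standard activations do not rescue depth, the next sketch runs layers of the form H = act(A_hat @ H @ W) with random weight matrices rescaled to unit spectral norm and compares ReLU against LeakyReLU. The graph, widths, LeakyReLU slope, and seed are arbitrary choices rather than values from the papers; both runs are expected to show the distance to consensus shrinking exponentially, with possibly different measured rates.

```python
import numpy as np

def dist_to_consensus(H):
    """Frobenius distance from H to the state where all node rows coincide."""
    return np.linalg.norm(H - H.mean(axis=0, keepdims=True))

def run(activation, n=40, d=16, layers=30, seed=3):
    rng = np.random.default_rng(seed)
    # Random undirected graph with self-loops, symmetrically normalized.
    A = (rng.random((n, n)) < 0.2).astype(float)
    A = np.triu(A, 1)
    A = A + A.T + np.eye(n)
    dis = 1.0 / np.sqrt(A.sum(axis=1))
    A_hat = A * dis[:, None] * dis[None, :]

    H = rng.standard_normal((n, d))
    trajectory = []
    for _ in range(layers):
        W = rng.standard_normal((d, d))
        W /= np.linalg.norm(W, 2)          # rescale to unit spectral norm
        H = activation(A_hat @ H @ W)      # one GNN-style layer
        trajectory.append(dist_to_consensus(H))
    return trajectory

relu = lambda x: np.maximum(x, 0.0)
leaky_relu = lambda x: np.where(x > 0, x, 0.2 * x)   # slope 0.2 (arbitrary)

# Same seed for both runs, so the graph, features, and weights are identical
# and only the activation differs.
for name, act in [("ReLU", relu), ("LeakyReLU", leaky_relu)]:
    traj = run(act)
    print(f"{name:10s} distance after layers 1 / 10 / 30: "
          f"{traj[0]:.2e} / {traj[9]:.2e} / {traj[29]:.2e}")
```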
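
Finally, the stochastic-block-model test mentioned above can be set up in a few lines. In this toy version (two equal blocks, connection probabilities, feature means, and the separability score are all invented for illustration), 1-D Gaussian node features are propagated by k mean-aggregation convolutions and a simple between-class versus within-class separability score is reported for selected values of k.

```python
import numpy as np

rng = np.random.default_rng(2)
n_per_block, p_in, p_out = 100, 0.10, 0.02   # assumed SBM parameters
n = 2 * n_per_block
labels = np.repeat([0, 1], n_per_block)

# Sample the SBM adjacency matrix (undirected), then add self-loops.
probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
A = (rng.random((n, n)) < probs).astype(float)
A = np.triu(A, 1)
A = A + A.T + np.eye(n)

# Row-normalized (mean-aggregation) propagation operator.
P = A / A.sum(axis=1, keepdims=True)

# 1-D Gaussian node features: class means at -mu and +mu with unit noise.
mu = 0.5
x = rng.standard_normal(n) + np.where(labels == 0, -mu, mu)

def separability(h):
    """Distance between class means divided by the pooled within-class std."""
    m0, m1 = h[labels == 0].mean(), h[labels == 1].mean()
    s = np.sqrt(0.5 * (h[labels == 0].var() + h[labels == 1].var()))
    return abs(m1 - m0) / s

h = x.copy()
for k in range(16):
    if k > 0:
        h = P @ h                         # one more graph convolution
    if k in (0, 1, 2, 4, 8, 15):
        print(f"{k:2d} convolutions: separability = {separability(h):.2f}")
```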