[Submitted on 24 May 2023 (v1), last revised 7 Feb 2024 (this version, v4)]

Title: Size Generalization of Graph Neural Networks on Biological Data: Insights and Practices from the Spectral Perspective

Authors: Gaotang Li, Danai Koutra, Yujun Yan
Abstract: We investigate size-induced distribution shifts in graphs and assess their impact on the ability of graph neural networks (GNNs) to generalize to graphs larger than those seen during training. Existing literature presents conflicting conclusions on GNNs' size generalizability, primarily due to disparities in application domains and underlying assumptions concerning size-induced distribution shifts. Motivated by this, we take a data-driven approach: we focus on real biological datasets and seek to characterize the types of size-induced distribution shifts. Diverging from prior approaches, we adopt a spectral perspective and identify that spectrum differences induced by size are related to differences in subgraph patterns (e.g., average cycle lengths). While previous studies have identified that the inability of GNNs to capture subgraph information negatively impacts their in-distribution generalization, our findings further show that this decline is more pronounced when evaluating on larger test graphs not encountered during training. Based on these spectral insights, we introduce a simple yet effective model-agnostic strategy, which makes GNNs aware of these important subgraph patterns to enhance their size generalizability. Our empirical results reveal that our proposed size-insensitive attention strategy substantially enhances graph classification performance on large test graphs, which are 2-10 times larger than the training graphs, improving F1 scores by up to 8%.
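The link between spectra and cycle lengths mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' method or their measurement procedure; it assumes the normalized Laplacian as the spectral object and a simple maximum gap between empirical eigenvalue CDFs as the comparison, and all function names are hypothetical. It only shows that two cycle graphs of different length have measurably different spectra.

```python
import numpy as np

def cycle_adjacency(n):
    """Adjacency matrix of the undirected cycle graph C_n (n >= 3)."""
    A = np.zeros((n, n))
    idx = np.arange(n)
    A[idx, (idx + 1) % n] = 1.0
    A[(idx + 1) % n, idx] = 1.0
    return A

def normalized_laplacian_spectrum(A):
    """Sorted eigenvalues of L = I - D^{-1/2} A D^{-1/2}.

    Assumes every node has degree >= 1 (no isolated nodes).
    """
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    return np.sort(np.linalg.eigvalsh(L))

def spectral_cdf_gap(spec_a, spec_b, num_points=201):
    """Largest gap between the empirical eigenvalue CDFs on [0, 2].

    This comparison is an illustrative assumption, not a measure
    taken from the paper.
    """
    grid = np.linspace(0.0, 2.0, num_points)
    cdf_a = np.searchsorted(spec_a, grid, side="right") / len(spec_a)
    cdf_b = np.searchsorted(spec_b, grid, side="right") / len(spec_b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Two toy graphs that differ only in cycle length.
small = normalized_laplacian_spectrum(cycle_adjacency(5))
large = normalized_laplacian_spectrum(cycle_adjacency(6))
gap = spectral_cdf_gap(small, large)
```

Comparing CDFs rather than raw eigenvalue vectors keeps the comparison meaningful when the two graphs have different numbers of nodes, since the normalized Laplacian spectrum always lies in [0, 2].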
Comments: 21 pages, including appendix
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2305.15611 [cs.LG]
  (or arXiv:2305.15611v4 [cs.LG] for this version)
  https://doi.org/10.48550/arXiv.2305.15611
arXiv-issued DOI via DataCite

Submission history

From: Gaotang Li
[v1] Wed, 24 May 2023 23:01:14 UTC (5,283 KB)
[v2] Fri, 29 Sep 2023 21:51:27 UTC (22,881 KB)
[v3] Tue, 6 Feb 2024 04:15:14 UTC (23,031 KB)
[v4] Wed, 7 Feb 2024 03:27:12 UTC (23,031 KB)