Task-Oriented GNNs Training on Large Knowledge Graphs for Accurate and Efficient Modeling
CoRR(2024)
Abstract
A Knowledge Graph (KG) is a heterogeneous graph encompassing a diverse range
of node and edge types. Heterogeneous Graph Neural Networks (HGNNs) are popular
for training machine learning tasks like node classification and link
prediction on KGs. However, HGNN methods exhibit excessive complexity
influenced by the KG's size, density, and the number of node and edge types. AI
practitioners handcraft a subgraph of a KG G relevant to a specific task. We
refer to this subgraph as a task-oriented subgraph (TOSG), which contains a
subset of task-related node and edge types in G. Training the task using TOSG
instead of G alleviates the excessive computation required for a large KG.
Crafting the TOSG demands a deep understanding of the KG's structure and the
task's objectives. Hence, it is challenging and time-consuming. This paper
proposes KG-TOSA, an approach to automate the TOSG extraction for task-oriented
HGNN training on a large KG. In KG-TOSA, we define a generic graph pattern that
captures the KG's local and global structure relevant to a specific task. We
explore different techniques to extract subgraphs matching our graph pattern,
namely (i) two techniques sampling around targeted nodes using biased random
walk or influence scores, and (ii) a SPARQL-based extraction method leveraging
RDF engines' built-in indices. The SPARQL-based method therefore incurs negligible
preprocessing overhead compared to the sampling techniques. We develop a benchmark of real
KGs of large sizes and various tasks for node classification and link
prediction. Our experiments show that KG-TOSA helps state-of-the-art HGNN
methods reduce training time and memory usage by up to 70% while improving the
model performance, e.g., accuracy and inference time.
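
To make the idea of extracting a subgraph around targeted nodes concrete, the following is a minimal sketch in Python. It is not KG-TOSA's graph pattern or its SPARQL-based method: it assumes the KG is a small in-memory list of (subject, predicate, object) triples, treats edges as undirected, and keeps every triple within a fixed number of hops of the target nodes. The function name `extract_tosg`, the `hops` parameter, and the toy triples are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def extract_tosg(triples, target_nodes, hops=1):
    """Return the set of triples within `hops` undirected steps of the targets.

    Illustrative stand-in only: a plain breadth-first expansion, not the
    graph pattern or the SPARQL-based extraction described in the abstract.
    """
    # Index each triple under both of its endpoints (undirected view).
    adjacency = defaultdict(list)
    for s, p, o in triples:
        adjacency[s].append((s, p, o))
        adjacency[o].append((s, p, o))

    frontier = set(target_nodes)
    visited = set(frontier)
    kept = set()

    for _ in range(hops):
        next_frontier = set()
        for node in frontier:
            for s, p, o in adjacency[node]:
                kept.add((s, p, o))
                # Queue endpoints that have not been expanded yet.
                for neighbor in (s, o):
                    if neighbor not in visited:
                        visited.add(neighbor)
                        next_frontier.add(neighbor)
        frontier = next_frontier

    return kept


# Toy KG (hypothetical data): papers are the target node type.
toy_kg = [
    ("paper1", "hasAuthor", "author1"),
    ("author1", "affiliatedWith", "univ1"),
    ("univ1", "locatedIn", "country1"),
    ("paper2", "publishedIn", "venue1"),
]

subgraph = extract_tosg(toy_kg, {"paper1"}, hops=2)
```

With `hops=2` and `paper1` as the only target, the sketch keeps the two triples on the path from `paper1` to `univ1` and drops the rest, which mirrors at toy scale why training on a smaller task-relevant subgraph can reduce computation.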