The first general-purpose framework for the graph domain is here! Accepted as an ICLR 2024 Spotlight, it handles any dataset and any classification task | From Washington University in St. Louis, Peking University & JD.com

Latest update time:2024-02-03
Fengse from Aofei Temple
Qubit | Official account QbitAI

Can there be a general-purpose graph model?

One that predicts toxicity from molecular structure and also recommends friends on social networks?

Or one that not only predicts citation counts for papers by different authors, but also discovers mechanisms of human aging in gene networks?

As it turns out, there is: the "One for All" (OFA) framework, accepted as a Spotlight at ICLR 2024, achieves exactly this.

It was jointly proposed by Professor Yixin Chen's team at Washington University in St. Louis, Muhan Zhang of Peking University, and Dacheng Tao of JD Explore Academy.

As the first general-purpose framework in the graph domain, OFA makes it possible to train a single GNN model that solves classification tasks on any dataset, of any task type, and in any scenario.

How is it implemented, specifically? What follows is the authors' own contributed account.

Designing a general-purpose model for graphs faces three major difficulties

Designing a general foundation model to solve a variety of tasks is a long-standing goal in artificial intelligence. In recent years, foundational large language models (LLMs) have excelled at natural language tasks.

However, in the graph domain, although graph neural networks (GNNs) perform well on different kinds of graph data, it remains unclear how to design and train a foundational graph model that can handle multiple graph tasks simultaneously.

Compared with natural language, designing a general-purpose model for graphs poses several unique difficulties.

First, unlike natural language, different graph datasets have completely different attributes and distributions.

For example, a molecular graph describes how atoms form different chemical substances through various interatomic forces, while a citation graph describes the network of mutual citations among papers.

Such disparate graph data are difficult to unify under a single training framework.

Second, unlike tasks in LLMs, which can all be converted into a unified context-generation task, graph tasks comprise a variety of subtypes, such as node-level, link-level, and graph-level tasks.

Different subtasks usually require different task representations and different graph models.

Finally, the success of large language models is inseparable from in-context learning, achieved through prompting paradigms.

In large language models, the prompt paradigm is usually a human-readable textual description of the downstream task.

However, for graph data, which is unstructured and hard to describe in language, how to design an effective graph prompting paradigm for in-context learning remains an open question.

Solved with concepts such as the "text-attributed graph"

The figure below gives the overall framework of OFA:

Specifically, the OFA team addressed the three difficulties above through a set of careful designs.

To address the differing attributes and distributions of graph data, OFA unifies all graph data by proposing the concept of Text-Attributed Graphs (TAGs). With text-attributed graphs, OFA describes the node and edge information of all graph data in a unified natural-language framework, as shown in the figure below:


OFA then uses a single LLM to encode the text in all datasets and obtain their embedding vectors.

These embeddings serve as the input features of the graph model. In this way, graph data from different domains are mapped into the same feature space, making it feasible to train a unified GNN model.
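To make this concrete, here is a minimal sketch of the TAG encoding step, assuming the sentence-transformers and torch packages; the model name and node descriptions are illustrative placeholders, not the exact ones used by OFA.

```python
# Sketch of the TAG idea: encode node text with one shared text encoder so
# graphs from any domain land in the same feature space.
import torch
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any text encoder works

# Natural-language descriptions of nodes from two very different domains.
molecule_nodes = [
    "feature node. atom: carbon, member of an aromatic ring",
    "feature node. atom: nitrogen, two hydrogens attached",
]
citation_nodes = [
    "feature node. paper title and abstract: Attention Is All You Need ...",
    "feature node. paper title and abstract: Graph Attention Networks ...",
]

# Both domains map into the same embedding space, so a single GNN can
# consume either graph as input features.
x_mol = torch.tensor(encoder.encode(molecule_nodes))  # shape [2, 384]
x_cit = torch.tensor(encoder.encode(citation_nodes))  # shape [2, 384]
assert x_mol.shape[1] == x_cit.shape[1]
```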

OFA collected 9 graph datasets of different sizes from different domains, including citation graphs, web link graphs, knowledge graphs, and molecular graphs, as shown in the figure below:

In addition, OFA proposes Nodes-of-Interest (NOI) subgraphs and NOI prompt nodes to unify the different subtask types in the graph domain. Here, the NOI is the set of target nodes involved in the task.

For example, in a node-level prediction task, the NOI is the single node to be predicted; in a link-level task, the NOI consists of the two endpoints of the link to be predicted. The NOI subgraph is the subgraph formed by the h-hop neighborhoods around the NOI nodes.

The NOI prompt node is then a newly introduced node type that is directly connected to all NOI nodes.

Importantly, each NOI prompt node carries a description of the current task. The description is written in natural language and encoded by the same LLM used for the text-attributed graphs.

Because GNN message passing aggregates the information of the NOI nodes into the NOI prompt node, the model only needs to make its predictions from the NOI prompt node.

In this way, all task types share a unified representation. Concrete examples are shown in the figure below:
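Alongside the figure, here is a minimal sketch of the NOI-subgraph construction, assuming torch and torch_geometric. The tensor names and helper are hypothetical; the authors' actual implementation is in the GitHub repository linked at the end.

```python
# Sketch: extract the h-hop NOI subgraph and attach one NOI prompt node.
import torch
from torch_geometric.utils import k_hop_subgraph

def build_noi_subgraph(edge_index, noi_nodes, num_hops, x, task_emb):
    """noi_nodes: target nodes of the task (1 for node tasks, 2 for links).
    task_emb: LLM embedding of the natural-language task description,
    assumed to have the same dimension as the node features x."""
    subset, sub_edge_index, mapping, _ = k_hop_subgraph(
        noi_nodes, num_hops, edge_index, relabel_nodes=True)
    sub_x = x[subset]

    # Append one NOI prompt node and connect every NOI node to it, so that
    # message passing funnels the NOI's information into this single node.
    prompt_idx = sub_x.size(0)
    sub_x = torch.cat([sub_x, task_emb.unsqueeze(0)], dim=0)
    prompt_edges = torch.stack(
        [mapping, torch.full_like(mapping, prompt_idx)])
    sub_edge_index = torch.cat([sub_edge_index, prompt_edges], dim=1)
    return sub_x, sub_edge_index, prompt_idx
```

The prediction head then only ever reads the embedding at `prompt_idx`, which is what gives every task type the same interface.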

Finally, in order to achieve in-context learning in the graph field, OFA introduces unified prompt subgraphs.

In a supervised k-way classification task, this prompt subgraph contains two types of nodes: the NOI prompt node described above, and class nodes (Class Nodes) representing the k categories.

The text on each class node describes the relevant information of that category.

The NOI prompt node is connected to all class nodes by directed edges. The graph constructed in this way is then fed into the GNN for message passing and learning.

Finally, OFA performs a binary classification on each class node and selects the class node with the highest probability as the final prediction.

Because the class information lives in the prompt subgraph, even when a completely new classification problem arrives, OFA can predict directly by constructing the corresponding prompt subgraph, without any fine-tuning, thereby achieving zero-shot learning.
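A minimal sketch of this per-class scoring step, assuming PyTorch; the module name and dimensions are illustrative. The key design point is that one shared binary head is applied to every class node, so the number of classes k is not baked into the architecture: a new class only requires new class-node text.

```python
# Sketch: score each class node's post-GNN embedding with a shared binary
# head; the highest-scoring class node is the prediction.
import torch
import torch.nn as nn

class ClassNodeScorer(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        self.head = nn.Linear(hidden_dim, 1)  # shared across all classes

    def forward(self, class_node_embs):   # [k, hidden_dim]
        return self.head(class_node_embs).squeeze(-1)  # [k] logits

scorer = ClassNodeScorer(hidden_dim=256)
class_embs = torch.randn(5, 256)      # embeddings of k=5 class nodes
pred = scorer(class_embs).argmax()    # index of the predicted class
```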

In a few-shot scenario, a classification task includes one query input graph and several support input graphs. OFA's prompt-graph paradigm connects the NOI prompt node of each support graph to its corresponding class node, while connecting the NOI prompt node of the query graph to all class nodes.

The subsequent prediction steps are the same as above. In this way, each class node receives additional information from the support graphs, realizing few-shot learning under the same unified paradigm.
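Here is a small sketch of that wiring, assuming torch; the index arguments are hypothetical node indices in the combined prompt graph.

```python
# Sketch: build the extra edges of a few-shot prompt graph. Each support
# graph's NOI prompt node links only to its ground-truth class node, while
# the query graph's NOI prompt node links to every class node.
import torch

def few_shot_prompt_edges(support_prompt_idx, support_labels,
                          query_prompt_idx, class_node_idx):
    edges = []
    # Support prompt node -> its own class node (injects label evidence).
    for p, y in zip(support_prompt_idx, support_labels):
        edges.append((p, class_node_idx[y]))
    # Query prompt node -> all class nodes (to be scored after the GNN).
    for c in class_node_idx:
        edges.append((query_prompt_idx, c))
    return torch.tensor(edges, dtype=torch.long).t()  # shape [2, E]
```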

OFA’s main contributions are summarized below:

Unified graph data distribution: by proposing text-attributed graphs and using an LLM to encode textual information, OFA aligns and unifies the distributions of graph data.

Unified graph task form: through NOI subgraphs and NOI prompt nodes, OFA achieves a unified representation of the various subtasks in the graph domain.

Unified graph prompting paradigm: By proposing a novel graph prompting paradigm, OFA realizes multi-scenario in-context learning in the graph domain.

Strong generalization ability

The paper evaluates the OFA framework on the nine collected datasets, covering ten different supervised-learning tasks, including node prediction, link prediction, and graph classification.

The experiments are designed to verify that a single OFA model can handle multiple tasks; the authors also compare the effect of using different LLMs (OFA-{LLM}) against training a separate model for each task (OFA-ind-{LLM}).

The comparison results are shown in the table below:

As the table shows, thanks to OFA's strong generalization ability, a single graph model (OFA-st, OFA-e5, OFA-llama2-7b, OFA-llama2-13b) performs comparably to, or better than, the individually trained models (GCN, GAT, OFA-ind-st) across all tasks.

Meanwhile, using a more powerful LLM brings further performance gains. The paper also visualizes the trained OFA model's representations of the NOI prompt nodes for different tasks.

Different tasks are embedded into different subspaces by the model, allowing OFA to learn them separately without mutual interference.

In the few-shot and zero-shot scenarios, OFA pre-trains a single model on ogbn-arxiv (a citation graph), FB15K237 (a knowledge graph), and ChEMBL (a molecular graph), and tests it on a range of downstream tasks and datasets. The results are as follows:


Even in the zero-shot setting, OFA achieves good results. Taken together, the experiments validate OFA's strong general-purpose performance and its potential as a foundation model in the graph domain.

For more research details, please refer to the original paper.

Paper: https://arxiv.org/abs/2310.00149
Code: https://github.com/LechengKong/OneForAll

-over-


