Gated Graph Sequence Neural Networks
Graph-structured data appears frequently in domains including chemistry, natural language semantics, social networks, and knowledge bases. In this work, we study feature learning techniques for graph-structured inputs.
Our starting point is previous work on Graph Neural Networks (Scarselli et al., 2009), which we modify to use gated recurrent units and modern optimization techniques and then extend to output sequences.
The result is a flexible and broadly useful class of neural network models that has favorable inductive biases relative to purely sequence-based models (e.g., LSTMs) when the problem is graph-structured.
The paper demonstrates these capabilities on some simple AI (bAbI) and graph algorithm learning tasks. It then shows that the model achieves state-of-the-art performance on a problem from program verification, in which subgraphs need to be described as abstract data structures.
Our main contribution is an extension of Graph Neural Networks that outputs sequences. Previous work on feature learning for graph-structured inputs has focused on models that produce single outputs such as graph-level classifications, but many problems with graph inputs require outputting sequences.
Examples include paths on a graph, enumerations of graph nodes with desirable properties, or sequences of global classifications mixed with, for example, a start and end node. We are not aware of existing graph feature learning work suitable for this problem.
Our motivating application comes from program verification and requires outputting logical formulas, which we formulate as a sequential output problem.
A secondary contribution is highlighting that Graph Neural Networks (and further extensions we develop here) are a broadly useful class of neural network model that is applicable to many problems currently facing the field.
There are two settings for feature learning on graphs:
Learning a representation of the input graph.
Learning representations of the internal state during the process of producing a sequence of outputs.
Learning a representation of the input graph is mostly achieved by previous work on Graph Neural Networks (Scarselli et al., 2009). We make several minor adaptations of this framework, including changing it to use modern practices around Recurrent Neural Networks.
Learning representations of the internal state during the process of producing a sequence of outputs is important because we desire outputs from graph structured problems that are not solely individual classifications.
In these cases, the challenge is how to learn features on the graph that encode the partial output sequence that has already been produced (e.g., the path so far if outputting a path) and that still needs to be produced (e.g., the remaining path). We will show how the GNN framework can be adapted to these settings, leading to a novel graph-based neural network model that we call Gated Graph Sequence Neural Networks (GGS-NNs).
We illustrate aspects of this general model in experiments on bAbI tasks (Weston et al., 2015) and graph algorithm learning tasks that demonstrate the capabilities of the model.
We then present an application to the verification of computer programs. When attempting to prove properties such as memory safety (i.e., that there are no null pointer dereferences in a program), a core problem is to find mathematical descriptions of the data structures used in a program.
Following Brockschmidt et al. (2015), we have phrased this as a machine learning problem where we will learn to map from a set of input graphs, representing the state of memory, to a logical description of the data structures that have been instantiated. Whereas Brockschmidt et al. (2015) relied on a large amount of hand-engineering of features, we show that the system can be replaced with a GGS-NN at no cost in accuracy.
Let us cover the three kinds of networks: Graph Neural Networks (GNNs), Gated Graph Neural Networks (GG-NNs), and Gated Graph Sequence Neural Networks (GGS-NNs).
GNNs are a general neural network architecture defined according to a graph structure G = (V, E), where V is the set of nodes and E is the set of edges. Edges are pairs of nodes; this work focuses on directed graphs, but the framework can easily be adapted to undirected graphs.
In this model, node representations are propagated by an iterative procedure.
Initial node representations are set to arbitrary values, then each node representation is updated following the recurrence until convergence.
The propagation function is generally a linear mapping or a neural network. The basic recurrence of the propagation model is given in the paper; a sketch of its form follows.
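As a rough sketch of that recurrence, following the notation of Scarselli et al. (2009) as reviewed in the GGS-NN paper (the exact subscripts may differ slightly from the original):

h_v^{(t)} = f^{*}\left(\ell_v,\ \ell_{co(v)},\ \ell_{ne(v)},\ h_{ne(v)}^{(t-1)}\right)

where \ell_v is the label of node v, \ell_{co(v)} the labels of the edges incident to v, \ell_{ne(v)} the labels of v's neighbors, and h_{ne(v)}^{(t-1)} the neighbors' hidden representations from the previous step.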
Parameter tying and sparsity are enforced by constructing a recurrent matrix A.
The matrix A determines how nodes in the graph communicate with each other.
The sparsity structure corresponds to the edges of the graph, and the parameters in each submatrix are determined by the edge type and direction.
A contains two columns of blocks, one corresponding to outgoing edges and one to incoming edges.
The initialization step copies the node annotations into the first components of the hidden state and pads the rest with zeros.
Then the information is passed between different nodes of the graph via incoming and outgoing edges with parameters dependent on the edge type and direction.
The remaining steps are GRU-like updates that incorporate information from the other nodes and from the previous timestep to update each node’s hidden state; z and r are the update and reset gates, and σ is the logistic sigmoid function.
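As a hedged sketch of this GRU-like update (weight names follow my reading of the paper; refer to the paper for the authoritative equations), with a_v^{(t)} denoting the messages node v aggregates from its neighbors via A:

z_v^{(t)} = \sigma\left(W^z a_v^{(t)} + U^z h_v^{(t-1)}\right)
r_v^{(t)} = \sigma\left(W^r a_v^{(t)} + U^r h_v^{(t-1)}\right)
\tilde{h}_v^{(t)} = \tanh\left(W a_v^{(t)} + U\left(r_v^{(t)} \odot h_v^{(t-1)}\right)\right)
h_v^{(t)} = \left(1 - z_v^{(t)}\right) \odot h_v^{(t-1)} + z_v^{(t)} \odot \tilde{h}_v^{(t)}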
We initially experimented with a vanilla recurrent neural network-style update, but in preliminary experiments we found this GRU-like propagation step to be more effective.
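To make the propagation step concrete, here is a minimal NumPy sketch of GG-NN propagation. The function and variable names (ggnn_propagate, W_out, W_in, and so on) are illustrative choices of mine, not the authors' reference implementation, and the per-edge-type, per-direction weight matrices stand in for the sparse recurrent matrix A described above.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ggnn_propagate(annotations, edges, num_edge_types, D, T, rng):
    """annotations: (N, d_ann) node annotations x_v
       edges: list of (src, dst, edge_type) for a directed graph
       D: hidden state size (D >= d_ann), T: number of propagation steps."""
    N, d_ann = annotations.shape

    # Initialization: copy annotations into the first components of the
    # hidden state and pad the rest with zeros.
    h = np.zeros((N, D))
    h[:, :d_ann] = annotations

    # One weight matrix per edge type and direction (outgoing / incoming);
    # together these play the role of the sparse recurrent matrix A.
    W_out = rng.normal(scale=0.1, size=(num_edge_types, D, D))
    W_in = rng.normal(scale=0.1, size=(num_edge_types, D, D))

    # GRU-like gate parameters (update gate z, reset gate r, candidate h~).
    W_z, U_z = rng.normal(scale=0.1, size=(2, D, D))
    W_r, U_r = rng.normal(scale=0.1, size=(2, D, D))
    W_h, U_h = rng.normal(scale=0.1, size=(2, D, D))

    for _ in range(T):
        # Message passing: each node aggregates information from its
        # incoming and outgoing edges, with parameters per edge type.
        a = np.zeros((N, D))
        for src, dst, etype in edges:
            a[dst] += h[src] @ W_out[etype]   # message along the edge
            a[src] += h[dst] @ W_in[etype]    # message against the edge

        # GRU-like update combining the aggregated messages with the
        # previous hidden state.
        z = sigmoid(a @ W_z + h @ U_z)
        r = sigmoid(a @ W_r + h @ U_r)
        h_tilde = np.tanh(a @ W_h + (r * h) @ U_h)
        h = (1 - z) * h + z * h_tilde
    return h

# Usage: a 3-node chain 0 -> 1 -> 2 with a single edge type.
rng = np.random.default_rng(0)
x = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])   # node annotations
h = ggnn_propagate(x, edges=[(0, 1, 0), (1, 2, 0)],
                   num_edge_types=1, D=4, T=5, rng=rng)
print(h.shape)  # (3, 4)

In practice the message passing would be implemented as a sparse matrix multiplication with A rather than a Python loop over edges, but the loop makes the per-edge-type, per-direction parameter tying explicit.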
A Gated Graph Sequence Neural Network (GGS-NN) consists of several GG-NNs operating in sequence to produce an output sequence.
Sequence outputs with observed annotations:
Consider the task of making a sequence of predictions for a graph, where each prediction is only about a part of the graph. In order to ensure we predict an output for each part of the graph exactly once, it suffices to have one bit per node, indicating whether the node has been “explained” so far. In some settings, a small number of annotations are sufficient to capture the state of the output procedure. When this is the case, we may want to directly input this information into the model via labels indicating target intermediate annotations. In some cases, these annotations may be sufficient, in that we can define a model where the GG-NNs are rendered conditionally independent given the annotations.
In this case, at training time, given the annotations, the sequence prediction task decomposes into single-step prediction tasks that can be trained as separate GG-NNs. At test time, predicted annotations from one step are used as input to the next step. This is analogous to training directed graphical models when data is fully observed.
Sequence outputs with latent annotations:
More generally, when intermediate node annotations are not available during training, we treat them as hidden units in the network and train the whole model jointly by backpropagating through the whole sequence.
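A minimal sketch of the test-time unrolling, where output_ggnn and annotation_ggnn are hypothetical stand-ins for the two GG-NNs used at each step (one producing the step's output, one predicting the annotations passed to the next step):

def ggs_nn_predict(graph, annotations, output_ggnn, annotation_ggnn,
                   max_steps, is_end_of_sequence):
    """Unroll a GGS-NN: alternate between producing an output for the
    current step and predicting the node annotations for the next step."""
    outputs = []
    for _ in range(max_steps):
        o_k = output_ggnn(graph, annotations)   # per-step output (e.g., a node or class)
        outputs.append(o_k)
        if is_end_of_sequence(o_k):             # stop once an end-of-sequence output appears
            break
        annotations = annotation_ggnn(graph, annotations)  # predicted annotations feed the next step
    return outputs

With observed annotations, annotation_ggnn is trained against the given intermediate annotations and each step trains as a separate GG-NN; with latent annotations, the predicted annotations are treated as hidden and gradients flow through the whole loop.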
bAbI Tasks:
The bAbI tasks are meant to test reasoning capabilities that AI systems should be capable of. In the bAbI suite, there are 20 tasks that test basic forms of reasoning like deduction, induction, counting, and path-finding.
PROGRAM VERIFICATION WITH GGS-NNS:
GGS-NNs are motivated by a practical application in program verification. A crucial step in automatic program verification is the inference of program invariants, which approximate the set of program states reachable in an execution. Finding invariants about data structures is an open problem.
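For example (an illustrative formula of my own, in the separation-logic style used in this line of work), a program variable lst pointing to a NULL-terminated singly linked list might be described by an invariant such as

\mathit{ls}(\texttt{lst}, \mathrm{NULL})

and the model's task is to map concrete memory graphs observed while running the program to such logical descriptions.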
For more information and the full mathematical details, please refer to the Gated Graph Sequence Neural Networks paper (Li et al., 2016).