GNN
Graph Neural Network.
Classifier
The attribute labels_ assigns a label to each node of the graph.
- class sknetwork.gnn.GNNClassifier(dims: int | Iterable | None = None, layer_types: str | Iterable = 'Conv', activations: str | Iterable = 'ReLu', use_bias: bool | list = True, normalizations: str | Iterable = 'both', self_embeddings: bool | Iterable = True, sample_sizes: int | list = 25, loss: BaseLoss | str = 'CrossEntropy', layers: Iterable | None = None, optimizer: BaseOptimizer | str = 'Adam', learning_rate: float = 0.01, early_stopping: bool = True, patience: int = 10, verbose: bool = False)[source]
Graph Neural Network for node classification.
- Parameters:
dims (iterable or int) – Dimension of the output of each layer (in forward direction). If an integer, dimension of the output layer (no hidden layer). Optional if layers is specified.
layer_types (iterable or str) – Layer types (in forward direction). If a string, the same type is used at each layer. Can be 'Conv', graph convolutional layer (default), or 'Sage' (GraphSAGE).
activations (iterable or str) – Activation functions (in forward direction). If a string, the same activation function is used at each layer. Can be either 'Identity', 'Relu', 'Sigmoid' or 'Softmax' (default = 'Relu').
use_bias (iterable or bool) – Whether to add a bias term at each layer (in forward direction). If True, use a bias term at each layer.
normalizations (iterable or str) – Normalization of the adjacency matrix for message passing (in forward direction). If a string, the same normalization is used at each layer. Can be either 'left' (left normalization by the degrees), 'right' (right normalization by the degrees), 'both' (symmetric normalization by the square root of degrees, default) or None (no normalization); see the sketch after this list.
self_embeddings (iterable or bool) – Whether to add a self-embedding to each node for message passing (in forward direction). If True, add a self-embedding at each layer.
sample_sizes (iterable or int) – Size of the neighborhood sampled for each node (in forward direction). If an integer, the same sample size is used at each layer. Used only for the 'Sage' layer type.
loss (str (default = 'CrossEntropy') or BaseLoss) – Name of the loss function or custom loss function.
layers (iterable or None) – Custom layers (in forward direction). If used, the previous parameters are ignored.
optimizer (str or optimizer) – 'Adam', stochastic gradient-based optimizer (default), or 'GD', gradient descent.
learning_rate (float) – Learning rate.
early_stopping (bool (default = True)) – Whether to use early stopping to end training. If True, training terminates when the validation score has not improved for patience epochs.
patience (int (default = 10)) – Number of iterations with no improvement to wait before stopping fitting.
verbose (bool) – Verbose mode.
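As an illustration of the normalizations parameter, here is a minimal NumPy/SciPy sketch of the three options (illustrative only, not sknetwork's internal code):

# Normalizations of an adjacency matrix A by the diagonal degree matrix D.
import numpy as np
from scipy import sparse

adjacency = sparse.csr_matrix(np.array([[0, 1, 1],
                                        [1, 0, 0],
                                        [1, 0, 0]], dtype=float))
degrees = adjacency.dot(np.ones(adjacency.shape[0]))
inv = sparse.diags(1 / degrees)                  # D^{-1}
inv_sqrt = sparse.diags(1 / np.sqrt(degrees))    # D^{-1/2}

left = inv.dot(adjacency)                        # 'left': D^{-1} A, rows sum to 1
right = adjacency.dot(inv)                       # 'right': A D^{-1}, columns sum to 1
both = inv_sqrt.dot(adjacency).dot(inv_sqrt)     # 'both': D^{-1/2} A D^{-1/2}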
- Variables:
conv1, conv2, ... – Graph convolutional layers.
output (np.ndarray) – Output of the GNN.
labels (np.ndarray) – Predicted node labels.
history (dict) – Training history per epoch: {'embedding', 'loss', 'train_accuracy', 'val_accuracy'}.
Example
>>> from sknetwork.gnn.gnn_classifier import GNNClassifier
>>> from sknetwork.data import karate_club
>>> import numpy as np
>>> graph = karate_club(metadata=True)
>>> adjacency = graph.adjacency
>>> labels_true = graph.labels
>>> labels = {i: labels_true[i] for i in [0, 1, 33]}
>>> features = adjacency.copy()
>>> gnn = GNNClassifier(dims=1, early_stopping=False)
>>> labels_pred = gnn.fit_predict(adjacency, features, labels, random_state=42)
>>> float(round(np.mean(labels_pred == labels_true), 2))
0.88
- backward(features: csr_matrix, labels: ndarray, mask: ndarray)
Compute backpropagation.
- Parameters:
features (sparse.csr_matrix) – Features, array of shape (n_nodes, n_features).
labels (np.ndarray) – Labels, array of shape (n_nodes,).
mask (np.ndarray) – Boolean mask, array of shape (n_nodes,).
- fit(adjacency: csr_matrix | ndarray, features: csr_matrix | ndarray, labels: ndarray, n_epochs: int = 100, validation: float = 0, reinit: bool = False, random_state: int | None = None) GNNClassifier [source]
Fit model to data and store trained parameters.
- Parameters:
adjacency (sparse.csr_matrix) – Adjacency matrix of the graph.
features (sparse.csr_matrix, np.ndarray) – Input feature of shape \((n, d)\) with \(n\) the number of nodes in the graph and \(d\) the size of feature space.
labels (dict, np.ndarray) – Known labels. Negative values ignored.
n_epochs (int (default = 100)) – Number of epochs (iterations over the whole graph).
validation (float) – Proportion of the training set used for validation (between 0 and 1).
reinit (bool (default = False)) – If True, reinitialize the trainable parameters of the GNN (weights and biases).
random_state (int) – Random seed, used for reproducible results across multiple runs.
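A short usage sketch of fit with a validation split, reusing the karate club objects from the example above (the validation proportion shown is arbitrary):

>>> gnn = GNNClassifier(dims=1, early_stopping=False)
>>> gnn = gnn.fit(adjacency, features, labels, n_epochs=100, validation=0.2, random_state=42)
>>> history = gnn.history  # per-epoch 'embedding', 'loss', 'train_accuracy', 'val_accuracy'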
- fit_predict(*args, **kwargs) ndarray
Fit algorithm to the data and return the labels. Same parameters as the fit method.
- Returns:
labels – Labels of the nodes.
- Return type:
np.ndarray
- fit_predict_proba(*args, **kwargs) ndarray
Fit algorithm to the data and return the distribution over labels. Same parameters as the fit method.
- Returns:
probs – Probability distribution over labels.
- Return type:
np.ndarray
- fit_transform(*args, **kwargs) ndarray
Fit algorithm to the data and return the embedding of the nodes. Same parameters as the fit method.
- Returns:
embedding – Embedding of the nodes.
- Return type:
np.ndarray
- forward(adjacency: list | csr_matrix, features: csr_matrix | ndarray) ndarray [source]
Perform a forward pass on the graph and return the output.
- Parameters:
adjacency (Union[list, sparse.csr_matrix]) – Adjacency matrix or list of sampled adjacency matrices.
features (sparse.csr_matrix, np.ndarray) – Features, array of shape (n_nodes, n_features).
- Returns:
output – Output of the GNN.
- Return type:
np.ndarray
- get_params()
Get parameters as dictionary.
- Returns:
params – Parameters of the algorithm.
- Return type:
dict
- predict()
Return the predicted labels.
- predict_proba()
Return the probability distribution over labels.
- print_log(*args)
Fill log with text.
- set_params(params: dict) Algorithm
Set parameters of the algorithm.
- Parameters:
params (dict) – Parameters of the algorithm.
- Returns:
self
- Return type:
Algorithm
- transform()
Return the embedding of nodes.
Convolution layers
- class sknetwork.gnn.Convolution(layer_type: str, out_channels: int, activation: BaseActivation | str | None = 'Relu', use_bias: bool = True, normalization: str = 'both', self_embeddings: bool = True, sample_size: int | None = None, loss: BaseLoss | str | None = None)[source]
Graph convolutional layer.
Apply the following function to the embedding \(X\):
\(\sigma(\bar AXW + b)\),
where \(\bar A\) is the normalized adjacency matrix (possibly with inserted self-embeddings), \(W\), \(b\) are trainable parameters and \(\sigma\) is the activation function.
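For intuition, here is a minimal NumPy sketch of this update, assuming symmetric normalization with self-embeddings (the defaults) and a ReLu activation; all names are illustrative, not sknetwork's internal code:

# One convolution step: sigma(A_norm X W + b).
import numpy as np

def conv_layer(adjacency, features, weight, bias):
    a = adjacency + np.eye(adjacency.shape[0])          # insert self-embeddings
    inv_sqrt = 1 / np.sqrt(a.sum(axis=1))               # D^{-1/2} as a vector
    a_norm = inv_sqrt[:, None] * a * inv_sqrt[None, :]  # D^{-1/2} A D^{-1/2}
    embedding = a_norm @ features @ weight + bias       # \bar A X W + b
    return np.maximum(embedding, 0)                     # ReLu activation

rng = np.random.default_rng(0)
adjacency = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
output = conv_layer(adjacency, rng.normal(size=(3, 4)),
                    rng.normal(size=(4, 2)), np.zeros(2))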
- Parameters:
layer_type (str) – Layer type. Can be either 'Conv', convolutional operator as in [1], or 'Sage', as in [2].
out_channels (int) – Dimension of the output.
activation (str (default = 'Relu') or custom activation) – Activation function. If a string, can be either 'Identity', 'Relu', 'Sigmoid' or 'Softmax'.
use_bias (bool (default = True)) – If True, add a bias vector.
normalization (str (default = 'both')) – Normalization of the adjacency matrix for message passing. Can be either 'left' (left normalization by the degrees), 'right' (right normalization by the degrees), 'both' (symmetric normalization by the square root of degrees, default) or None (no normalization).
self_embeddings (bool (default = True)) – If True, consider the self-embedding in addition to the neighbor embeddings for each node of the graph.
sample_size (int (default = 25)) – Size of the neighborhood sampled for each node. Used only for the 'Sage' layer.
- Variables:
weight (np.ndarray) – Trainable weight matrix.
bias (np.ndarray) – Bias vector.
embedding (np.ndarray) – Embedding of the nodes (before activation).
output (np.ndarray) – Output of the layer (after activation).
References
[1] Kipf, T., & Welling, M. (2017). Semi-supervised Classification with Graph Convolutional Networks. 5th International Conference on Learning Representations.
[2] Hamilton, W., Ying, R., & Leskovec, J. (2017). Inductive Representation Learning on Large Graphs. NIPS.
- forward(adjacency: csr_matrix | ndarray, features: csr_matrix | ndarray) ndarray [source]
Compute graph convolution.
- Parameters:
adjacency – Adjacency matrix of the graph.
features (sparse.csr_matrix, np.ndarray) – Input feature of shape \((n, d)\) with \(n\) the number of nodes in the graph and \(d\) the size of feature space.
- Returns:
output – Output of the layer.
- Return type:
np.ndarray
Activation functions
- class sknetwork.gnn.BaseActivation(name: str = 'custom')[source]
Base class for activation functions.
- Parameters:
name (str) – Name of the activation function.
- static gradient(signal: ndarray, direction: ndarray) ndarray [source]
Gradient of the activation function.
- Parameters:
signal (np.ndarray, shape (n_samples, n_channels)) – Input signal.
direction (np.ndarray, shape (n_samples, n_channels)) – Direction where the gradient is taken.
- Returns:
gradient – Gradient.
- Return type:
np.ndarray, shape (n_samples, n_channels)
- class sknetwork.gnn.ReLu[source]
ReLu (Rectified Linear Unit) activation function:
\(\sigma(x) = \max(0, x)\)
- class sknetwork.gnn.Sigmoid[source]
Sigmoid activation function:
\(\sigma(x) = \frac{1}{1+e^{-x}}\) Also known as the logistic function.
- class sknetwork.gnn.Softmax[source]
Softmax activation function:
\(\sigma(x) = (\frac{e^{x_1}}{\sum_{i=1}^N e^{x_i}}, \ldots, \frac{e^{x_N}}{\sum_{i=1}^N e^{x_i}})\)
where \(N\) is the number of channels.
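In practice, softmax is computed in a numerically stable way by shifting the signal by its row maximum, which leaves the result unchanged. A sketch (illustrative, not sknetwork's internal code):

# Row-wise softmax, shifted by the row maximum for numerical stability.
import numpy as np

def softmax(signal):
    exp = np.exp(signal - signal.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)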
Loss functions
- class sknetwork.gnn.BaseLoss(name: str = 'custom')[source]
Base class for loss functions.
- static gradient(signal: ndarray, direction: ndarray) ndarray
Gradient of the activation function.
- Parameters:
signal (np.ndarray, shape (n_samples, n_channels)) – Input signal.
direction (np.ndarray, shape (n_samples, n_channels)) – Direction where the gradient is taken.
- Returns:
gradient – Gradient.
- Return type:
np.ndarray, shape (n_samples, n_channels)
- static loss(signal: ndarray, labels: ndarray) float [source]
Get the loss value.
- Parameters:
signal (np.ndarray, shape (n_samples, n_channels)) – Input signal (before activation).
labels (np.ndarray, shape (n_samples)) – True labels.
- static loss_gradient(signal: ndarray, labels: ndarray) ndarray [source]
Gradient of the loss function.
- Parameters:
signal (np.ndarray, shape (n_samples, n_channels)) – Input signal.
labels (np.ndarray, shape (n_samples,)) – True labels.
- Returns:
gradient – Gradient.
- Return type:
np.ndarray, shape (n_samples, n_channels)
- static output(signal: ndarray) ndarray
Output of the activation function.
- Parameters:
signal (np.ndarray, shape (n_samples, n_channels)) – Input signal.
- Returns:
output – Output signal.
- Return type:
np.ndarray, shape (n_samples, n_channels)
- class sknetwork.gnn.CrossEntropy[source]
Cross entropy loss with softmax activation.
For a single sample with value \(x\) and true label \(y\), the cross-entropy loss is:
\(-\sum_i 1_{\{y=i\}} \log (p_i)\)
with
\(p_i = e^{x_i} / \sum_j e^{x_j}\).
For \(n\) samples, return the average loss.
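A sketch of this computation from the raw signal (illustrative, not sknetwork's internal code):

# Average cross-entropy over n samples, from raw scores via softmax.
import numpy as np

def cross_entropy(signal, labels):
    exp = np.exp(signal - signal.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)           # p_i = e^{x_i} / sum_j e^{x_j}
    log_p = np.log(probs[np.arange(len(labels)), labels])  # log p_y for each sample
    return float(-log_p.mean())                            # average over samples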
- static gradient(signal: ndarray, direction: ndarray) ndarray
Gradient of the softmax function.
- static loss(signal: ndarray, labels: ndarray) float [source]
Get loss value.
- Parameters:
signal (np.ndarray, shape (n_samples, n_channels)) – Input signal (before activation). The number of channels must be at least 2.
labels (np.ndarray, shape (n_samples)) – True labels.
- Returns:
value – Loss value.
- Return type:
float
- static loss_gradient(signal: ndarray, labels: ndarray) ndarray [source]
Get the gradient of the loss function (including activation).
- Parameters:
signal (np.ndarray, shape (n_samples, n_channels)) – Input signal (before activation).
labels (np.ndarray, shape (n_samples)) – True labels.
- Returns:
gradient – Gradient of the loss function.
- Return type:
np.ndarray, shape (n_samples, n_channels)
- static output(signal: ndarray) ndarray
Output of the softmax function (rows sum to 1).
- class sknetwork.gnn.BinaryCrossEntropy[source]
Binary cross entropy loss with sigmoid activation.
For a single sample with true label \(y\) and predicted probability \(p\), the binary cross-entropy loss is:
\(-y \log (p) - (1-y) \log (1 - p).\)
For \(n\) samples, return the average loss.
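A sketch for a 1-D signal of raw scores (illustrative, not sknetwork's internal code):

# Average binary cross-entropy over n samples, with sigmoid activation.
import numpy as np

def binary_cross_entropy(signal, labels):
    probs = 1 / (1 + np.exp(-signal))   # sigmoid: p = 1 / (1 + e^{-x})
    losses = -labels * np.log(probs) - (1 - labels) * np.log(1 - probs)
    return float(losses.mean())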
- static gradient(signal: ndarray, direction: ndarray) ndarray
Gradient of the sigmoid function.
- static loss(signal: ndarray, labels: ndarray) float [source]
Get loss value.
- Parameters:
signal (np.ndarray, shape (n_samples, n_channels)) – Input signal (before activation). The number of channels must be at least 2.
labels (np.ndarray, shape (n_samples)) – True labels.
- Returns:
value – Loss value.
- Return type:
float
- static loss_gradient(signal: ndarray, labels: ndarray) ndarray [source]
Get the gradient of the loss function (including activation).
- Parameters:
signal (np.ndarray, shape (n_samples, n_channels)) – Input signal (before activation).
labels (np.ndarray, shape (n_samples)) – True labels.
- Returns:
gradient – Gradient of the loss function.
- Return type:
np.ndarray, shape (n_samples, n_channels)
- static output(signal: ndarray) ndarray
Output of the sigmoid function.
Optimizers
- class sknetwork.gnn.BaseOptimizer(learning_rate)[source]
Base class for optimizers.
- Parameters:
learning_rate (float (default = 0.01)) – Learning rate for updating weights.
- class sknetwork.gnn.ADAM(learning_rate: float = 0.01, beta1: float = 0.9, beta2: float = 0.999, eps: float = 1e-08)[source]
Adam optimizer.
- Parameters:
learning_rate (float (default = 0.01)) – Learning rate for updating weights.
beta1 (float (default = 0.9)) – Coefficient used for computing the running average of the gradients (first moment).
beta2 (float (default = 0.999)) – Coefficient used for computing the running average of the squared gradients (second moment).
eps (float (default = 1e-8)) – Term added to the denominator to improve stability.
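For reference, a sketch of one Adam update for a single parameter array, following the paper cited below (illustrative, not sknetwork's internal code):

# One Adam step: running averages of gradients and squared gradients,
# bias correction, then parameter update.
import numpy as np

def adam_step(param, grad, m, v, t, learning_rate=0.01,
              beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad      # first moment (average of gradients)
    v = beta2 * v + (1 - beta2) * grad**2   # second moment (average of squared gradients)
    m_hat = m / (1 - beta1**t)              # bias correction, t is the step count (from 1)
    v_hat = v / (1 - beta2**t)
    param = param - learning_rate * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v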
References
Kingma, D. P., & Ba, J. (2014). Adam: A Method for Stochastic Optimization. 3rd International Conference on Learning Representations.