# Embedding

Graph embedding algorithms.

The attribute embedding_ assigns a vector to each node of the graph.

## Spectral

class sknetwork.embedding.Spectral(n_components: int = 2, decomposition: str = 'rw', regularization: float = -1, normalized: bool = True)[source]

Spectral embedding of graphs, based the spectral decomposition of the Laplacian matrix $$L = D - A$$ or the transition matrix of the random walk $$P = D^{-1}A$$ (default), where $$D$$ is the diagonal matrix of degrees.

Eigenvectors are considered in increasing order (for the Laplacian matrix $$L$$) or decreasing order (for the transition matrix of the random walk $$P$$) of eigenvalues, skipping the first.

Parameters:
• n_components (int (default = 2)) – Dimension of the embedding space.

• decomposition (str (laplacian or rw, default = rw)) – Matrix used for the spectral decomposition.

• regularization (float (default = -1)) – Regularization factor $$\alpha$$ so that the adjacency matrix is $$A + \alpha \frac{11^T}{n}$$. If negative, regularization is applied only if the graph is disconnected; the regularization factor $$\alpha$$ is then set to the absolute value of the parameter.

• normalized (bool (default = True)) – If True, normalized the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

Variables:
• embedding (array, shape = (n, n_components)) – Embedding of the nodes.

• embedding_row (array, shape = (n_row, n_components)) – Embedding of the rows, for bipartite graphs.

• embedding_col (array, shape = (n_col, n_components)) – Embedding of the columns, for bipartite graphs.

• eigenvalues (array, shape = (n_components)) – Eigenvalues.

• eigenvectors (array, shape = (n, n_components)) – Eigenvectors.

Example

>>> from sknetwork.embedding import Spectral
>>> from sknetwork.data import karate_club
>>> spectral = Spectral(n_components=3)
>>> embedding.shape
(34, 3)


References

Belkin, M. & Niyogi, P. (2003). Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, Neural computation.

fit(input_matrix: csr_matrix | ndarray, force_bipartite: bool = False) [source]

Compute the graph embedding.

If the input matrix $$B$$ is not square (e.g., biadjacency matrix of a bipartite graph) or not symmetric (e.g., adjacency matrix of a directed graph), use the adjacency matrix

$$A = \begin{bmatrix} 0 & B \\ B^T & 0 \end{bmatrix}$$

and return the embedding for both rows and columns of the input matrix $$B$$.

Parameters:

• force_bipartite (bool (default = False)) – If True, force the input matrix to be considered as a biadjacency matrix.

Returns:

self

Return type:

Spectral

fit_transform(*args, **kwargs) ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns:

embedding – Embedding.

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

predict(columns: bool = False) ndarray

Return the embedding of nodes.

Parameters:

columns (bool) – If True, return the prediction for columns.

Returns:

embedding_ – Embedding of the nodes.

Return type:

np.ndarray

set_params(params: dict) Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm

transform() ndarray

Return the embedding.

Returns:

embedding – Embedding.

Return type:

np.ndarray

## SVD

class sknetwork.embedding.SVD(n_components=2, regularization: None | float = None, factor_singular: float = 0.0, normalized: bool = False, solver: str | SVDSolver = 'lanczos')[source]

Graph embedding by Singular Value Decomposition of the adjacency or biadjacency matrix of the graph.

Parameters:
• n_components (int) – Dimension of the embedding.

• regularization (None or float (default = None)) – Regularization factor $$\alpha$$ so that the matrix is $$A + \alpha \frac{11^T}{n}$$.

• factor_singular (float (default = 0.)) –

Power factor $$\alpha$$ applied to the singular values on right singular vectors. The embedding of rows and columns are respectively $$U \Sigma^{1-\alpha}$$ and $$V \Sigma^\alpha$$ where:

• $$U$$ is the matrix of left singular vectors, shape (n_row, n_components)

• $$V$$ is the matrix of right singular vectors, shape (n_col, n_components)

• $$\Sigma$$ is the diagonal matrix of singular values, shape (n_components, n_components)

• normalized (bool (default = False)) – If True, normalized the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• solver ('lanczos' (Lanczos algorithm, default) or SVDSolver (custom solver)) – Which solver to use.

Variables:
• embedding (array, shape = (n, n_components)) – Embedding of the nodes.

• embedding_row (array, shape = (n_row, n_components)) – Embedding of the rows, for bipartite graphs.

• embedding_col (array, shape = (n_col, n_components)) – Embedding of the columns, for bipartite graphs.

• singular_values (np.ndarray, shape = (n_components)) – Singular values.

• singular_vectors_left (np.ndarray, shape = (n_row, n_components)) – Left singular vectors.

• singular_vectors_right (np.ndarray, shape = (n_col, n_components)) – Right singular vectors.

Example

>>> from sknetwork.embedding import SVD
>>> from sknetwork.data import karate_club
>>> svd = SVD()
>>> embedding.shape
(34, 2)


References

Abdi, H. (2007). Singular value decomposition (SVD) and generalized singular value decomposition. Encyclopedia of measurement and statistics.

fit(input_matrix: csr_matrix | ndarray) GSVD

Compute the embedding of the graph.

Parameters:

Returns:

self

Return type:

GSVD

fit_transform(*args, **kwargs) ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns:

embedding – Embedding.

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters:

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns:

embedding_vectors – Embedding of the nodes.

Return type:

np.ndarray

set_params(params: dict) Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm

transform() ndarray

Return the embedding.

Returns:

embedding – Embedding.

Return type:

np.ndarray

## GSVD

class sknetwork.embedding.GSVD(n_components=2, regularization: None | float = None, factor_row: float = 0.5, factor_col: float = 0.5, factor_singular: float = 0.0, normalized: bool = True, solver: str | SVDSolver = 'lanczos')[source]

Graph embedding by Generalized Singular Value Decomposition of the adjacency or biadjacency matrix $$A$$. This is equivalent to the Singular Value Decomposition of the matrix $$D_1^{- \alpha_1}AD_2^{- \alpha_2}$$ where $$D_1, D_2$$ are the diagonal matrices of row weights and columns weights, respectively, and $$\alpha_1, \alpha_2$$ are parameters.

Parameters:
• n_components (int) – Dimension of the embedding.

• regularization (None or float (default = None)) – Regularization factor $$\alpha$$ so that the matrix is $$A + \alpha \frac{11^T}{n}$$.

• factor_row (float (default = 0.5)) – Power factor $$\alpha_1$$ applied to the diagonal matrix of row weights.

• factor_col (float (default = 0.5)) – Power factor $$\alpha_2$$ applied to the diagonal matrix of column weights.

• factor_singular (float (default = 0.)) –

Parameter $$\alpha$$ applied to the singular values on right singular vectors. The embedding of rows and columns are respectively $$D_1^{- \alpha_1}U \Sigma^{1-\alpha}$$ and $$D_2^{- \alpha_2}V \Sigma^\alpha$$ where:

• $$U$$ is the matrix of left singular vectors, shape (n_row, n_components)

• $$V$$ is the matrix of right singular vectors, shape (n_col, n_components)

• $$\Sigma$$ is the diagonal matrix of singular values, shape (n_components, n_components)

• normalized (bool (default = True)) – If True, normalized the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• solver ('lanczos' (Lanczos algorithm, default) or SVDSolver (custom solver)) – Which solver to use.

Variables:
• embedding (array, shape = (n, n_components)) – Embedding of the nodes.

• embedding_row (array, shape = (n_row, n_components)) – Embedding of the rows, for bipartite graphs.

• embedding_col (array, shape = (n_col, n_components)) – Embedding of the columns, for bipartite graphs.

• singular_values (np.ndarray, shape = (n_components)) – Singular values.

• singular_vectors_left (np.ndarray, shape = (n_row, n_components)) – Left singular vectors.

• singular_vectors_right (np.ndarray, shape = (n_col, n_components)) – Right singular vectors.

• weights_col (np.ndarray, shape = (n2)) – Weights applied to columns.

Example

>>> from sknetwork.embedding import GSVD
>>> from sknetwork.data import karate_club
>>> gsvd = GSVD()
>>> embedding.shape
(34, 2)


References

Abdi, H. (2007). Singular value decomposition (SVD) and generalized singular value decomposition. Encyclopedia of measurement and statistics, 907-912.

fit(input_matrix: csr_matrix | ndarray) GSVD[source]

Compute the embedding of the graph.

Parameters:

Returns:

self

Return type:

GSVD

fit_transform(*args, **kwargs) ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns:

embedding – Embedding.

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters:

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns:

embedding_vectors – Embedding of the nodes.

Return type:

np.ndarray

set_params(params: dict) Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm

transform() ndarray

Return the embedding.

Returns:

embedding – Embedding.

Return type:

np.ndarray

## PCA

class sknetwork.embedding.PCA(n_components=2, normalized: bool = False, solver: str | SVDSolver = 'lanczos')[source]

Graph embedding by Principal Component Analysis of the adjacency or biadjacency matrix.

Parameters:
• n_components (int) – Dimension of the embedding.

• normalized (bool (default = False)) – If True, normalized the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• solver ('lanczos' (Lanczos algorithm, default) or SVDSolver (custom solver)) – Which solver to use.

Variables:
• embedding (array, shape = (n, n_components)) – Embedding of the nodes.

• embedding_row (array, shape = (n_row, n_components)) – Embedding of the rows, for bipartite graphs.

• embedding_col (array, shape = (n_col, n_components)) – Embedding of the columns, for bipartite graphs.

• singular_values (np.ndarray, shape = (n_components)) – Singular values.

• singular_vectors_left (np.ndarray, shape = (n_row, n_components)) – Left singular vectors.

• singular_vectors_right (np.ndarray, shape = (n_col, n_components)) – Right singular vectors.

Example

>>> from sknetwork.embedding import PCA
>>> from sknetwork.data import karate_club
>>> pca = PCA()
>>> embedding.shape
(34, 2)


References

Jolliffe, I.T. (2002). Principal Component Analysis Series: Springer Series in Statistics.

Compute the embedding of the graph.

Parameters:

Returns:

self

Return type:

PCA

fit_transform(*args, **kwargs) ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns:

embedding – Embedding.

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters:

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns:

embedding_vectors – Embedding of the nodes.

Return type:

np.ndarray

set_params(params: dict) Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm

transform() ndarray

Return the embedding.

Returns:

embedding – Embedding.

Return type:

np.ndarray

## Random Projection

class sknetwork.embedding.RandomProjection(n_components: int = 2, alpha: float = 0.5, n_iter: int = 3, random_walk: bool = False, regularization: float = -1, normalized: bool = True, random_state: int | None = None)[source]

Embedding of graphs based the random projection of the adjacency matrix:

$$(I + \alpha A +... + (\alpha A)^K)G$$

where $$A$$ is the adjacency matrix, $$G$$ is a random Gaussian matrix, $$\alpha$$ is some smoothing factor and $$K$$ some non-negative integer.

Parameters:
• n_components (int (default = 2)) – Dimension of the embedding space.

• alpha (float (default = 0.5)) – Smoothing parameter.

• n_iter (int (default = 3)) – Number of power iterations of the adjacency matrix.

• random_walk (bool (default = False)) – If True, use the transition matrix of the random walk, $$P = D^{-1}A$$, instead of the adjacency matrix.

• regularization (float (default = -1)) – Regularization factor $$\alpha$$ so that the matrix is $$A + \alpha \frac{11^T}{n}$$. If negative, regularization is applied only if the graph is disconnected (and then equal to the absolute value of the parameter).

• normalized (bool (default = True)) – If True, normalize the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• random_state (int, optional) – Seed used by the random number generator.

Variables:
• embedding (array, shape = (n, n_components)) – Embedding of the nodes.

• embedding_row (array, shape = (n_row, n_components)) – Embedding of the rows, for bipartite graphs.

• embedding_col (array, shape = (n_col, n_components)) – Embedding of the columns, for bipartite graphs.

Example

>>> from sknetwork.embedding import RandomProjection
>>> from sknetwork.data import karate_club
>>> projection = RandomProjection()
>>> embedding.shape
(34, 2)


References

Zhang, Z., Cui, P., Li, H., Wang, X., & Zhu, W. (2018). Billion-scale network embedding with iterative random projection, ICDM.

fit(input_matrix: csr_matrix | ndarray, force_bipartite: bool = False) [source]

Compute the graph embedding.

Parameters:
• input_matrix (sparse.csr_matrix, np.ndarray) – Adjacency matrix or biadjacency matrix of the graph.

• force_bipartite (bool (default = False)) – If True, force the input matrix to be considered as a biadjacency matrix.

Returns:

self

Return type:

RandomProjection

fit_transform(*args, **kwargs) ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns:

embedding – Embedding.

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

predict(columns: bool = False) ndarray

Return the embedding of nodes.

Parameters:

columns (bool) – If True, return the prediction for columns.

Returns:

embedding_ – Embedding of the nodes.

Return type:

np.ndarray

set_params(params: dict) Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm

transform() ndarray

Return the embedding.

Returns:

embedding – Embedding.

Return type:

np.ndarray

## Louvain

class sknetwork.embedding.LouvainEmbedding(resolution: float = 1, modularity: str = 'Dugue', tol_optimization: float = 0.001, tol_aggregation: float = 0.001, n_aggregations: int = -1, shuffle_nodes: bool = False, random_state: RandomState | int | None = None, isolated_nodes: str = 'remove')[source]

Embedding of graphs induced by Louvain clustering. Each component of the embedding corresponds to a cluster obtained by Louvain.

Parameters:
• resolution (float) – Resolution parameter.

• modularity (str) – Which objective function to maximize. Can be 'Dugue', 'Newman' or 'Potts'.

• tol_optimization – Minimum increase in the objective function to enter a new optimization pass.

• tol_aggregation – Minimum increase in the objective function to enter a new aggregation pass.

• n_aggregations – Maximum number of aggregations. A negative value is interpreted as no limit.

• shuffle_nodes – Enables node shuffling before optimization.

• random_state – Random number generator or random seed. If None, numpy.random is used.

• isolated_nodes (str) – What to do with isolated column nodes. Can be 'remove' (default), 'merge' or 'keep'.

Variables:
• embedding (array, shape = (n, n_components)) – Embedding of the nodes.

• embedding_row (array, shape = (n_row, n_components)) – Embedding of the rows, for bipartite graphs.

• embedding_col (array, shape = (n_col, n_components)) – Embedding of the columns, for bipartite graphs.

• labels_row (np.ndarray) – Labels of the rows (used to build the embedding of the columns).

• labels_col (np.ndarray) – Labels of the columns (used to build the embedding of the rows).

Example

>>> from sknetwork.embedding import LouvainEmbedding
>>> from sknetwork.data import house
>>> louvain = LouvainEmbedding()
>>> embedding.shape
(5, 2)

fit(input_matrix: csr_matrix, force_bipartite: bool = False)[source]

Embedding of graphs from the clustering obtained with Louvain.

Parameters:

• force_bipartite (bool (default = False)) – If True, force the input matrix to be considered as a biadjacency matrix.

Returns:

self

Return type:

BiLouvainEmbedding

fit_transform(*args, **kwargs) ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns:

embedding – Embedding.

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

predict(columns: bool = False) ndarray

Return the embedding of nodes.

Parameters:

columns (bool) – If True, return the prediction for columns.

Returns:

embedding_ – Embedding of the nodes.

Return type:

np.ndarray

set_params(params: dict) Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm

transform() ndarray

Return the embedding.

Returns:

embedding – Embedding.

Return type:

np.ndarray

## Force Atlas

class sknetwork.embedding.ForceAtlas(n_components: int = 2, n_iter: int = 50, approx_radius: float = -1, lin_log: bool = False, gravity_factor: float = 0.01, repulsive_factor: float = 0.1, tolerance: float = 0.1, speed: float = 0.1, speed_max: float = 10)[source]

Force Atlas layout for displaying graphs.

Parameters:
• n_components (int) – Dimension of the graph layout.

• n_iter (int) – Number of iterations to update positions. If None, use the value of self.n_iter.

• approx_radius (float) – If a positive value is provided, only the nodes within this distance a given node are used to compute the repulsive force.

• lin_log (bool) – If True, use lin-log mode.

• gravity_factor (float) – Gravity force scaling constant.

• repulsive_factor (float) – Repulsive force scaling constant.

• tolerance (float) – Tolerance defined in the swinging constant.

• speed (float) – Speed constant.

• speed_max (float) – Constant used to impose constrain on speed.

Variables:

embedding (np.ndarray) – Layout in given dimension.

Example

>>> from sknetwork.embedding.force_atlas import ForceAtlas
>>> from sknetwork.data import karate_club
>>> force_atlas = ForceAtlas()
>>> embedding.shape
(34, 2)


References

Jacomy M., Venturini T., Heymann S., Bastian M. (2014). ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software. Plos One.

fit(adjacency: csr_matrix | ndarray, pos_init: ndarray | None = None, n_iter: int | None = None) [source]

Compute layout.

Parameters:

• n_iter (int) – Number of iterations to update positions. If None, use the value of self.n_iter.

Returns:

self

Return type:

ForceAtlas

fit_transform(*args, **kwargs) ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns:

embedding – Embedding.

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

predict(columns: bool = False) ndarray

Return the embedding of nodes.

Parameters:

columns (bool) – If True, return the prediction for columns.

Returns:

embedding_ – Embedding of the nodes.

Return type:

np.ndarray

set_params(params: dict) Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm

transform() ndarray

Return the embedding.

Returns:

embedding – Embedding.

Return type:

np.ndarray

## Spring

class sknetwork.embedding.Spring(n_components: int = 2, strength: float | None = None, n_iter: int = 50, tol: float = 0.0001, approx_radius: float = -1, position_init: str = 'random')[source]

Spring layout for displaying small graphs.

Parameters:
• n_components (int) – Dimension of the graph layout.

• strength (float) – Intensity of the force that moves the nodes.

• n_iter (int) – Number of iterations to update positions.

• tol (float) – Minimum relative change in positions to continue updating.

• approx_radius (float) – If a positive value is provided, only the nodes within this distance a given node are used to compute the repulsive force.

• position_init (str) – How to initialize the layout. If ‘spectral’, use Spectral embedding in dimension 2, otherwise, use random initialization.

Variables:

embedding (np.ndarray) – Layout.

Example

>>> from sknetwork.embedding import Spring
>>> from sknetwork.data import karate_club
>>> spring = Spring()
>>> embedding.shape
(34, 2)


Notes

Simple implementation designed to display small graphs.

References

Fruchterman, T. M. J., Reingold, E. M. (1991). Graph Drawing by Force-Directed Placement. Software – Practice & Experience.

fit(adjacency: csr_matrix | ndarray, position_init: ndarray | None = None, n_iter: int | None = None) [source]

Compute layout.

Parameters:

• position_init (np.ndarray) – Custom initial positions of the nodes. Shape must be (n, 2). If None, use the value of self.pos_init.

• n_iter (int) – Number of iterations to update positions. If None, use the value of self.n_iter.

Returns:

self

Return type:

Spring

fit_transform(*args, **kwargs) ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns:

embedding – Embedding.

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters:

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns:

embedding_vectors – Embedding of the nodes.

Return type:

np.ndarray

set_params(params: dict) Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm

transform() ndarray

Return the embedding.

Returns:

embedding – Embedding.

Return type:

np.ndarray