# Embedding¶

Graph embedding algorithms.

The attribute embedding_ assigns a vector to each node of the graph.

## Spectral¶

class sknetwork.embedding.Spectral(n_components: int = 2, regularization: Union[None, float] = 0.01, relative_regularization: bool = True, scaling: float = 0.5, normalized: bool = True, solver: str = 'auto')[source]

Spectral embedding of graphs, based the spectral decomposition of the transition matrix $$P = D^{-1}A$$. Eigenvectors are considered in decreasing order of eigenvalues, skipping the first eigenvector.

• Graphs

See BiSpectral for digraphs and bigraphs.

Parameters
• n_components (int (default = 2)) – Dimension of the embedding space.

• regularization (None or float (default = 0.01)) – Add edges of given weight between all pairs of nodes.

• relative_regularization (bool (default = True)) – If True, consider the regularization as relative to the total weight of the graph.

• scaling (float (non-negative, default = 0.5)) – Scaling factor $$\alpha$$ so that each component is divided by $$(1 - \lambda)^\alpha$$, with $$\lambda$$ the corresponding eigenvalue of the transition matrix $$P$$. Require regularization if positive and the graph is not connected. The default value $$\alpha=\frac 1 2$$ equalizes the energy levels of the corresponding mechanical system.

• normalized (bool (default = True)) – If True, normalize the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• solver ('auto', 'halko', 'lanczos' (default = 'auto')) –

Which eigenvalue solver to use.

• 'auto' call the auto_solver function.

• 'halko': randomized method, fast but less accurate than 'lanczos' for ill-conditioned matrices.

• 'lanczos': power-iteration based method.

Variables
• embedding_ (array, shape = (n, n_components)) – Embedding of the nodes.

• eigenvalues_ (array, shape = (n_components)) – Eigenvalues in decreasing order (first eigenvalue ignored).

• eigenvectors_ (array, shape = (n, n_components)) – Corresponding eigenvectors.

• regularization_ (None or float) – Regularization factor added to all pairs of nodes.

Example

>>> from sknetwork.embedding import Spectral
>>> from sknetwork.data import karate_club
>>> spectral = Spectral()
>>> adjacency = karate_club()
>>> embedding = spectral.fit_transform(adjacency)
>>> embedding.shape
(34, 2)


References

Belkin, M. & Niyogi, P. (2003). Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, Neural computation.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.embedding.spectral.Spectral[source]

Compute the graph embedding.

Parameters

adjacency – Adjacency matrix of the graph (symmetric matrix).

Returns

self

Return type

Spectral

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

predict(adjacency_vectors: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])numpy.ndarray[source]

Predict the embedding of new nodes, defined by their adjacency vectors.

Parameters

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns

embedding_vectors – Embedding of the nodes.

Return type

np.ndarray

class sknetwork.embedding.BiSpectral(n_components: int = 2, regularization: Union[None, float] = 0.01, relative_regularization: bool = True, scaling: float = 0.5, normalized: bool = True, solver: str = 'auto')[source]

Spectral embedding of bipartite and directed graphs, based the spectral embedding of the corresponding undirected graph, with adjacency matrix:

$$A = \begin{bmatrix} 0 & B \\ B^T & 0 \end{bmatrix}$$

where $$B$$ is the biadjacency matrix of the bipartite graph or the adjacency matrix of the directed graph.

• Digraphs

• Bigraphs

Parameters
• n_components (int (default = 2)) – Dimension of the embedding space.

• regularization (None or float (default = 0.01)) – Add edges of given weight between all pairs of nodes.

• relative_regularization (bool (default = True)) – If True, consider the regularization as relative to the total weight of the graph.

• scaling (float (non-negative, default = 0.5)) – Scaling factor $$\alpha$$ so that each component is divided by $$(1 - \lambda)^\alpha$$, with $$\lambda$$ the corresponding eigenvalue of the transition matrix $$P$$. Require regularization if positive and the graph is not connected. The default value $$\alpha=\frac 1 2$$ equalizes the energy levels of the corresponding mechanical system.

• normalized (bool (default = True)) – If True, normalized the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• solver ('auto', 'halko', 'lanczos' (default = 'auto')) –

Which eigenvalue solver to use.

• 'auto' call the auto_solver function.

• 'halko': randomized method, fast but less accurate than 'lanczos' for ill-conditioned matrices.

• 'lanczos': power-iteration based method.

Variables
• embedding_ (array, shape = (n_row, n_components)) – Embedding of the rows.

• embedding_row_ (array, shape = (n_row, n_components)) – Embedding of the rows (copy of embedding_).

• embedding_col_ (array, shape = (n_col, n_components)) – Embedding of the columns.

• eigenvalues_ (array, shape = (n_components)) – Eigenvalues in increasing order (first eigenvalue ignored).

• eigenvectors_ (array, shape = (n, n_components)) – Corresponding eigenvectors.

• regularization_ (None or float) – Regularization factor added to all pairs of nodes.

Example

>>> from sknetwork.embedding import BiSpectral
>>> from sknetwork.data import movie_actor
>>> bispectral = BiSpectral()
>>> biadjacency = movie_actor()
>>> embedding = bispectral.fit_transform(biadjacency)
>>> embedding.shape
(15, 2)


References

Belkin, M. & Niyogi, P. (2003). Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, Neural computation.

fit(biadjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.embedding.spectral.BiSpectral[source]

Spectral embedding of the bipartite graph considered as undirected, with adjacency matrix:

$$A = \begin{bmatrix} 0 & B \\ B^T & 0 \end{bmatrix}$$

where $$B$$ is the biadjacency matrix (or the adjacency matrix of a directed graph).

Parameters

biadjacency – Biadjacency matrix of the graph.

Returns

self

Return type

BiSpectral

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

predict(adjacency_vectors: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])numpy.ndarray[source]

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns

embedding_vectors – Embedding of the nodes.

Return type

np.ndarray

class sknetwork.embedding.LaplacianEmbedding(n_components: int = 2, regularization: Union[None, float] = 0.01, relative_regularization: bool = True, scaling: float = 0.5, normalized: bool = True, solver: str = 'auto')[source]

Spectral embedding of graphs, based the spectral decomposition of the Laplacian matrix $$L = D - A$$. Eigenvectors are considered in increasing order of eigenvalues, skipping the first eigenvector.

• Graphs

Parameters
• n_components (int (default = 2)) – Dimension of the embedding space.

• regularization (None or float (default = 0.01)) – Add edges of given weight between all pairs of nodes.

• relative_regularization (bool (default = True)) – If True, consider the regularization as relative to the total weight of the graph.

• scaling (float (non-negative, default = 0.5)) – Scaling factor $$\alpha$$ so that each component is divided by $$\lambda^\alpha$$, with $$\lambda$$ the corresponding eigenvalue of the Laplacian matrix $$L$$. Require regularization if positive and the graph is not connected. The default value $$\alpha=\frac 1 2$$ equalizes the energy levels of the corresponding mechanical system.

• normalized (bool (default = True)) – If True, normalize the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• solver ('auto', 'halko', 'lanczos' (default = 'auto')) –

Which eigenvalue solver to use.

• 'auto' call the auto_solver function.

• 'halko': randomized method, fast but less accurate than 'lanczos' for ill-conditioned matrices.

• 'lanczos': power-iteration based method.

Variables
• embedding_ (array, shape = (n, n_components)) – Embedding of the nodes.

• eigenvalues_ (array, shape = (n_components)) – Eigenvalues in increasing order (first eigenvalue ignored).

• eigenvectors_ (array, shape = (n, n_components)) – Corresponding eigenvectors.

• regularization_ (None or float) – Regularization factor added to all pairs of nodes.

Example

>>> from sknetwork.embedding import LaplacianEmbedding
>>> from sknetwork.data import karate_club
>>> laplacian = LaplacianEmbedding()
>>> adjacency = karate_club()
>>> embedding = laplacian.fit_transform(adjacency)
>>> embedding.shape
(34, 2)


References

Belkin, M. & Niyogi, P. (2003). Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, Neural computation.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.embedding.spectral.LaplacianEmbedding[source]

Compute the graph embedding.

Parameters

adjacency – Adjacency matrix of the graph (symmetric matrix).

Returns

self

Return type

LaplacianEmbedding

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

## SVD¶

class sknetwork.embedding.SVD(n_components=2, regularization: Optional[float] = None, relative_regularization: bool = True, factor_singular: float = 0.0, normalized: bool = False, solver: Union[str, sknetwork.linalg.svd_solver.SVDSolver] = 'auto')[source]

Graph embedding by Singular Value Decomposition of the adjacency or biadjacency matrix.

• Graphs

• Digraphs

• Bigraphs

Parameters
• n_components (int) – Dimension of the embedding.

• regularization (None or float (default = None)) – Implicitly add edges of given weight between all pairs of nodes.

• relative_regularization (bool (default = True)) – If True, consider the regularization as relative to the total weight of the graph.

• factor_singular (float (default = 0.)) –

Power factor $$\alpha$$ applied to the singular values on right singular vectors. The embedding of rows and columns are respectively $$U \Sigma^{1-\alpha}$$ and $$V \Sigma^\alpha$$ where:

• $$U$$ is the matrix of left singular vectors, shape (n_row, n_components)

• $$V$$ is the matrix of right singular vectors, shape (n_col, n_components)

• $$\Sigma$$ is the diagonal matrix of singular values, shape (n_components, n_components)

• normalized (bool (default = False)) – If True, normalized the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• solver ('auto', 'halko', 'lanczos' or SVDSolver) –

Which singular value solver to use.

• 'auto': call the auto_solver function.

• 'halko': randomized method, fast but less accurate than 'lanczos' for ill-conditioned matrices.

• 'lanczos': power-iteration based method.

• SVDSolver: custom solver.

Variables
• embedding_ (np.ndarray, shape = (n_row, n_components)) – Embedding of the rows.

• embedding_row_ (np.ndarray, shape = (n_row, n_components)) – Embedding of the rows (copy of embedding_).

• embedding_col_ (np.ndarray, shape = (n_col, n_components)) – Embedding of the columns.

• singular_values_ (np.ndarray, shape = (n_components)) – Singular values.

• singular_vectors_left_ (np.ndarray, shape = (n_row, n_components)) – Left singular vectors.

• singular_vectors_right_ (np.ndarray, shape = (n_col, n_components)) – Right singular vectors.

• regularization_ (None or float) – Regularization factor added to all pairs of nodes.

Example

>>> from sknetwork.embedding import SVD
>>> from sknetwork.data import karate_club
>>> svd = SVD()
>>> adjacency = karate_club()
>>> embedding = svd.fit_transform(adjacency)
>>> embedding.shape
(34, 2)


References

Abdi, H. (2007). Singular value decomposition (SVD) and generalized singular value decomposition. Encyclopedia of measurement and statistics.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.embedding.svd.GSVD

Compute the embedding of the graph.

Parameters

adjacency – Adjacency or biadjacency matrix of the graph.

Returns

self

Return type

GSVD

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

predict(adjacency_vectors: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])numpy.ndarray

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns

embedding_vectors – Embedding of the nodes.

Return type

np.ndarray

## GSVD¶

class sknetwork.embedding.GSVD(n_components=2, regularization: Optional[float] = None, relative_regularization: bool = True, factor_row: float = 0.5, factor_col: float = 0.5, factor_singular: float = 0.0, normalized: bool = True, solver: Union[str, sknetwork.linalg.svd_solver.SVDSolver] = 'auto')[source]

Graph embedding by Generalized Singular Value Decomposition of the adjacency or biadjacency matrix $$A$$. This is equivalent to the Singular Value Decomposition of the matrix $$D_1^{- \alpha_1}AD_2^{- \alpha_2}$$ where $$D_1, D_2$$ are the diagonal matrices of row weights and columns weights, respectively, and $$\alpha_1, \alpha_2$$ are parameters.

• Graphs

• Digraphs

• Bigraphs

Parameters
• n_components (int) – Dimension of the embedding.

• regularization (None or float (default = None)) – Implicitly add edges of given weight between all pairs of nodes.

• relative_regularization (bool (default = True)) – If True, consider the regularization as relative to the total weight of the graph.

• factor_row (float (default = 0.5)) – Power factor $$\alpha_1$$ applied to the diagonal matrix of row weights.

• factor_col (float (default = 0.5)) – Power factor $$\alpha_2$$ applied to the diagonal matrix of column weights.

• factor_singular (float (default = 0.)) –

Parameter $$\alpha$$ applied to the singular values on right singular vectors. The embedding of rows and columns are respectively $$D_1^{- \alpha_1}U \Sigma^{1-\alpha}$$ and $$D_2^{- \alpha_2}V \Sigma^\alpha$$ where:

• $$U$$ is the matrix of left singular vectors, shape (n_row, n_components)

• $$V$$ is the matrix of right singular vectors, shape (n_col, n_components)

• $$\Sigma$$ is the diagonal matrix of singular values, shape (n_components, n_components)

• normalized (bool (default = True)) – If True, normalized the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• solver ('auto', 'halko', 'lanczos' or SVDSolver) –

Which singular value solver to use.

• 'auto': call the auto_solver function.

• 'halko': randomized method, fast but less accurate than 'lanczos' for ill-conditioned matrices.

• 'lanczos': power-iteration based method.

• SVDSolver: custom solver.

Variables
• embedding_ (np.ndarray, shape = (n1, n_components)) – Embedding of the rows.

• embedding_row_ (np.ndarray, shape = (n1, n_components)) – Embedding of the rows (copy of embedding_).

• embedding_col_ (np.ndarray, shape = (n2, n_components)) – Embedding of the columns.

• singular_values_ (np.ndarray, shape = (n_components)) – Singular values.

• singular_vectors_left_ (np.ndarray, shape = (n_row, n_components)) – Left singular vectors.

• singular_vectors_right_ (np.ndarray, shape = (n_col, n_components)) – Right singular vectors.

• regularization_ (None or float) – Regularization factor added to all pairs of nodes.

• weights_col_ (np.ndarray, shape = (n2)) – Weights applied to columns.

Example

>>> from sknetwork.embedding import GSVD
>>> from sknetwork.data import karate_club
>>> gsvd = GSVD()
>>> adjacency = karate_club()
>>> embedding = gsvd.fit_transform(adjacency)
>>> embedding.shape
(34, 2)


References

Abdi, H. (2007). Singular value decomposition (SVD) and generalized singular value decomposition. Encyclopedia of measurement and statistics, 907-912.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.embedding.svd.GSVD[source]

Compute the embedding of the graph.

Parameters

adjacency – Adjacency or biadjacency matrix of the graph.

Returns

self

Return type

GSVD

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

predict(adjacency_vectors: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])numpy.ndarray[source]

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns

embedding_vectors – Embedding of the nodes.

Return type

np.ndarray

## PCA¶

class sknetwork.embedding.PCA(n_components=2, normalized: bool = False, solver: Union[str, sknetwork.linalg.svd_solver.SVDSolver] = 'auto')[source]

Graph embedding by Principal Component Analysis of the adjacency or biadjacency matrix.

• Graphs

• Digraphs

• Bigraphs

Parameters
• n_components (int) – Dimension of the embedding.

• normalized (bool (default = False)) – If True, normalized the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• solver ('auto', 'halko', 'lanczos' or SVDSolver) –

Which singular value solver to use.

• 'auto': call the auto_solver function.

• 'halko': randomized method, fast but less accurate than 'lanczos' for ill-conditioned matrices.

• 'lanczos': power-iteration based method.

• SVDSolver: custom solver.

Variables
• embedding_ (np.ndarray, shape = (n_row, n_components)) – Embedding of the rows.

• embedding_row_ (np.ndarray, shape = (n_row, n_components)) – Embedding of the rows (copy of embedding_).

• embedding_col_ (np.ndarray, shape = (n_col, n_components)) – Embedding of the columns.

• singular_values_ (np.ndarray, shape = (n_components)) – Singular values.

• singular_vectors_left_ (np.ndarray, shape = (n_row, n_components)) – Left singular vectors.

• singular_vectors_right_ (np.ndarray, shape = (n_col, n_components)) – Right singular vectors.

• regularization_ (None or float) – Regularization factor added to all pairs of nodes.

Example

>>> from sknetwork.embedding import PCA
>>> from sknetwork.data import karate_club
>>> pca = PCA()
>>> adjacency = karate_club()
>>> embedding = pca.fit_transform(adjacency)
>>> embedding.shape
(34, 2)


References

Jolliffe, I.T. (2002). Principal Component Analysis Series: Springer Series in Statistics.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.embedding.svd.PCA[source]

Compute the embedding of the graph.

Parameters

adjacency – Adjacency or biadjacency matrix of the graph.

Returns

self

Return type

PCA

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

predict(adjacency_vectors: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])numpy.ndarray

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns

embedding_vectors – Embedding of the nodes.

Return type

np.ndarray

## Random Projection¶

class sknetwork.embedding.RandomProjection(n_components: int = 2, alpha: float = 0.5, n_iter: int = 3, random_walk: bool = False, normalized: bool = True, random_state: Optional[int] = None)[source]

Embedding of graphs based the random projection of the adjacency matrix:

$$(I + \alpha A +... + (\alpha A)^K)G$$

where $$A$$ is the adjacency matrix, $$G$$ is a random Gaussian matrix, $$\alpha$$ is some smoothing factor and $$K$$ non-negative integer.

• Graphs

• Digraphs

Parameters
• n_components (int (default = 2)) – Dimension of the embedding space.

• alpha (float (default = 0.5)) – Smoothing parameter.

• n_iter (int (default = 3)) – Number of power iterations of the adjacency matrix.

• random_walk (bool (default = False)) – If True, use the transition matrix of the random walk, $$P = D^{-1}A$$, instead of the adjacency matrix.

• normalized (bool (default = True)) – If True, normalize the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

• random_state (int, optional) – Seed used by the random number generator.

Variables

embedding_ (array, shape = (n, n_components)) – Embedding of the nodes.

Example

>>> from sknetwork.embedding import RandomProjection
>>> from sknetwork.data import karate_club
>>> projection = RandomProjection()
>>> adjacency = karate_club()
>>> embedding = projection.fit_transform(adjacency)
>>> embedding.shape
(34, 2)


References

Zhang, Z., Cui, P., Li, H., Wang, X., & Zhu, W. (2018). Billion-scale network embedding with iterative random projection, ICDM.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.embedding.random_projection.RandomProjection[source]

Compute the graph embedding.

Parameters

adjacency – Adjacency matrix of the graph.

Returns

self

Return type

RandomProjection

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

class sknetwork.embedding.BiRandomProjection(n_components: int = 2, alpha: float = 0.5, n_iter: int = 3, random_walk: bool = False, normalized: bool = True)[source]

Embedding of bipartite graphs, based the random projection of the corresponding adjacency matrix:

$$A = \begin{bmatrix} 0 & B \\ B^T & 0 \end{bmatrix}$$

where $$B$$ is the biadjacency matrix of the bipartite graph.

• Bigraphs

Parameters
• n_components (int (default = 2)) – Dimension of the embedding space.

• alpha (float (default = 0.5)) – Smoothing parameter.

• n_iter (int (default = 3)) – Number of power iterations of the adjacency matrix.

• random_walk (bool (default = False)) – If True, use the transition matrix of the random walk, $$P = D^{-1}A$$, instead of the adjacency matrix.

• normalized (bool (default = True)) – If True, normalized the embedding so that each vector has norm 1 in the embedding space, i.e., each vector lies on the unit sphere.

Variables
• embedding_ (array, shape = (n_row, n_components)) – Embedding of the rows.

• embedding_row_ (array, shape = (n_row, n_components)) – Embedding of the rows (copy of embedding_).

• embedding_col_ (array, shape = (n_col, n_components)) – Embedding of the columns.

Example

>>> from sknetwork.embedding import BiRandomProjection
>>> from sknetwork.data import movie_actor
>>> biprojection = BiRandomProjection()
>>> biadjacency = movie_actor()
>>> embedding = biprojection.fit_transform(biadjacency)
>>> embedding.shape
(15, 2)


References

Zhang, Z., Cui, P., Li, H., Wang, X., & Zhu, W. (2018). Billion-scale network embedding with iterative random projection, ICDM.

fit(biadjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.embedding.random_projection.BiRandomProjection[source]

Compute the embedding.

Parameters

biadjacency – Biadjacency matrix of the graph.

Returns

self

Return type

BiRandomProjection

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

## Louvain¶

class sknetwork.embedding.LouvainEmbedding(resolution: float = 1, modularity: str = 'dugue', tol_optimization: float = 0.001, tol_aggregation: float = 0.001, n_aggregations: int = - 1, shuffle_nodes: bool = False, random_state: Optional[Union[numpy.random.mtrand.RandomState, int]] = None, isolated_nodes: str = 'remove')[source]

Embedding of graphs induced by Louvain clustering. Each component of the embedding corresponds to a cluster obtained by Louvain.

Parameters
• resolution (float) – Resolution parameter.

• modularity (str) – Which objective function to maximize. Can be 'dugue', 'newman' or 'potts'.

• tol_optimization – Minimum increase in the objective function to enter a new optimization pass.

• tol_aggregation – Minimum increase in the objective function to enter a new aggregation pass.

• n_aggregations – Maximum number of aggregations. A negative value is interpreted as no limit.

• shuffle_nodes – Enables node shuffling before optimization.

• random_state – Random number generator or random seed. If None, numpy.random is used.

• isolated_nodes (str) – What to do with isolated nodes. Can be 'remove' (default), 'merge' or 'keep'.

Variables

embedding_ (array, shape = (n, n_components)) – Embedding of the nodes.

Example

>>> from sknetwork.embedding import LouvainEmbedding
>>> from sknetwork.data import karate_club
>>> louvain = LouvainEmbedding()
>>> adjacency = karate_club()
>>> embedding = louvain.fit_transform(adjacency)
>>> embedding.shape
(34, 4)

fit(adjacency: scipy.sparse.csr.csr_matrix)[source]

Embedding of bipartite graphs from a clustering obtained with Louvain.

Parameters

adjacency – Adjacency matrix of the graph.

Returns

self

Return type

BiLouvainEmbedding

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

predict(adjacency_vectors: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])numpy.ndarray[source]

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters

adjacency_vectors – Adjacency vectors of rows. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns

embedding_vectors – Embedding of the nodes.

Return type

np.ndarray

class sknetwork.embedding.BiLouvainEmbedding(resolution: float = 1, modularity: str = 'dugue', tol_optimization: float = 0.001, tol_aggregation: float = 0.001, n_aggregations: int = - 1, shuffle_nodes: bool = False, random_state: Optional[Union[numpy.random.mtrand.RandomState, int]] = None, isolated_nodes: str = 'remove')[source]

Embedding of bipartite graphs induced by Louvain clustering. Each component of the embedding corresponds to a cluster obtained by Louvain.

Parameters
• resolution (float) – Resolution parameter.

• modularity (str) – Which objective function to maximize. Can be 'dugue', 'newman' or 'potts'.

• tol_optimization – Minimum increase in the objective function to enter a new optimization pass.

• tol_aggregation – Minimum increase in the objective function to enter a new aggregation pass.

• n_aggregations – Maximum number of aggregations. A negative value is interpreted as no limit.

• shuffle_nodes – Enables node shuffling before optimization.

• random_state – Random number generator or random seed. If None, numpy.random is used.

• isolated_nodes (str) – What to do with isolated column nodes. Can be 'remove' (default), 'merge' or 'keep'.

Variables
• embedding_ (array, shape = (n, n_components)) – Embedding of the nodes.

• embedding_row_ (array, shape = (n_row, n_components)) – Embedding of the rows (copy of embedding_).

• embedding_col_ (array, shape = (n_col, n_components)) – Embedding of the columns.

• labels_row_ (np.ndarray) – Labels of the rows (used to build the embedding of the columns).

• labels_col_ (np.ndarray) – Labels of the columns (used to build the embedding of the rows).

Example

>>> from sknetwork.embedding import BiLouvainEmbedding
>>> from sknetwork.data import movie_actor
>>> bilouvain = BiLouvainEmbedding()
>>> biadjacency = movie_actor()
>>> embedding = bilouvain.fit_transform(biadjacency)
>>> embedding.shape
(15, 4)

fit(biadjacency: scipy.sparse.csr.csr_matrix)[source]

Embedding of bipartite graphs from the clustering obtained with Louvain.

Parameters

biadjacency – Biadjacency matrix of the graph.

Returns

self

Return type

BiLouvainEmbedding

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

predict(adjacency_vectors: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])numpy.ndarray[source]

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters

adjacency_vectors – Adjacency vectors of rows. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns

embedding_vectors – Embedding of the nodes.

Return type

np.ndarray

## Hierarchical Louvain¶

class sknetwork.embedding.HLouvainEmbedding(n_components: int = 2, scale: float = 0.1, resolution: float = 1, tol_optimization: float = 0.001, tol_aggregation: float = 0.001, n_aggregations: int = - 1, shuffle_nodes: bool = False, random_state: Optional[Union[numpy.random.mtrand.RandomState, int]] = None, verbose: bool = False)[source]

Embedding of graphs based on the hierarchical Louvain algorithm with random scattering per level.

Parameters
• n_components (int) – Dimension of the embedding.

• scale (float) – Dilution factor to be applied on the random vector to be added at each iteration of the clustering method.

• resolution – Resolution parameter.

• tol_optimization – Minimum increase in the objective function to enter a new optimization pass.

• tol_aggregation – Minimum increase in the objective function to enter a new aggregation pass.

• n_aggregations – Maximum number of aggregations. A negative value is interpreted as no limit.

• shuffle_nodes – Enables node shuffling before optimization.

• random_state – Random number generator or random seed. If None, numpy.random is used.

Variables

embedding_ (array, shape = (n, n_components)) – Embedding of the nodes.

Example

>>> from sknetwork.embedding import HLouvainEmbedding
>>> from sknetwork.data import karate_club
>>> hlouvain = HLouvainEmbedding(n_components=3)
>>> adjacency = karate_club()
>>> embedding = hlouvain.fit_transform(adjacency)
>>> embedding.shape
(34, 3)


References

Bhowmick, A. K., Meneni, K., Danisch, M., Guillaume, J. L., & Mitra, B. (2020, January). LouvainNE: Hierarchical Louvain Method for High Quality and Scalable Network Embedding. In Proceedings of the 13th International Conference on Web Search and Data Mining (pp. 43-51).

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])[source]

Embedding of graphs from a clustering obtained with Louvain.

Parameters

adjacency – Adjacency matrix of the graph.

Returns

self

Return type

LouvainNE

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

class sknetwork.embedding.BiHLouvainEmbedding(n_components: int = 2, scale: float = 0.1, resolution: float = 1, tol_optimization: float = 0.001, tol_aggregation: float = 0.001, n_aggregations: int = - 1, shuffle_nodes: bool = False, random_state: Optional[Union[numpy.random.mtrand.RandomState, int]] = None, verbose: bool = False)[source]

Embedding of graphs based on random vectors and clustering by the Louvain method.

Parameters
• n_components (int) – Dimension of the embedding.

• scale (float) – Dilution factor to be applied on the random vector to be added at each iteration of the clustering method.

• resolution – Resolution parameter.

• tol_optimization – Minimum increase in the objective function to enter a new optimization pass.

• tol_aggregation – Minimum increase in the objective function to enter a new aggregation pass.

• n_aggregations – Maximum number of aggregations. A negative value is interpreted as no limit.

• shuffle_nodes – Enables node shuffling before optimization.

• random_state – Random number generator or random seed. If None, numpy.random is used.

Variables
• embedding_ (array, shape = (n_row, n_components)) – Embedding of the rows.

• embedding_row_ (array, shape = (n_row, n_components)) – Embedding of the rows (copy of embedding_).

• embedding_col_ (array, shape = (n_col, n_components)) – Embedding of the columns.

• eigenvalues_ (array, shape = (n_components)) – Eigenvalues in increasing order (first eigenvalue ignored).

• eigenvectors_ (array, shape = (n, n_components)) – Corresponding eigenvectors.

• regularization_ (None or float) – Regularization factor added to all pairs of nodes.

Example

>>> from sknetwork.embedding import BiHLouvainEmbedding
>>> from sknetwork.data import movie_actor
>>> bihlouvain = BiHLouvainEmbedding()
>>> biadjacency = movie_actor()
>>> embedding = bihlouvain.fit_transform(biadjacency)
>>> embedding.shape
(15, 2)


References

Belkin, M. & Niyogi, P. (2003). Laplacian Eigenmaps for Dimensionality Reduction and Data Representation, Neural computation.

fit(biadjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.embedding.louvain_hierarchy.BiHLouvainEmbedding[source]

Embedding of graphs from a clustering obtained with Louvain.

Parameters

biadjacency – Biadjacency matrix of the graph.

Returns

self

Return type

BiLouvainNE

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

## Force Atlas¶

class sknetwork.embedding.ForceAtlas(n_components: int = 2, n_iter: int = 50, approx_radius: float = - 1, lin_log: bool = False, gravity_factor: float = 0.01, repulsive_factor: float = 0.1, tolerance: float = 0.1, speed: float = 0.1, speed_max: float = 10)[source]

Force Atlas layout for displaying graphs.

• Graphs

• Digraphs

Parameters
• n_components (int) – Dimension of the graph layout.

• n_iter (int) – Number of iterations to update positions. If None, use the value of self.n_iter.

• approx_radius (float) – If a positive value is provided, only the nodes within this distance a given node are used to compute the repulsive force.

• lin_log (bool) – If True, use lin-log mode.

• gravity_factor (float) – Gravity force scaling constant.

• repulsive_factor (float) – Repulsive force scaling constant.

• tolerance (float) – Tolerance defined in the swinging constant.

• speed (float) – Speed constant.

• speed_max (float) – Constant used to impose constrain on speed.

Variables

embedding_ (np.ndarray) – Layout in given dimension.

Example

>>> from sknetwork.embedding.force_atlas import ForceAtlas
>>> from sknetwork.data import karate_club
>>> force_atlas = ForceAtlas()
>>> adjacency = karate_club()
>>> embedding = force_atlas.fit_transform(adjacency)
>>> embedding.shape
(34, 2)


References

Jacomy M., Venturini T., Heymann S., Bastian M. (2014). ForceAtlas2, a Continuous Graph Layout Algorithm for Handy Network Visualization Designed for the Gephi Software. Plos One.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray], pos_init: Optional[numpy.ndarray] = None, n_iter: Optional[int] = None)sknetwork.embedding.force_atlas.ForceAtlas[source]

Compute layout.

Parameters
• adjacency – Adjacency matrix of the graph, treated as undirected.

• pos_init – Position to start with. Random if not provided.

• n_iter (int) – Number of iterations to update positions. If None, use the value of self.n_iter.

Returns

self

Return type

ForceAtlas

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

## Spring¶

class sknetwork.embedding.Spring(n_components: int = 2, strength: Optional[float] = None, n_iter: int = 50, tol: float = 0.0001, approx_radius: float = - 1, position_init: str = 'random')[source]

Spring layout for displaying small graphs.

• Graphs

• Digraphs

Parameters
• n_components (int) – Dimension of the graph layout.

• strength (float) – Intensity of the force that moves the nodes.

• n_iter (int) – Number of iterations to update positions.

• tol (float) – Minimum relative change in positions to continue updating.

• approx_radius (float) – If a positive value is provided, only the nodes within this distance a given node are used to compute the repulsive force.

• position_init (str) – How to initialize the layout. If ‘spectral’, use Spectral embedding in dimension 2, otherwise, use random initialization.

Variables

embedding_ (np.ndarray) – Layout.

Example

>>> from sknetwork.embedding import Spring
>>> from sknetwork.data import karate_club
>>> spring = Spring()
>>> adjacency = karate_club()
>>> embedding = spring.fit_transform(adjacency)
>>> embedding.shape
(34, 2)


Notes

Simple implementation designed to display small graphs.

References

Fruchterman, T. M. J., Reingold, E. M. (1991). Graph Drawing by Force-Directed Placement. Software – Practice & Experience.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray], position_init: Optional[numpy.ndarray] = None, n_iter: Optional[int] = None)sknetwork.embedding.spring.Spring[source]

Compute layout.

Parameters
• adjacency – Adjacency matrix of the graph, treated as undirected.

• position_init (np.ndarray) – Custom initial positions of the nodes. Shape must be (n, 2). If None, use the value of self.pos_init.

• n_iter (int) – Number of iterations to update positions. If None, use the value of self.n_iter.

Returns

self

Return type

Spring

fit_transform(*args, **kwargs)numpy.ndarray

Fit to data and return the embedding. Same parameters as the fit method.

Returns

embedding – Embedding.

Return type

np.ndarray

predict(adjacency_vectors: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])numpy.ndarray[source]

Predict the embedding of new rows, defined by their adjacency vectors.

Parameters

adjacency_vectors – Adjacency vectors of nodes. Array of shape (n_col,) (single vector) or (n_vectors, n_col)

Returns

embedding_vectors – Embedding of the nodes.

Return type

np.ndarray

## Metrics¶

sknetwork.embedding.cosine_modularity(adjacency, embedding: numpy.ndarray, embedding_col=None, resolution=1.0, weights='degree', return_all: bool = False)[source]

Quality metric of an embedding $$x$$ defined by:

$$Q = \sum_{ij}\left(\dfrac{A_{ij}}{w} - \gamma \dfrac{w^+_iw^-_j}{w^2}\right) \left(\dfrac{1 + \cos(x_i, x_j)}{2}\right)$$

where

• $$w^+_i, w^-_i$$ are the out-weight, in-weight of node $$i$$ (for digraphs),

• $$w = 1^TA1$$ is the total weight of the graph.

For bipartite graphs with column embedding $$y$$, the metric is

$$Q = \sum_{ij}\left(\dfrac{B_{ij}}{w} - \gamma \dfrac{w_{1,i}w_{2,j}}{w^2}\right) \left(\dfrac{1 + \cos(x_i, y_j)}{2}\right)$$

where

• $$w_{1,i}, w_{2,j}$$ are the weights of nodes $$i$$ (row) and $$j$$ (column),

• $$w = 1^TB1$$ is the total weight of the graph.

Parameters
• adjacency – Adjacency matrix of the graph.

• embedding – Embedding of the nodes.

• embedding_col – Embedding of the columns (for bipartite graphs).

• resolution – Resolution parameter.

• weights ('degree' or 'uniform') – Weights of the nodes.

• return_all – If True, also return fit and diversity

Returns

• modularity (float)

• fit (float, optional)

• diversity (float, optional)

Example

>>> from sknetwork.embedding import cosine_modularity
>>> from sknetwork.data import karate_club
>>> graph = karate_club(metadata=True)
>>> adjacency = graph.adjacency
>>> embedding = graph.position
>>> np.round(cosine_modularity(adjacency, embedding), 2)
0.35