Utils

Various tools for graph analysis.

Convert graphs

sknetwork.utils.directed2undirected(adjacency: Union[scipy.sparse._csr.csr_matrix, sknetwork.linalg.sparse_lowrank.SparseLR], weighted: bool = True) -> Union[scipy.sparse._csr.csr_matrix, sknetwork.linalg.sparse_lowrank.SparseLR]

Adjacency matrix of the undirected graph associated with some directed graph.

The new adjacency matrix becomes either:

\(A+A^T\) (default)

or

\(\max(A,A^T)\)

If the initial adjacency matrix \(A\) is binary, bidirectional edges have weight 2 (first method, default) or 1 (second method).

Parameters
  • adjacency – Adjacency matrix.

  • weighted – If True, return the sum of the weights in both directions of each edge, \(A+A^T\). If False, return the maximum of the two weights, \(\max(A,A^T)\).

Returns

new_adjacency – New adjacency matrix (same format as input).

Return type

sparse.csr_matrix or SparseLR
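The two symmetrization rules can be checked on a small graph. The sketch below applies the formulas directly with SciPy instead of calling directed2undirected, so it illustrates the definitions above and is not the library's implementation. The 3-node graph and the variable names are made up for illustration.

```python
import numpy as np
from scipy import sparse

# Hypothetical directed graph: 0 -> 1, 1 -> 0 (a bidirectional pair) and 1 -> 2.
adjacency = sparse.csr_matrix(np.array([[0, 1, 0],
                                        [1, 0, 1],
                                        [0, 0, 0]]))

# First method (sum of both directions): A + A^T.
summed = (adjacency + adjacency.T).tocsr()

# Second method (entrywise maximum): max(A, A^T).
maxed = adjacency.maximum(adjacency.T).tocsr()

print(summed.toarray())  # the bidirectional pair (0, 1) gets weight 2
print(maxed.toarray())   # the bidirectional pair (0, 1) keeps weight 1
```

Both results are symmetric. They differ only on edges that exist in both directions.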

sknetwork.utils.bipartite2undirected(biadjacency: Union[scipy.sparse._csr.csr_matrix, sknetwork.linalg.sparse_lowrank.SparseLR]) -> Union[scipy.sparse._csr.csr_matrix, sknetwork.linalg.sparse_lowrank.SparseLR]

Adjacency matrix of a bigraph defined by its biadjacency matrix.

The returned adjacency matrix is:

\(A = \begin{bmatrix} 0 & B \\ B^T & 0 \end{bmatrix}\)

where \(B\) is the biadjacency matrix of the bipartite graph.

Parameters

biadjacency – Biadjacency matrix of the graph.

Returns

adjacency – Adjacency matrix (same format as input).

Return type

sparse.csr_matrix or SparseLR
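The block structure can be reproduced with SciPy's block-matrix helper. This is a sketch of the formula above, not the library's code. The 2 x 3 biadjacency matrix is a made-up example.

```python
import numpy as np
from scipy import sparse

# Hypothetical bipartite graph with 2 row nodes and 3 column nodes.
biadjacency = sparse.csr_matrix(np.array([[1, 0, 1],
                                          [0, 1, 0]]))

# A = [[0, B], [B^T, 0]]; None stands for an all-zero block.
adjacency = sparse.bmat([[None, biadjacency],
                         [biadjacency.T, None]], format='csr')

print(adjacency.shape)  # (5, 5): row nodes come first, then column nodes
```

In the resulting graph, row node i keeps index i and column node j gets index n_row + j.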

sknetwork.utils.bipartite2directed(biadjacency: Union[scipy.sparse._csr.csr_matrix, sknetwork.linalg.sparse_lowrank.SparseLR]) -> Union[scipy.sparse._csr.csr_matrix, sknetwork.linalg.sparse_lowrank.SparseLR]

Adjacency matrix of the directed graph associated with a bipartite graph (with edges from one part to the other).

The returned adjacency matrix is:

\(A = \begin{bmatrix} 0 & B \\ 0 & 0 \end{bmatrix}\)

where \(B\) is the biadjacency matrix.

Parameters

biadjacency – Biadjacency matrix of the graph.

Returns

adjacency – Adjacency matrix (same format as input).

Return type

sparse.csr_matrix or SparseLR
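As a rough illustration of the directed variant, the sketch below builds the same kind of block matrix but leaves the lower-left block empty, so edges only go from row nodes to column nodes. It uses plain SciPy and a made-up biadjacency matrix; it is not the library's implementation.

```python
import numpy as np
from scipy import sparse

# Hypothetical bipartite graph with 2 row nodes and 3 column nodes.
biadjacency = sparse.csr_matrix(np.array([[1, 0, 1],
                                          [0, 1, 0]]))
n_row, n_col = biadjacency.shape

# A = [[0, B], [0, 0]]; an explicit empty block fixes the size of the bottom rows.
empty = sparse.csr_matrix((n_col, n_row), dtype=biadjacency.dtype)
adjacency = sparse.bmat([[None, biadjacency],
                         [empty, None]], format='csr')

print(adjacency.toarray())
```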

Neighborhood

sknetwork.utils.get_degrees(input_matrix: scipy.sparse._csr.csr_matrix, transpose: bool = False) -> numpy.ndarray

Get the vector of degrees of a graph.

If the graph is directed, returns the out-degrees (number of successors). Set transpose=True to get the in-degrees (number of predecessors).

For a biadjacency matrix, returns the degrees of rows. Set transpose=True to get the degrees of columns.

Parameters
  • input_matrix (sparse.csr_matrix) – Adjacency or biadjacency matrix.

  • transpose – If True, transpose the input matrix.

Returns

degrees – Array of degrees.

Return type

np.ndarray
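To make the out-degree and in-degree distinction concrete, the sketch below counts stored entries per row of a CSR matrix, and per row of its transpose. It is an illustration with plain SciPy on a made-up directed graph, not necessarily how get_degrees is implemented.

```python
import numpy as np
from scipy import sparse

# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
adjacency = sparse.csr_matrix(np.array([[0, 1, 1],
                                        [0, 0, 1],
                                        [1, 0, 0]]))

# In CSR format, indptr[i + 1] - indptr[i] is the number of stored entries in row i.
out_degrees = np.diff(adjacency.indptr)

# Transposing first gives the number of predecessors instead.
in_degrees = np.diff(sparse.csr_matrix(adjacency.T).indptr)

print(out_degrees)  # number of successors of each node
print(in_degrees)   # number of predecessors of each node
```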

Example

>>> from sknetwork.data import house
>>> from sknetwork.utils import get_degrees
>>> adjacency = house()
>>> get_degrees(adjacency)
array([2, 3, 2, 2, 3], dtype=int32)

sknetwork.utils.get_weights(input_matrix: scipy.sparse._csr.csr_matrix, transpose: bool = False) -> numpy.ndarray

Get the vector of weights of the nodes of a graph. If the graph is not weighted, return the vector of degrees.

If the graph is directed, returns the out-weights (total weight of outgoing links). Set transpose=True to get the in-weights (total weight of incoming links).

For a biadjacency matrix, returns the weights of rows. Set transpose=True to get the weights of columns.

Parameters
  • input_matrix (sparse.csr_matrix) – Adjacency or biadjacency matrix.

  • transpose – If True, transpose the input matrix.

Returns

weights – Array of weights.

Return type

np.ndarray
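The difference between weights and degrees shows up only on weighted graphs. The sketch below sums edge weights per row and per column with plain SciPy. The weighted graph is a made-up example and the code is an illustration, not the library's implementation.

```python
import numpy as np
from scipy import sparse

# Hypothetical weighted directed graph: 0 -> 1 (2.0), 1 -> 2 (3.5), 2 -> 0 (1.0).
adjacency = sparse.csr_matrix(np.array([[0.0, 2.0, 0.0],
                                        [0.0, 0.0, 3.5],
                                        [1.0, 0.0, 0.0]]))

# Out-weights: total weight of outgoing links (row sums).
out_weights = np.asarray(adjacency.sum(axis=1)).ravel()

# In-weights: total weight of incoming links (column sums).
in_weights = np.asarray(adjacency.sum(axis=0)).ravel()

print(out_weights)
print(in_weights)
```

Every node here has out-degree 1, yet the out-weights differ because they add up edge weights instead of counting edges.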

Example

>>> from sknetwork.data import house
>>> from sknetwork.utils import get_weights
>>> adjacency = house()
>>> get_weights(adjacency)
array([2., 3., 2., 2., 3.])

sknetwork.utils.get_neighbors(input_matrix: scipy.sparse._csr.csr_matrix, node: int, transpose: bool = False) -> numpy.ndarray

Get the neighbors of a node.

If the graph is directed, returns the vector of successors. Set transpose=True to get the predecessors.

For a biadjacency matrix, returns the neighbors of a row node. Set transpose=True to get the neighbors of a column node.

Parameters
  • input_matrix (sparse.csr_matrix) – Adjacency or biadjacency matrix.

  • node (int) – Target node.

  • transpose – If True, transpose the input matrix.

Returns

neighbors – Array of neighbors of the target node.

Return type

np.ndarray

Example

>>> from sknetwork.data import house
>>> from sknetwork.utils import get_neighbors
>>> adjacency = house()
>>> get_neighbors(adjacency, node=0)
array([1, 4], dtype=int32)
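On a directed graph, successors and predecessors differ. The sketch below reads them from the CSR index arrays of a made-up graph. The helper neighbors is hypothetical and only illustrates the idea; it is not part of sknetwork.

```python
import numpy as np
from scipy import sparse

# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
adjacency = sparse.csr_matrix(np.array([[0, 1, 1],
                                        [0, 0, 1],
                                        [1, 0, 0]]))

def neighbors(matrix, node):
    """Column indices of the stored entries in row `node` (hypothetical helper)."""
    return matrix.indices[matrix.indptr[node]:matrix.indptr[node + 1]]

# Successors of node 0: read row 0 of the adjacency matrix.
successors = neighbors(adjacency, 0)

# Predecessors of node 2: read row 2 of the transposed matrix.
predecessors = neighbors(sparse.csr_matrix(adjacency.T), 2)

print(successors, predecessors)
```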

Membership matrix

sknetwork.utils.get_membership(labels: numpy.ndarray, dtype=<class 'bool'>, n_labels: Optional[int] = None) -> scipy.sparse._csr.csr_matrix

Build the binary matrix of the label assignments, of shape n_samples x n_labels. Negative labels are ignored.

Parameters
  • labels – Label of each node.

  • dtype – Type of the entries. Boolean by default.

  • n_labels (int, optional) – Number of labels, that is, the number of columns of the returned matrix.

Returns

membership – Binary matrix of label assignments.

Return type

sparse.csr_matrix
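The effect of a negative label can be seen by building such a matrix by hand. The sketch below constructs it with plain SciPy, assuming the number of labels is taken as the largest label plus one; it is an illustration and not the library's implementation.

```python
import numpy as np
from scipy import sparse

# Made-up labels; node 3 has a negative label and is therefore ignored.
labels = np.array([0, 0, 1, -1, 2])

# Keep only nodes with a non-negative label.
mask = labels >= 0
rows = np.flatnonzero(mask)
cols = labels[mask]
n_labels = cols.max() + 1  # assumption: inferred from the largest label

# One entry per labeled node, at (node, label).
membership = sparse.csr_matrix(
    (np.ones(len(rows), dtype=bool), (rows, cols)),
    shape=(len(labels), n_labels),
)

print(membership.toarray().astype(int))  # row 3 is all zeros
```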

Example

>>> import numpy as np
>>> from sknetwork.utils import get_membership
>>> labels = np.array([0, 0, 1, 2])
>>> membership = get_membership(labels)
>>> membership.toarray().astype(int)
array([[1, 0, 0],
       [1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])

sknetwork.utils.from_membership(membership: scipy.sparse._csr.csr_matrix) -> numpy.ndarray

Get the labels from a membership matrix (n_samples x n_labels). Samples without a label get -1.

Parameters

membership – Membership matrix.

Returns

labels – Labels (column indices of the membership matrix).

Return type

np.ndarray

Example

>>> from scipy import sparse
>>> from sknetwork.utils import from_membership
>>> membership = sparse.eye(3).tocsr()
>>> labels = from_membership(membership)
>>> labels
array([0, 1, 2])
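The -1 convention for unlabeled samples can be illustrated by hand. The sketch below recovers labels from a made-up membership matrix with plain SciPy, assuming each row holds at most one nonzero entry; it is not the library's implementation.

```python
import numpy as np
from scipy import sparse

# Made-up membership matrix; row 2 is empty, so that sample has no label.
membership = sparse.csr_matrix(np.array([[1, 0, 0],
                                         [0, 0, 1],
                                         [0, 0, 0],
                                         [0, 1, 0]], dtype=bool))

# Start with -1 everywhere, then write the column index of each nonzero entry.
labels = np.full(membership.shape[0], -1)
rows, cols = membership.nonzero()
labels[rows] = cols

print(labels)
```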

TF-IDF

sknetwork.utils.get_tfidf(count_matrix: scipy.sparse._csr.csr_matrix)

Get the tf-idf from a count matrix in sparse format.

Parameters

count_matrix (sparse.csr_matrix) – Count matrix, shape (n_documents, n_words).

Returns

tf_idf – tf-idf matrix, shape (n_documents, n_words).

Return type

sparse.csr_matrix

References

https://en.wikipedia.org/wiki/Tfidf
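The description above does not say which tf-idf weighting get_tfidf applies. As an assumption, the sketch below shows one common variant: term frequency as counts divided by the document length, and inverse document frequency as log(n_documents / document frequency). The count matrix and all variable names are made up for illustration.

```python
import numpy as np
from scipy import sparse

# Made-up count matrix: 2 documents, 3 words.
counts = sparse.csr_matrix(np.array([[2.0, 0.0, 2.0],
                                     [1.0, 1.0, 0.0]]))
n_documents, n_words = counts.shape

# Term frequency: divide each row by its total count (empty rows stay zero).
row_sums = np.asarray(counts.sum(axis=1)).ravel()
inverse_sums = np.zeros(n_documents)
inverse_sums[row_sums > 0] = 1 / row_sums[row_sums > 0]
tf = sparse.diags(inverse_sums).dot(counts)

# Document frequency: number of documents in which each word appears.
doc_freq = np.diff(counts.tocsc().indptr)

# Inverse document frequency (assumed variant): log(n_documents / doc_freq).
idf = np.zeros(n_words)
idf[doc_freq > 0] = np.log(n_documents / doc_freq[doc_freq > 0])

# Scale each column of tf by the idf of the corresponding word.
tf_idf = tf.dot(sparse.diags(idf)).tocsr()

print(tf_idf.toarray())
```

Word 0 appears in every document, so its idf is log(1) = 0 and its column vanishes under this variant.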