# Utils

Various tools for graph analysis.

## Convert graphs

sknetwork.utils.directed2undirected(adjacency: Union[scipy.sparse._csr.csr_matrix, sknetwork.linalg.sparse_lowrank.SparseLR], weighted: bool = True) → Union[scipy.sparse._csr.csr_matrix, sknetwork.linalg.sparse_lowrank.SparseLR]

Adjacency matrix of the undirected graph associated with some directed graph.

The new adjacency matrix becomes either:

$$A+A^T$$ (default)

or

$$\max(A,A^T)$$

If the initial adjacency matrix $$A$$ is binary, bidirectional edges have weight 2 (first method, default) or 1 (second method).
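The two variants can be reproduced with plain SciPy operations. The following is a minimal sketch of the formulas above on a toy graph, not the library's implementation; the variable names are illustrative only.

```python
import numpy as np
from scipy import sparse

# Toy directed graph: edges 0->1 and 1->0 (a bidirectional pair) plus 1->2.
adjacency = sparse.csr_matrix(np.array([[0, 1, 0],
                                        [1, 0, 1],
                                        [0, 0, 0]]))
transposed = adjacency.T.tocsr()

# First method: A + A^T, so a bidirectional edge gets weight 2.
summed = adjacency + transposed

# Second method: element-wise max(A, A^T), so every edge keeps weight 1.
maxed = adjacency.maximum(transposed)

print(summed.toarray())
print(maxed.toarray())
```

Both results are symmetric; they differ only on the pair (0, 1), which is present in both directions.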

Parameters

• adjacency – Adjacency matrix of the directed graph.

• weighted – If True (default), return the sum of the weights in both directions of each edge; if False, return their maximum.

Returns

New adjacency matrix (same format as input).

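Return type

Union[scipy.sparse._csr.csr_matrix, sknetwork.linalg.sparse_lowrank.SparseLR]

Adjacency matrix of the undirected graph associated with a bipartite graph:

$$A = \begin{bmatrix} 0 & B \\ B^T & 0 \end{bmatrix}$$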

where $$B$$ is the biadjacency matrix of the bipartite graph.
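As an illustration of this block structure, the matrix can be assembled with `scipy.sparse.bmat`. This is a sketch under the assumption of a small binary biadjacency matrix; it is not taken from the library's source.

```python
import numpy as np
from scipy import sparse

# Biadjacency matrix B: 2 row nodes, 3 column nodes.
biadjacency = sparse.csr_matrix(np.array([[1, 0, 1],
                                          [0, 1, 0]]))

# Block matrix [[0, B], [B^T, 0]]; None stands for an all-zero block.
adjacency = sparse.bmat([[None, biadjacency],
                         [biadjacency.T, None]], format='csr')

print(adjacency.toarray())
```

The result has shape (5, 5): row nodes are indexed first, column nodes after them.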

Returns

Adjacency matrix (same format as input).

Adjacency matrix of the directed graph associated with a bipartite graph (with edges from one part to the other).

$$A = \begin{bmatrix} 0 & B \\ 0 & 0 \end{bmatrix}$$

where $$B$$ is the biadjacency matrix.
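The directed variant keeps only the upper-right block. A minimal SciPy sketch, again on an assumed toy matrix; an explicit empty block is passed for the lower-left corner because `scipy.sparse.bmat` cannot infer the size of a block row that is entirely `None`.

```python
import numpy as np
from scipy import sparse

# Biadjacency matrix B: 2 row nodes, 3 column nodes.
biadjacency = sparse.csr_matrix(np.array([[1, 0, 1],
                                          [0, 1, 0]]))
n_row, n_col = biadjacency.shape

# Explicit all-zero lower-left block of shape (n_col, n_row).
zero_block = sparse.csr_matrix((n_col, n_row), dtype=biadjacency.dtype)

# Block matrix [[0, B], [0, 0]]: edges go from row nodes to column nodes only.
adjacency = sparse.bmat([[None, biadjacency],
                         [zero_block, None]], format='csr')

print(adjacency.toarray())
```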

Returns

Adjacency matrix (same format as input).

## Neighborhood

sknetwork.utils.get_degrees(input_matrix: scipy.sparse._csr.csr_matrix, transpose: bool = False) → numpy.ndarray

Get the vector of degrees of a graph.

If the graph is directed, returns the out-degrees (number of successors). Set transpose=True to get the in-degrees (number of predecessors).

For a biadjacency matrix, returns the degrees of rows. Set transpose=True to get the degrees of columns.

Parameters

• input_matrix – Adjacency or biadjacency matrix.

• transpose – If True, transpose the input matrix.

Returns

degrees – Array of degrees.

Return type

np.ndarray

Example

>>> from sknetwork.data import house
>>> from sknetwork.utils import get_degrees
>>> get_degrees(house())
array([2, 3, 2, 2, 3], dtype=int32)
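To make the role of `transpose` concrete, here is a sketch that counts the stored entries per row of a CSR matrix. The helper `degrees_sketch` is hypothetical and only mirrors the behavior described above; it assumes the matrix holds no explicitly stored zeros.

```python
import numpy as np
from scipy import sparse

# Toy directed graph: edges 0->1, 0->2 and 1->2.
adjacency = sparse.csr_matrix(np.array([[0, 1, 1],
                                        [0, 0, 1],
                                        [0, 0, 0]]))

def degrees_sketch(matrix, transpose=False):
    """Hypothetical helper: number of stored entries per row (per column if transpose)."""
    if transpose:
        matrix = sparse.csr_matrix(matrix.T)
    # In CSR format, consecutive differences of indptr give the row lengths.
    return np.diff(matrix.indptr)

out_degrees = degrees_sketch(adjacency)
in_degrees = degrees_sketch(adjacency, transpose=True)
print(out_degrees, in_degrees)
```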

sknetwork.utils.get_weights(input_matrix: scipy.sparse._csr.csr_matrix, transpose: bool = False) → numpy.ndarray

Get the vector of weights of the nodes of a graph. If the graph is not weighted, return the vector of degrees.

If the graph is directed, returns the out-weights (total weight of outgoing links). Set transpose=True to get the in-weights (total weight of incoming links).

For a biadjacency matrix, returns the weights of rows. Set transpose=True to get the weights of columns.

Parameters

• input_matrix – Adjacency or biadjacency matrix.

• transpose – If True, transpose the input matrix.

Returns

weights – Array of weights.

Return type

np.ndarray

Example

>>> from sknetwork.data import house
>>> from sknetwork.utils import get_weights
>>> get_weights(house())
array([2., 3., 2., 2., 3.])
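On a weighted graph, node weights are row sums (or column sums with `transpose=True`). The sketch below shows this with a hypothetical helper `weights_sketch` on an assumed toy matrix; it is not the library's code.

```python
import numpy as np
from scipy import sparse

# Toy weighted directed graph: 0->1 (2.5), 0->2 (1.0), 1->2 (4.0).
adjacency = sparse.csr_matrix(np.array([[0, 2.5, 1.0],
                                        [0, 0, 4.0],
                                        [0, 0, 0]]))

def weights_sketch(matrix, transpose=False):
    """Hypothetical helper: total weight per row (per column if transpose)."""
    if transpose:
        matrix = sparse.csr_matrix(matrix.T)
    # Multiplying by a vector of ones sums each row.
    return matrix.dot(np.ones(matrix.shape[1]))

out_weights = weights_sketch(adjacency)
in_weights = weights_sketch(adjacency, transpose=True)
print(out_weights, in_weights)
```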

sknetwork.utils.get_neighbors(input_matrix: scipy.sparse._csr.csr_matrix, node: int, transpose: bool = False) → numpy.ndarray

Get the neighbors of a node.

If the graph is directed, returns the vector of successors. Set transpose=True to get the predecessors.

For a biadjacency matrix, returns the neighbors of a row node. Set transpose=True to get the neighbors of a column node.

Parameters

• input_matrix – Adjacency or biadjacency matrix.

• node (int) – Target node.

• transpose – If True, transpose the input matrix.

Returns

neighbors – Array of neighbors of the target node.

Return type

np.ndarray

Example

>>> from sknetwork.data import house
>>> from sknetwork.utils import get_neighbors
>>> get_neighbors(house(), node=0)
array([1, 4], dtype=int32)
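In CSR format the neighbors of a node are a contiguous slice of the `indices` array. The following sketch shows that lookup with a hypothetical helper `neighbors_sketch`; it illustrates the idea and is not necessarily how the library does it.

```python
import numpy as np
from scipy import sparse

# Toy directed graph: edges 0->1, 0->2 and 1->2.
adjacency = sparse.csr_matrix(np.array([[0, 1, 1],
                                        [0, 0, 1],
                                        [0, 0, 0]]))

def neighbors_sketch(matrix, node, transpose=False):
    """Hypothetical helper: successors of a node (predecessors if transpose)."""
    if transpose:
        matrix = sparse.csr_matrix(matrix.T)
    # Column indices of row `node` lie between indptr[node] and indptr[node + 1].
    return matrix.indices[matrix.indptr[node]:matrix.indptr[node + 1]]

successors = neighbors_sketch(adjacency, node=0)
predecessors = neighbors_sketch(adjacency, node=2, transpose=True)
print(successors, predecessors)
```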


## Membership matrix

sknetwork.utils.get_membership(labels: numpy.ndarray, dtype=<class 'bool'>, n_labels: Optional[int] = None) → scipy.sparse._csr.csr_matrix

Build the binary matrix of the label assignments, of shape n_samples x n_labels. Negative labels are ignored.

Parameters

• labels – Label of each node.

• dtype – Type of the entries. Boolean by default.

• n_labels (int) – Number of labels.

Returns

membership – Binary matrix of label assignments.

Return type

sparse.csr_matrix

Example

>>> import numpy as np
>>> from sknetwork.utils import get_membership
>>> labels = np.array([0, 0, 1, 2])
>>> membership = get_membership(labels)
>>> membership.toarray().astype(int)
array([[1, 0, 0],
       [1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])
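The treatment of negative labels can be illustrated by building the matrix directly from coordinates. The helper `membership_sketch` below is hypothetical; it assumes that `n_labels` defaults to the largest label plus one, which the text above does not state.

```python
import numpy as np
from scipy import sparse

def membership_sketch(labels, n_labels=None, dtype=bool):
    """Hypothetical helper: binary n_samples x n_labels matrix; negative labels give empty rows."""
    labels = np.asarray(labels)
    if n_labels is None:
        # Assumption: infer the number of labels from the largest label.
        n_labels = int(labels.max()) + 1
    rows = np.flatnonzero(labels >= 0)  # keep only samples with a valid label
    cols = labels[rows]
    data = np.ones(len(rows), dtype=dtype)
    return sparse.csr_matrix((data, (rows, cols)),
                             shape=(len(labels), n_labels), dtype=dtype)

labels = np.array([0, 0, 1, -1, 2])
membership = membership_sketch(labels)
print(membership.toarray().astype(int))
```

The sample with label -1 yields an all-zero row, so the matrix still has one row per sample.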

sknetwork.utils.from_membership(membership: scipy.sparse._csr.csr_matrix) → numpy.ndarray

Get the labels from a membership matrix (n_samples x n_labels). Samples without label get -1.

Parameters

membership – Membership matrix.

Returns

labels – Labels (column indices of the membership matrix).

Return type

np.ndarray

Example

>>> from scipy import sparse
>>> from sknetwork.utils import from_membership
>>> membership = sparse.eye(3).tocsr()
>>> labels = from_membership(membership)
>>> labels
array([0, 1, 2])
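The -1 convention for unlabeled samples can be sketched as follows. The helper `labels_sketch` is hypothetical and assumes each row of the membership matrix has at most one nonzero entry; rows with several entries would need a tie-breaking rule that is not described here.

```python
import numpy as np
from scipy import sparse

def labels_sketch(membership):
    """Hypothetical helper: column index of the entry in each row, -1 for empty rows."""
    membership = sparse.csr_matrix(membership, copy=True)
    membership.eliminate_zeros()
    labels = -np.ones(membership.shape[0], dtype=int)
    has_label = np.diff(membership.indptr) > 0
    # First stored column index of each non-empty row.
    labels[has_label] = membership.indices[membership.indptr[:-1][has_label]]
    return labels

membership = sparse.csr_matrix(np.array([[1, 0, 0],
                                         [0, 0, 1],
                                         [0, 0, 0],
                                         [0, 1, 0]]))
labels = labels_sketch(membership)
print(labels)
```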


## TF-IDF

sknetwork.utils.get_tfidf(count_matrix: scipy.sparse._csr.csr_matrix)

Get the tf-idf from a count matrix in sparse format.

Parameters

count_matrix (sparse.csr_matrix) – Count matrix, shape (n_documents, n_words).

Returns

tf_idf – tf-idf matrix, shape (n_documents, n_words).

Return type

sparse.csr_matrix
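The exact weighting scheme is not specified above. As an illustration only, the sketch below uses one common variant: term frequency as row-normalized counts, and inverse document frequency as log(n_documents / document frequency). The helper `tfidf_sketch` is hypothetical and may differ from what `get_tfidf` computes.

```python
import numpy as np
from scipy import sparse

def tfidf_sketch(count_matrix):
    """Hypothetical helper: tf = row-normalized counts, idf = log(N / document frequency)."""
    count_matrix = sparse.csr_matrix(count_matrix, dtype=float)
    n_documents, n_words = count_matrix.shape
    # Term frequency: divide each row by its total count (empty rows stay zero).
    row_sums = np.asarray(count_matrix.sum(axis=1)).ravel()
    inv_sums = np.zeros(n_documents)
    inv_sums[row_sums > 0] = 1 / row_sums[row_sums > 0]
    tf = sparse.diags(inv_sums).dot(count_matrix)
    # Document frequency: number of documents containing each word.
    doc_freq = np.asarray((count_matrix > 0).sum(axis=0)).ravel()
    idf = np.zeros(n_words)
    idf[doc_freq > 0] = np.log(n_documents / doc_freq[doc_freq > 0])
    return sparse.csr_matrix(tf.dot(sparse.diags(idf)))

counts = sparse.csr_matrix(np.array([[2, 0, 2],
                                     [0, 1, 3]]))
tf_idf = tfidf_sketch(counts)
print(tf_idf.toarray())
```

With this variant, a word that appears in every document (the third column here) gets weight zero.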

References

https://en.wikipedia.org/wiki/Tfidf