Utils

Various tools for graph analysis.

Convert graphs

sknetwork.utils.directed2undirected(adjacency: csr_matrix | SparseLR, weighted: bool = True) → csr_matrix | SparseLR

Adjacency matrix of the undirected graph associated with some directed graph.

The new adjacency matrix becomes either:

\(A+A^T\) (default)

or

\(\max(A,A^T) > 0\) (binary)

If the initial adjacency matrix \(A\) is binary, bidirectional edges have weight 2 (first method, default) or 1 (second method).

Parameters:

adjacency – Adjacency matrix.
weighted – If True, return the sum of the weights in both directions of each edge. If False, return the binary matrix \(\max(A,A^T) > 0\).

Returns:

new_adjacency – New adjacency matrix (same format as input).

Return type:

csr_matrix | SparseLR
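
The two symmetrization rules above can be checked on a toy graph. The snippet below is a minimal sketch that applies the formulas directly with SciPy. It does not call directed2undirected and is not the library's implementation; the 3-node matrix is made up for illustration.

```python
import numpy as np
from scipy import sparse

# Toy directed graph: 0 -> 1 and 1 -> 0 (a bidirectional pair), plus 1 -> 2.
A = sparse.csr_matrix(np.array([[0, 1, 0],
                                [1, 0, 1],
                                [0, 0, 0]]))

# Weighted rule: sum the weights in both directions, A + A^T.
sym_weighted = (A + A.T).tocsr()

# Binary rule: keep an edge wherever either direction exists, max(A, A^T) > 0.
sym_binary = (A.maximum(A.T) > 0).astype(int).tocsr()

print(sym_weighted.toarray())
print(sym_binary.toarray())
```

The bidirectional pair between nodes 0 and 1 gets weight 2 under the first rule and weight 1 under the second, as described above.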

sknetwork.utils.bipartite2undirected(biadjacency: csr_matrix | SparseLR) → csr_matrix | SparseLR

Adjacency matrix of a bigraph defined by its biadjacency matrix.

The returned adjacency matrix is:

\(A = \begin{bmatrix} 0 & B \\ B^T & 0 \end{bmatrix}\)

where \(B\) is the biadjacency matrix of the bipartite graph.

Parameters:

biadjacency – Biadjacency matrix of the graph.

Returns:

adjacency – Adjacency matrix (same format as input).

Return type:

csr_matrix | SparseLR
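
To make the block structure concrete, here is a minimal sketch that assembles the same block matrix with scipy.sparse.bmat. It illustrates the formula only and is not the library's code; the 2 x 3 biadjacency matrix is an arbitrary example.

```python
import numpy as np
from scipy import sparse

# Arbitrary biadjacency matrix: 2 row nodes, 3 column nodes.
B = sparse.csr_matrix(np.array([[1, 0, 2],
                                [0, 3, 0]]))

# Assemble [[0, B], [B^T, 0]]; None stands for an all-zero block.
adjacency = sparse.bmat([[None, B], [B.T, None]], format='csr')

print(adjacency.shape)      # one node per row and per column of B
print(adjacency.toarray())
```

Row nodes keep indices 0 and 1, and column nodes are shifted to indices 2, 3 and 4, so the result is a symmetric 5 x 5 matrix.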

sknetwork.utils.bipartite2directed(biadjacency: csr_matrix | SparseLR) → csr_matrix | SparseLR

Adjacency matrix of the directed graph associated with a bipartite graph (with edges from one part to the other).

The returned adjacency matrix is:

\(A = \begin{bmatrix} 0 & B \\ 0 & 0 \end{bmatrix}\)

where \(B\) is the biadjacency matrix.

Parameters:

biadjacency – Biadjacency matrix of the graph.

Returns:

adjacency – Adjacency matrix (same format as input).

Return type:

csr_matrix | SparseLR
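
The directed variant keeps only the upper-right block. The sketch below builds that matrix with SciPy under the same assumptions as the formula above. It is an illustration, not the library's implementation, and the biadjacency matrix is made up.

```python
import numpy as np
from scipy import sparse

# Arbitrary biadjacency matrix: 2 row nodes, 3 column nodes.
B = sparse.csr_matrix(np.array([[1, 0, 2],
                                [0, 3, 0]]))
n_row, n_col = B.shape

# Explicit empty diagonal blocks fix the block sizes; the lower-left block stays zero.
top_left = sparse.csr_matrix((n_row, n_row), dtype=B.dtype)
bottom_right = sparse.csr_matrix((n_col, n_col), dtype=B.dtype)
adjacency = sparse.bmat([[top_left, B], [None, bottom_right]], format='csr')

print(adjacency.toarray())
```

All edges go from row nodes (indices 0 and 1) to column nodes (indices 2 to 4), so the last three rows of the result are empty.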

Neighborhood

sknetwork.utils.get_degrees(input_matrix: csr_matrix, transpose: bool = False) → ndarray

Get the vector of degrees of a graph.

If the graph is directed, returns the out-degrees (number of successors). Set transpose=True to get the in-degrees (number of predecessors).

For a biadjacency matrix, returns the degrees of rows. Set transpose=True to get the degrees of columns.

Parameters:

input_matrix (sparse.csr_matrix) – Adjacency or biadjacency matrix.
transpose – If True, transpose the input matrix.

Returns:

degrees – Array of degrees.

Return type:

np.ndarray

Example

>>> from sknetwork.data import house
>>> from sknetwork.utils import get_degrees
>>> adjacency = house()
>>> get_degrees(adjacency)
array([2, 3, 2, 2, 3], dtype=int32)
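
For intuition about out-degrees versus in-degrees, the following sketch counts the stored entries per row of a CSR matrix, before and after transposition. It is a SciPy-only illustration on a made-up directed graph, and it assumes the matrix holds no explicitly stored zeros. It is not the library's implementation.

```python
import numpy as np
from scipy import sparse

# Made-up directed graph: 0 -> 1, 0 -> 2, 1 -> 2.
adjacency = sparse.csr_matrix(np.array([[0, 1, 1],
                                        [0, 0, 1],
                                        [0, 0, 0]]))

# In CSR format, indptr[i + 1] - indptr[i] is the number of stored entries in row i.
out_degrees = np.diff(adjacency.indptr)

# Transposing first turns the row counts into column counts, i.e. in-degrees.
in_degrees = np.diff(adjacency.T.tocsr().indptr)

print(out_degrees)
print(in_degrees)
```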

sknetwork.utils.get_weights(input_matrix: csr_matrix, transpose: bool = False) → ndarray

Get the vector of weights of the nodes of a graph. If the graph is not weighted, return the vector of degrees.

If the graph is directed, returns the out-weights (total weight of outgoing links). Set transpose=True to get the in-weights (total weight of incoming links).

For a biadjacency matrix, returns the weights of rows. Set transpose=True to get the weights of columns.

Parameters:

input_matrix (sparse.csr_matrix) – Adjacency or biadjacency matrix.
transpose – If True, transpose the input matrix.

Returns:

weights – Array of weights.

Return type:

np.ndarray

Example

>>> from sknetwork.data import house
>>> from sknetwork.utils import get_weights
>>> adjacency = house()
>>> get_weights(adjacency)
array([2., 3., 2., 2., 3.])
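
On a weighted graph, weights and degrees differ. The sketch below computes out-weights and in-weights as row sums and column sums of a made-up weighted adjacency matrix. It uses SciPy only and is an illustration of the definition, not the library's code.

```python
import numpy as np
from scipy import sparse

# Made-up weighted directed graph: 0 -> 1 (2.0), 0 -> 2 (0.5), 1 -> 2 (1.5).
adjacency = sparse.csr_matrix(np.array([[0.0, 2.0, 0.5],
                                        [0.0, 0.0, 1.5],
                                        [0.0, 0.0, 0.0]]))

# Out-weights: total weight of outgoing links (row sums).
out_weights = adjacency.dot(np.ones(adjacency.shape[1]))

# In-weights: total weight of incoming links (row sums of the transpose).
in_weights = adjacency.T.dot(np.ones(adjacency.shape[0]))

print(out_weights)
print(in_weights)
```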

sknetwork.utils.get_neighbors(input_matrix: csr_matrix, node: int, transpose: bool = False) → ndarray

Get the neighbors of a node.

If the graph is directed, returns the vector of successors. Set transpose=True to get the predecessors.

For a biadjacency matrix, returns the neighbors of a row node. Set transpose=True to get the neighbors of a column node.

Parameters:

input_matrix (sparse.csr_matrix) – Adjacency or biadjacency matrix.
node (int) – Target node.
transpose – If True, transpose the input matrix.

Returns:

neighbors – Array of neighbors of the target node.

Return type:

np.ndarray

Example

>>> from sknetwork.data import house
>>> from sknetwork.utils import get_neighbors
>>> adjacency = house()
>>> get_neighbors(adjacency, node=0)
array([1, 4], dtype=int32)
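
The successor and predecessor lookups described above map directly onto the CSR layout. The helper below, neighbors_sketch, is a hypothetical name introduced for illustration. It reads one row slice of the indices array and is not the library's implementation.

```python
import numpy as np
from scipy import sparse

def neighbors_sketch(matrix, node, transpose=False):
    """Return the column indices stored in row `node` (hypothetical helper)."""
    if transpose:
        # Rows of the transpose list the predecessors (or column-node neighbors).
        matrix = matrix.T.tocsr()
    start, end = matrix.indptr[node], matrix.indptr[node + 1]
    return matrix.indices[start:end]

# Made-up directed graph: 0 -> 1, 0 -> 2, 1 -> 2.
adjacency = sparse.csr_matrix(np.array([[0, 1, 1],
                                        [0, 0, 1],
                                        [0, 0, 0]]))

successors = neighbors_sketch(adjacency, 0)
predecessors = neighbors_sketch(adjacency, 2, transpose=True)
print(successors, predecessors)
```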

Membership matrix

sknetwork.utils.get_membership(labels: ndarray, dtype=bool, n_labels: int | None = None) → csr_matrix

Build the binary matrix of the label assignments, of shape n_samples x n_labels. Negative labels are ignored.

Parameters:

labels – Label of each node (integers).
dtype – Type of the output. Boolean by default.
n_labels (int) – Number of labels.

Returns:

membership – Binary matrix of label assignments.

Return type:

sparse.csr_matrix

Example

>>> import numpy as np
>>> from sknetwork.utils import get_membership
>>> labels = np.array([0, 0, 1, 2])
>>> membership = get_membership(labels)
>>> membership.toarray().astype(int)
array([[1, 0, 0],
       [1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])
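
The example above has no negative labels. The sketch below shows how a membership matrix that skips negative labels could be assembled with SciPy. It is an illustration of the described behavior under the assumption that the number of labels is the largest label plus one, not the library's implementation.

```python
import numpy as np
from scipy import sparse

# Node 3 has a negative label and should get an empty row.
labels = np.array([0, 0, 1, -1, 2])

# Keep only the nodes with a non-negative label.
mask = labels >= 0
rows = np.flatnonzero(mask)
cols = labels[mask]

# Assumption: the number of labels is inferred as max(label) + 1.
n_labels = int(cols.max()) + 1

data = np.ones(len(rows), dtype=bool)
membership = sparse.csr_matrix((data, (rows, cols)), shape=(len(labels), n_labels))

print(membership.toarray().astype(int))
```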

sknetwork.utils.from_membership(membership: csr_matrix) → ndarray

Get the labels from a membership matrix (n_samples x n_labels). Samples without label get -1.

Parameters:

membership – Membership matrix.

Returns:

labels – Labels (column indices of the membership matrix).

Return type:

np.ndarray

Example

>>> from scipy import sparse
>>> from sknetwork.utils import from_membership
>>> membership = sparse.eye(3).tocsr()
>>> labels = from_membership(membership)
>>> labels
array([0, 1, 2])
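
The identity matrix above gives every sample a label. The sketch below covers the unlabeled case with SciPy only: it assumes each row holds at most one nonzero entry and assigns -1 to empty rows. How the library handles rows with several nonzero entries is not stated here, so treat this as an illustration rather than its implementation.

```python
import numpy as np
from scipy import sparse

# Sample 1 has no label (empty row).
membership = sparse.csr_matrix(np.array([[1, 0, 0],
                                         [0, 0, 0],
                                         [0, 0, 1]]))

# Start from -1 everywhere, then fill in rows that store an entry.
labels = -np.ones(membership.shape[0], dtype=int)
has_label = np.diff(membership.indptr) > 0

# For a non-empty row i, indices[indptr[i]] is its first stored column.
labels[has_label] = membership.indices[membership.indptr[:-1][has_label]]

print(labels)
```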

TF-IDF

sknetwork.utils.get_tfidf(count_matrix: csr_matrix)

Get the tf-idf from a count matrix in sparse format.

Parameters:

count_matrix (sparse.csr_matrix) – Count matrix, shape (n_documents, n_words).

Returns:

tf_idf – tf-idf matrix, shape (n_documents, n_words).

Return type:

sparse.csr_matrix

References

https://en.wikipedia.org/wiki/Tfidf
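
Example

The exact weighting used by get_tfidf is not specified above. The sketch below computes one common tf-idf variant with SciPy, assuming term frequency is the count divided by the document length and inverse document frequency is log(n_documents / document frequency). The count matrix is made up, and the result may differ from the library's output.

```python
import numpy as np
from scipy import sparse

# Made-up count matrix: 2 documents, 3 words.
counts = sparse.csr_matrix(np.array([[2.0, 0.0, 2.0],
                                     [0.0, 1.0, 3.0]]))
n_documents, n_words = counts.shape

# Term frequency (assumed variant): counts divided by document length.
doc_lengths = np.asarray(counts.sum(axis=1)).ravel()
inv_lengths = np.divide(1.0, doc_lengths, out=np.zeros_like(doc_lengths),
                        where=doc_lengths > 0)
tf = sparse.diags(inv_lengths).dot(counts)

# Inverse document frequency (assumed variant): log(n_documents / document frequency).
doc_freq = np.diff(counts.tocsc().indptr)
idf = np.zeros(n_words)
idf[doc_freq > 0] = np.log(n_documents / doc_freq[doc_freq > 0])

tf_idf = tf.dot(sparse.diags(idf)).tocsr()
print(tf_idf.toarray())
```

The third word appears in both documents, so its idf is log(1) = 0 and its column vanishes in the result.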