Ranking

Node ranking algorithms.

The attribute scores_ assigns a score of importance to each node of the graph.

PageRank

class sknetwork.ranking.PageRank(damping_factor: float = 0.85, solver: str = 'piteration', n_iter: int = 10, tol: float = 1e-06)[source]

PageRank of each node, corresponding to its frequency of visit by a random walk.

The random walk restarts with some fixed probability. The restart distribution can be personalized by the user. This variant is known as Personalized PageRank.

Parameters
  • damping_factor (float) – Probability to continue the random walk.

  • solver (str) –

    • 'piteration', use power iteration for a given number of iterations.

    • 'diteration', use asynchronous parallel diffusion for a given number of iterations.

    • 'lanczos', use eigensolver with a given tolerance.

    • 'bicgstab', use Biconjugate Gradient Stabilized method for a given tolerance.

    • 'RH', use a Ruffini-Horner polynomial evaluation.

    • 'push', use push-based algorithm for a given tolerance

  • n_iter (int) – Number of iterations for some solvers.

  • tol (float) – Tolerance for the convergence of some solvers.

Variables
  • scores_ (np.ndarray) – PageRank score of each node.

  • scores_row_ (np.ndarray) – Scores of rows, for bipartite graphs.

  • scores_col_ (np.ndarray) – Scores of columns, for bipartite graphs.

Example

>>> from sknetwork.ranking import PageRank
>>> from sknetwork.data import house
>>> pagerank = PageRank()
>>> adjacency = house()
>>> weights = {0: 1}
>>> scores = pagerank.fit_predict(adjacency, weights)
>>> np.round(scores, 2)
array([0.29, 0.24, 0.12, 0.12, 0.24])

References

Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web. Stanford InfoLab.

fit(input_matrix: Union[scipy.sparse._csr.csr_matrix, numpy.ndarray, scipy.sparse.linalg._interface.LinearOperator], weights: Optional[Union[numpy.ndarray, dict]] = None, weights_row: Optional[Union[numpy.ndarray, dict]] = None, weights_col: Optional[Union[numpy.ndarray, dict]] = None, force_bipartite: bool = False) sknetwork.ranking.pagerank.PageRank[source]

Fit algorithm to data.

Parameters
  • input_matrix – Adjacency matrix or biadjacency matrix of the graph.

  • weights – Parameter to be used for Personalized PageRank. Restart distribution as a vector or a dict (node: weight). If None, the uniform distribution is used (no personalization, default).

  • weights_row – Parameter to be used for Personalized PageRank on bipartite graphs. Restart distribution as vectors or dicts on rows, columns (node: weight). If both weights_row and weights_col are None (default), the uniform distribution on rows is used.

  • weights_col – Parameter to be used for Personalized PageRank on bipartite graphs. Restart distribution as vectors or dicts on rows, columns (node: weight). If both weights_row and weights_col are None (default), the uniform distribution on rows is used.

  • force_bipartite – If True, consider the input matrix as the biadjacency matrix of a bipartite graph.

Returns

self

Return type

PageRank

fit_predict(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

fit_transform(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Alias for fit_predict. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

get_params()

Get parameters as dictionary.

Returns

params – Parameters of the algorithm.

Return type

dict

set_params(params: dict) sknetwork.base.Algorithm

Set parameters of the algorithm.

Parameters

params (dict) – Parameters of the algorithm.

Returns

self

Return type

Algorithm

Katz

class sknetwork.ranking.Katz(damping_factor: float = 0.5, path_length: int = 4)[source]

Katz centrality, defined by:

\(\sum_{k=1}^K\alpha^k(A^k)^T\mathbf{1}\)

where \(A\) is the adjacency matrix, \(\alpha\) is the damping factor and \(K\) is the path length.

Parameters
  • damping_factor (float) – Damping factor for path contributions.

  • path_length (int) – Maximum length of the paths.

Variables
  • scores_ (np.ndarray) – Score of each node.

  • scores_row_ (np.ndarray) – Scores of rows, for bipartite graphs.

  • scores_col_ (np.ndarray) – Scores of columns, for bipartite graphs.

Examples

>>> from sknetwork.data.toy_graphs import house
>>> adjacency = house()
>>> katz = Katz()
>>> scores = katz.fit_predict(adjacency)
>>> np.round(scores, 2)
array([6.5 , 8.25, 5.62, 5.62, 8.25])

References

Katz, L. (1953). A new status index derived from sociometric analysis. Psychometrika, 18(1), 39-43.

fit(input_matrix: Union[scipy.sparse._csr.csr_matrix, numpy.ndarray, scipy.sparse.linalg._interface.LinearOperator]) sknetwork.ranking.katz.Katz[source]

Katz centrality.

Parameters

input_matrix – Adjacency matrix or biadjacency matrix of the graph.

Returns

self

Return type

Katz

fit_predict(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

fit_transform(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Alias for fit_predict. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

get_params()

Get parameters as dictionary.

Returns

params – Parameters of the algorithm.

Return type

dict

set_params(params: dict) sknetwork.base.Algorithm

Set parameters of the algorithm.

Parameters

params (dict) – Parameters of the algorithm.

Returns

self

Return type

Algorithm

HITS

class sknetwork.ranking.HITS(solver: Union[str, sknetwork.linalg.svd_solver.SVDSolver] = 'lanczos')[source]

Hub and authority scores of each node. For bipartite graphs, the hub score is computed on rows and the authority score on columns.

Parameters

solver ('lanczos' (default, Lanczos algorithm) or SVDSolver (custom solver)) – Which solver to use.

Variables
  • scores_ (np.ndarray) – Hub score of each node.

  • scores_row_ (np.ndarray) – Hub score of each row, for bipartite graphs.

  • scores_col_ (np.ndarray) – Authority score of each column, for bipartite graphs.

Example

>>> from sknetwork.ranking import HITS
>>> from sknetwork.data import star_wars
>>> hits = HITS()
>>> biadjacency = star_wars()
>>> scores = hits.fit_predict(biadjacency)
>>> np.round(scores, 2)
array([0.5 , 0.23, 0.69, 0.46])

References

Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5), 604-632.

fit(adjacency: Union[scipy.sparse._csr.csr_matrix, numpy.ndarray]) sknetwork.ranking.hits.HITS[source]

Compute HITS algorithm with a spectral method.

Parameters

adjacency – Adjacency or biadjacency matrix of the graph.

Returns

self

Return type

HITS

fit_predict(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

fit_transform(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Alias for fit_predict. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

get_params()

Get parameters as dictionary.

Returns

params – Parameters of the algorithm.

Return type

dict

set_params(params: dict) sknetwork.base.Algorithm

Set parameters of the algorithm.

Parameters

params (dict) – Parameters of the algorithm.

Returns

self

Return type

Algorithm

Betweenness centrality

class sknetwork.ranking.Betweenness(normalized: bool = False)

Betweenness centrality, based on Brandes’ algorithm.

Variables

scores_ (np.ndarray) – Betweenness centrality value of each node

Example

>>> from sknetwork.ranking import Betweenness
>>> from sknetwork.data.toy_graphs import bow_tie
>>> betweenness = Betweenness()
>>> adjacency = bow_tie()
>>> scores = betweenness.fit_transform(adjacency)
>>> scores
array([4., 0., 0., 0., 0.])

References

Brandes, Ulrik (2001). A faster algorithm for betweenness centrality. Journal of Mathematical Sociology.

fit(adjacency: Union[scipy.sparse._csr.csr_matrix, numpy.ndarray]) sknetwork.ranking.betweenness.Betweenness

Fit algorithm to data.

fit_predict(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

fit_transform(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Alias for fit_predict. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

get_params()

Get parameters as dictionary.

Returns

params – Parameters of the algorithm.

Return type

dict

set_params(params: dict) sknetwork.base.Algorithm

Set parameters of the algorithm.

Parameters

params (dict) – Parameters of the algorithm.

Returns

self

Return type

Algorithm

Closeness centrality

class sknetwork.ranking.Closeness(method: str = 'exact', tol: float = 0.1)[source]

Ranking by closeness centrality of each node in a connected graph, corresponding to the average length of the shortest paths from that node to all the other ones.

Parameters
  • method – Denotes if the results should be exact or approximate.

  • tol (float) – If method=='approximate', the allowed tolerance on each score entry.

Variables

scores_ (np.ndarray) – Closeness centrality of each node.

Example

>>> from sknetwork.ranking import Closeness
>>> from sknetwork.data import cyclic_digraph
>>> closeness = Closeness()
>>> adjacency = cyclic_digraph(3)
>>> scores = closeness.fit_predict(adjacency)
>>> np.round(scores, 2)
array([0.67, 0.67, 0.67])

References

Eppstein, D., & Wang, J. (2001, January). Fast approximation of centrality. In Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms (pp. 228-229). Society for Industrial and Applied Mathematics.

fit(adjacency: Union[scipy.sparse._csr.csr_matrix, numpy.ndarray]) sknetwork.ranking.closeness.Closeness[source]

Closeness centrality for connected graphs.

Parameters

adjacency – Adjacency matrix of the graph.

Returns

self

Return type

Closeness

fit_predict(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

fit_transform(*args, **kwargs) numpy.ndarray

Fit algorithm to data and return the scores. Alias for fit_predict. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

get_params()

Get parameters as dictionary.

Returns

params – Parameters of the algorithm.

Return type

dict

set_params(params: dict) sknetwork.base.Algorithm

Set parameters of the algorithm.

Parameters

params (dict) – Parameters of the algorithm.

Returns

self

Return type

Algorithm

Post-processing

sknetwork.ranking.top_k(scores: numpy.ndarray, k: int = 1, sort: bool = True)[source]

Return the indices of the k elements of highest values.

Parameters
  • scores (np.ndarray) – Array of values.

  • k (int) – Number of elements to return.

  • sort (bool) – If True, sort the indices in decreasing order of value (element of highest value first).

Examples

>>> top_k([1, 3, 2], k=2)
array([1, 2])