# Ranking¶

Node ranking algorithms.

The attribute scores_ assigns a score of importance to each node of the graph.

## PageRank¶

class sknetwork.ranking.PageRank(damping_factor: float = 0.85, solver: str = 'piteration', n_iter: int = 10, tol: float = 1e-06)[source]

PageRank of each node, corresponding to its frequency of visit by a random walk.

The random walk restarts with some fixed probability. The restart distribution can be personalized by the user. This variant is known as Personalized PageRank.

• Graphs

• Digraphs

Parameters
• damping_factor (float) – Probability to continue the random walk.

• solver (str) –

• 'piteration', use power iteration for a given number of iterations.

• 'diteration', use asynchronous parallel diffusion for a given number of iterations.

• 'lanczos', use eigensolver with a given tolerance.

• 'bicgstab', use Biconjugate Gradient Stabilized method for a given tolerance.

• 'RH', use a Ruffini-Horner polynomial evaluation.

• 'push', use push-based algorithm for a given tolerance

• n_iter (int) – Number of iterations for some solvers.

• tol (float) – Tolerance for the convergence of some solvers.

Variables

scores_ (np.ndarray) – PageRank score of each node.

Example

>>> from sknetwork.ranking import PageRank
>>> from sknetwork.data import house
>>> pagerank = PageRank()
>>> seeds = {0: 1}
>>> np.round(scores, 2)
array([0.29, 0.24, 0.12, 0.12, 0.24])


References

Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The PageRank citation ranking: Bringing order to the web. Stanford InfoLab.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray, scipy.sparse.linalg.interface.LinearOperator], seeds: Optional[Union[numpy.ndarray, dict]] = None)sknetwork.ranking.pagerank.PageRank[source]

Fit algorithm to data.

Parameters

• seeds – Parameter to be used for Personalized PageRank. Restart distribution as a vector or a dict (node: weight). If None, the uniform distribution is used (no personalization, default).

Returns

self

Return type

PageRank

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

class sknetwork.ranking.BiPageRank(damping_factor: float = 0.85, solver: str = 'piteration', n_iter: int = 10, tol: float = 0)[source]

Compute the PageRank of each node through a random walk in the bipartite graph.

• Bigraphs

Parameters
• damping_factor (float) – Probability to continue the random walk.

• solver (str) –

• piteration, use power iteration for a given number of iterations.

• diteration, use asynchronous parallel diffusion for a given number of iterations.

• lanczos, use eigensolver for a given tolerance.

• bicgstab, use Biconjugate Gradient Stabilized method for a given tolerance.

• n_iter (int) – Number of iterations for some solvers.

• tol (float) – Tolerance for the convergence of some solvers.

Variables
• scores_ (np.ndarray) – PageRank score of each row.

• scores_row_ (np.ndarray) – PageRank score of each row (copy of scores_).

• scores_col_ (np.ndarray) – PageRank score of each column.

Example

>>> from sknetwork.ranking import BiPageRank
>>> from sknetwork.data import star_wars
>>> bipagerank = BiPageRank()
>>> seeds = {0: 1}
>>> np.round(scores, 2)
array([0.45, 0.11, 0.28, 0.17])

fit(biadjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray], seeds_row: Optional[Union[numpy.ndarray, dict]] = None, seeds_col: Optional[Union[numpy.ndarray, dict]] = None)sknetwork.ranking.pagerank.BiPageRank[source]

Fit algorithm to data.

Parameters

• seeds_row – Parameter to be used for Personalized BiPageRank. Restart distribution as vectors or dicts on rows, columns (node: weight). If both seeds_row and seeds_col are None (default), the uniform distribution on rows is used.

• seeds_col – Parameter to be used for Personalized BiPageRank. Restart distribution as vectors or dicts on rows, columns (node: weight). If both seeds_row and seeds_col are None (default), the uniform distribution on rows is used.

Returns

self

Return type

BiPageRank

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

## Diffusion¶

class sknetwork.ranking.Diffusion(n_iter: int = 3, damping_factor: Optional[float] = None)[source]

Ranking by diffusion along the edges (heat equation).

• Graphs

• Digraphs

Parameters
• n_iter (int) – Number of steps of the diffusion in discrete time (must be positive).

• damping_factor (float (optional)) – Damping factor (default value = 1).

Variables

scores_ (np.ndarray) – Score of each node (= temperature).

Example

>>> from sknetwork.data import house
>>> diffusion = Diffusion(n_iter=2)
>>> seeds = {0: 1, 2: 0}
>>> np.round(scores, 2)
array([0.58, 0.56, 0.38, 0.58, 0.42])


References

Chung, F. (2007). The heat kernel as the pagerank of a graph. Proceedings of the National Academy of Sciences.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray], seeds: Optional[Union[numpy.ndarray, dict]] = None, init: Optional[float] = None)sknetwork.ranking.diffusion.Diffusion[source]

Compute the diffusion (temperatures at equilibrium).

Parameters

• seeds – Temperatures of seed nodes in initial state (dictionary or vector). Negative temperatures ignored.

• init – Temperature of non-seed nodes in initial state. If None, use the average temperature of seed nodes (default).

Returns

self

Return type

Diffusion

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

class sknetwork.ranking.BiDiffusion(n_iter: int = 3, damping_factor: Optional[float] = None)[source]

Ranking by diffusion along the edges of a bipartite graph (heat equation).

• Bigraphs

Parameters
• n_iter (int) – Number of steps of the diffusion in discrete time (must be positive).

• damping_factor (float (optional)) – Damping factor (default value = 1).

Variables
• scores_ (np.ndarray) – Scores of rows.

• scores_row_ (np.ndarray) – Scores of rows (copy of scores_).

• scores_col_ (np.ndarray) – Scores of columns.

Example

>>> from sknetwork.ranking import BiDiffusion
>>> from sknetwork.data import star_wars
>>> bidiffusion = BiDiffusion(n_iter=2)
>>> scores = bidiffusion.fit_transform(biadjacency, seeds_row = {0: 1, 2: 0})
>>> np.round(scores, 2)
array([0.5 , 0.5 , 0.46, 0.44])

fit(biadjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray], seeds_row: Optional[Union[numpy.ndarray, dict]] = None, seeds_col: Optional[Union[numpy.ndarray, dict]] = None, init: Optional[float] = None)sknetwork.ranking.diffusion.BiDiffusion[source]

Compute the diffusion (temperatures at equilibrium).

Parameters

• seeds_row – Temperatures of seed rows in initial state (dictionary or vector of size n_row). Negative temperatures ignored.

• seeds_col – Temperatures of seed columns in initial state (dictionary or vector of size n_col). Negative temperatures ignored.

• init – Temperature of non-seed nodes in initial state. If None, use the average temperature of seed nodes (default).

Returns

self

Return type

BiDiffusion

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

## Dirichlet¶

class sknetwork.ranking.Dirichlet(n_iter: int = 10, damping_factor: Optional[float] = None, verbose: bool = False)[source]

Ranking by the Dirichlet problem (heat diffusion with boundary constraints).

• Graphs

• Digraphs

Parameters
• n_iter (int) – If positive, number of steps of the diffusion in discrete time. Otherwise, solve the Dirichlet problem by the bi-conjugate gradient stabilized method.

• damping_factor (float (optional)) – Damping factor (default value = 1).

• verbose (bool) – Verbose mode.

Variables

scores_ (np.ndarray) – Score of each node (= temperature).

Example

>>> from sknetwork.ranking import Dirichlet
>>> from sknetwork.data import house
>>> dirichlet = Dirichlet()
>>> seeds = {0: 1, 2: 0}
>>> np.round(scores, 2)
array([1.  , 0.54, 0.  , 0.31, 0.62])


References

Chung, F. (2007). The heat kernel as the pagerank of a graph. Proceedings of the National Academy of Sciences.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray], seeds: Optional[Union[numpy.ndarray, dict]] = None, init: Optional[float] = None)sknetwork.ranking.diffusion.Dirichlet[source]

Compute the solution to the Dirichlet problem (temperatures at equilibrium).

Parameters

• seeds – Temperatures of seed nodes (dictionary or vector). Negative temperatures ignored.

• init – Temperature of non-seed nodes in initial state. If None, use the average temperature of seed nodes (default).

Returns

self

Return type

Dirichlet

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

class sknetwork.ranking.BiDirichlet(n_iter: int = 10, damping_factor: Optional[float] = None, verbose: bool = False)[source]

Ranking by the Dirichlet problem in bipartite graphs (heat diffusion with boundary constraints).

• Bigraphs

Parameters
• n_iter (int) – If positive, number of steps of the diffusion in discrete time. Otherwise, solve the Dirichlet problem by the bi-conjugate gradient stabilized method.

• damping_factor (float (optional)) – Damping factor (default value = 1).

• verbose (bool) – Verbose mode.

Variables
• scores_ (np.ndarray) – Scores of rows.

• scores_row_ (np.ndarray) – Scores of rows (copy of scores_).

• scores_col_ (np.ndarray) – Scores of columns.

Example

>>> from sknetwork.ranking import BiDirichlet
>>> from sknetwork.data import star_wars
>>> bidirichlet = BiDirichlet()
>>> scores = bidirichlet.fit_transform(biadjacency, seeds_row = {0: 1, 2: 0})
>>> np.round(scores, 2)
array([1.  , 0.5 , 0.  , 0.29])

fit(biadjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray], seeds_row: Optional[Union[numpy.ndarray, dict]] = None, seeds_col: Optional[Union[numpy.ndarray, dict]] = None, init: Optional[float] = None)sknetwork.ranking.diffusion.BiDirichlet[source]

Compute the solution to the Dirichlet problem (temperatures at equilibrium).

Parameters

• seeds_row – Temperatures of seed rows (dictionary or vector of size n_row). Negative temperatures ignored.

• seeds_col – Temperatures of seed columns (dictionary or vector of size n_col). Negative temperatures ignored.

• init – Temperature of non-seed nodes in initial state. If None, use the average temperature of seed nodes (default).

Returns

self

Return type

BiDirichlet

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

## Katz¶

class sknetwork.ranking.Katz(damping_factor: float = 0.5, path_length: int = 4)[source]

Katz centrality:

$$\sum_{k=1}^K\alpha^k(A^k)^T\mathbf{1}$$.

• Graphs

• Digraphs

Parameters
• damping_factor (float) – Decay parameter for path contributions.

• path_length (int) – Maximum length of the paths to take into account.

Examples

>>> from sknetwork.data.toy_graphs import house
>>> katz = Katz()
>>> np.round(scores, 2)
array([6.5 , 8.25, 5.62, 5.62, 8.25])


References

Katz, L. (1953). A new status index derived from sociometric analysis. Psychometrika, 18(1), 39-43.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray, scipy.sparse.linalg.interface.LinearOperator])sknetwork.ranking.katz.Katz[source]

Katz centrality.

Parameters

Returns

self

Return type

Katz

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

class sknetwork.ranking.BiKatz(damping_factor: float = 0.5, path_length: int = 4)[source]

Katz centrality for bipartite graphs.

• Bigraphs

Parameters
• damping_factor (float) – Decay parameter for path contributions.

• path_length (int) – Maximum length of the paths to take into account.

Examples

>>> from sknetwork.data.toy_graphs import star_wars
>>> bikatz = BiKatz()
>>> np.round(scores, 2)
array([6.38, 3.06, 8.81, 5.75])


References

Katz, L. (1953). A new status index derived from sociometric analysis. Psychometrika, 18(1), 39-43.

fit(biadjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.ranking.katz.BiKatz[source]

Katz centrality.

Parameters

Returns

self

Return type

BiKatz

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

## HITS¶

class sknetwork.ranking.HITS(solver: Union[str, sknetwork.linalg.svd_solver.SVDSolver] = 'auto', **kwargs)[source]

Hub and authority scores of each node. For bipartite graphs, the hub score is computed on rows and the authority score on columns.

• Graphs

• Digraphs

• Bigraphs

Parameters
• solver ('auto', 'halko', 'lanczos' or SVDSolver) –

Which singular value solver to use.

• 'auto' call the auto_solver function.

• 'halko': randomized method, fast but less accurate than 'lanczos' for ill-conditioned matrices.

• 'lanczos': power-iteration based method.

• SVDSolver: custom solver.

Variables
• scores_ (np.ndarray) – Hub score of each row.

• scores_row_ (np.ndarray) – Hub score of each row (copy of scores_row_).

• scores_col_ (np.ndarray) – Authority score of each column.

Example

>>> from sknetwork.ranking import HITS
>>> from sknetwork.data import star_wars
>>> hits = HITS()
>>> np.round(scores, 2)
array([0.5 , 0.23, 0.69, 0.46])


References

Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM), 46(5), 604-632.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.ranking.hits.HITS[source]

Compute HITS algorithm with a spectral method.

Parameters

Returns

self

Return type

HITS

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

## Betweenness centrality¶

class sknetwork.ranking.Betweenness(normalized: bool = False)

Betweenness centrality, based on Brandes’ algorithm.

• Graphs

Variables

scores_ (np.ndarray) – Betweenness centrality value of each node

Example

>>> from sknetwork.ranking import Betweenness
>>> from sknetwork.data.toy_graphs import bow_tie
>>> betweenness = Betweenness()
>>> scores
array([4., 0., 0., 0., 0.])


References

Brandes, Ulrik (2001). A faster algorithm for betweenness centrality. Journal of Mathematical Sociology.

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

## Closeness centrality¶

class sknetwork.ranking.Closeness(method: str = 'exact', tol: float = 0.1, n_jobs: Optional[int] = None)[source]

Closeness centrality of each node in a connected graph, corresponding to the average length of the shortest paths from that node to all the other ones.

For a directed graph, the closeness centrality is computed in terms of outgoing paths.

• Graphs

• Digraphs

Parameters
• method – Denotes if the results should be exact or approximate.

• tol (float) – If method=='approximate', the allowed tolerance on each score entry.

• n_jobs – If an integer value is given, denotes the number of workers to use (-1 means the maximum number will be used). If None, no parallel computations are made.

Variables

scores_ (np.ndarray) – Closeness centrality of each node.

Example

>>> from sknetwork.ranking import Closeness
>>> from sknetwork.data import cyclic_digraph
>>> closeness = Closeness()
>>> np.round(scores, 2)
array([0.67, 0.67, 0.67])


References

Eppstein, D., & Wang, J. (2001, January). Fast approximation of centrality. In Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms (pp. 228-229). Society for Industrial and Applied Mathematics.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.ranking.closeness.Closeness[source]

Closeness centrality for connected graphs.

Parameters

Returns

self

Return type

Closeness

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

## Harmonic centrality¶

class sknetwork.ranking.Harmonic(n_jobs: Optional[int] = None)[source]

Harmonic centrality of each node in a connected graph, corresponding to the average inverse length of the shortest paths from that node to all the other ones.

For a directed graph, the harmonic centrality is computed in terms of outgoing paths.

• Graphs

• Digraphs

Parameters

n_jobs – If an integer value is given, denotes the number of workers to use (-1 means the maximum number will be used). If None, no parallel computations are made.

Variables

scores_ (np.ndarray) – Harmonic centrality of each node.

Example

>>> from sknetwork.ranking import Harmonic
>>> from sknetwork.data import house
>>> harmonic = Harmonic()
>>> np.round(scores, 2)
array([3. , 3.5, 3. , 3. , 3.5])


References

Marchiori, M., & Latora, V. (2000). Harmony in the small-world. Physica A: Statistical Mechanics and its Applications, 285(3-4), 539-546.

fit(adjacency: Union[scipy.sparse.csr.csr_matrix, numpy.ndarray])sknetwork.ranking.harmonic.Harmonic[source]

Harmonic centrality for connected graphs.

Parameters

Returns

self

Return type

Harmonic

fit_transform(*args, **kwargs)numpy.ndarray

Fit algorithm to data and return the scores. Same parameters as the fit method.

Returns

scores – Scores.

Return type

np.ndarray

## Post-processing¶

sknetwork.ranking.top_k(scores: numpy.ndarray, k: int = 1)[source]

Index of the k elements with highest value.

Parameters
• scores (np.ndarray) – Array of values.

• k (int) – Number of elements to return.

Examples

>>> scores = np.array([0, 1, 0, 0.5])
>>> top_k(scores, k=2)
array([1, 3])


Notes

This is a basic implementation that sorts the entire array to find its top k elements.