Regression

Regression algorithms.

After fitting, the attribute values_ holds the value assigned to each node of the graph.

Diffusion

class sknetwork.regression.Diffusion(n_iter: int = 3)

Regression by diffusion along the edges, given the temperatures of some seed nodes (heat equation).

All values are updated, including those of seed nodes (free diffusion). See Dirichlet for diffusion with boundary constraints.

Parameters

n_iter (int) – Number of iterations of the diffusion (must be positive).

Variables
  • values_ (np.ndarray) – Value of each node (= temperature).

  • values_row_ (np.ndarray) – Values of rows, for bipartite graphs.

  • values_col_ (np.ndarray) – Values of columns, for bipartite graphs.

Example

>>> import numpy as np
>>> from sknetwork.regression import Diffusion
>>> from sknetwork.data import house
>>> diffusion = Diffusion(n_iter=2)
>>> adjacency = house()
>>> seeds = {0: 1, 2: 0}
>>> values = diffusion.fit_predict(adjacency, seeds)
>>> np.round(values, 2)
array([0.58, 0.56, 0.38, 0.58, 0.42])
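The numbers in the example above can be approximated with a few lines of plain NumPy. This is a minimal sketch under stated assumptions, not the library's implementation: it assumes that each iteration replaces the temperature of every node by the mean temperature of its neighbors, that non-seed nodes start at the mean seed temperature, and that the house graph has the edge list written out in `edges`.

```python
import numpy as np

# Assumed edge list of the house graph (5 nodes, 6 undirected edges).
edges = [(0, 1), (0, 4), (1, 2), (1, 4), (2, 3), (3, 4)]
n_nodes = 5

# Dense symmetric adjacency matrix built from the edge list.
adjacency = np.zeros((n_nodes, n_nodes))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1

# Initial state: seeds keep their temperature, the other nodes
# start at the mean seed temperature (assumed default for init=None).
seeds = {0: 1, 2: 0}
values = np.full(n_nodes, np.mean(list(seeds.values())))
for node, temperature in seeds.items():
    values[node] = temperature

# Free diffusion: every node, seeds included, takes the mean of its neighbors.
degrees = adjacency.sum(axis=1)
for _ in range(2):  # two iterations, as with n_iter=2 in the example above
    values = adjacency.dot(values) / degrees

print(np.round(values, 2))
```

Under these assumptions the printed vector should agree with the documented output up to rounding. Note that the seed nodes drift away from their initial temperatures, which is what distinguishes free diffusion from the Dirichlet variant.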

References

Chung, F. (2007). The heat kernel as the pagerank of a graph. Proceedings of the National Academy of Sciences.

fit(input_matrix: Union[scipy.sparse._csr.csr_matrix, numpy.ndarray], seeds: Optional[Union[numpy.ndarray, dict]] = None, seeds_row: Optional[Union[numpy.ndarray, dict]] = None, seeds_col: Optional[Union[numpy.ndarray, dict]] = None, init: Union[None, float] = None, force_bipartite: bool = False) → sknetwork.regression.diffusion.Diffusion

Compute the diffusion (temperatures at equilibrium).

Parameters
  • input_matrix – Adjacency matrix or biadjacency matrix of the graph.

  • seeds – Temperatures of seed nodes in initial state (dictionary or vector). Negative temperatures ignored.

  • seeds_row – Temperatures of seed rows, for bipartite graphs (dictionary or vector). Negative temperatures are ignored.

  • seeds_col – Temperatures of seed columns, for bipartite graphs (dictionary or vector). Negative temperatures are ignored.

  • init – Temperature of non-seed nodes in initial state. If None, use the average temperature of seed nodes (default).

  • force_bipartite – If True, consider the input matrix as a biadjacency matrix (default = False).

Returns

self

Return type

Diffusion
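The roles of the seeds and init parameters can be illustrated with a small helper. The function `initial_temperatures` below is hypothetical (it is not part of the library); it only sketches the behavior described in the parameter list: seeds may be given as a dictionary or a vector, negative entries are treated as non-seeds, and non-seed nodes receive `init`, or the mean seed temperature when `init` is None.

```python
import numpy as np

def initial_temperatures(n_nodes, seeds, init=None):
    """Hypothetical helper: build the initial temperature vector.

    Returns the vector of initial values and a boolean mask of seed nodes.
    """
    if isinstance(seeds, dict):
        # Dictionary {node: temperature}; -1 marks nodes that are not seeds.
        values = np.full(n_nodes, -1.0)
        for node, temperature in seeds.items():
            values[node] = temperature
    else:
        # Vector of length n_nodes; negative entries are not seeds.
        values = np.asarray(seeds, dtype=float).copy()
    is_seed = values >= 0
    if not is_seed.any():
        raise ValueError("At least one seed with a non-negative temperature is required.")
    if init is None:
        # Default: mean temperature of the seed nodes.
        init = values[is_seed].mean()
    values[~is_seed] = init
    return values, is_seed

values, is_seed = initial_temperatures(5, {0: 1, 2: 0})
print(values)
print(is_seed)
```

With the seeds of the class example, nodes 0 and 2 keep temperatures 1 and 0, and the three remaining nodes start at the mean seed temperature.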

fit_predict(*args, **kwargs) → numpy.ndarray

Fit the algorithm to the data and return the values. Same parameters as the fit method.

Returns

values – Values of the nodes (temperatures).

Return type

np.ndarray

fit_transform(*args, **kwargs) → numpy.ndarray

Fit the algorithm to the data and return the values. Alias for fit_predict. Same parameters as the fit method.

Returns

values – Values of the nodes (temperatures).

Return type

np.ndarray

Dirichlet

class sknetwork.regression.Dirichlet(n_iter: int = 10)

Regression by the Dirichlet problem, given the temperature of some seed nodes (heat diffusion with boundary constraints).

Only values of non-seed nodes are updated. The temperatures of seed nodes are fixed.

Parameters

n_iter (int) – Number of iterations of the diffusion (must be positive).

Variables
  • values_ (np.ndarray) – Value of each node (= temperature).

  • values_row_ (np.ndarray) – Values of rows, for bipartite graphs.

  • values_col_ (np.ndarray) – Values of columns, for bipartite graphs.

Example

>>> import numpy as np
>>> from sknetwork.regression import Dirichlet
>>> from sknetwork.data import house
>>> dirichlet = Dirichlet()
>>> adjacency = house()
>>> seeds = {0: 1, 2: 0}
>>> values = dirichlet.fit_predict(adjacency, seeds)
>>> np.round(values, 2)
array([1.  , 0.54, 0.  , 0.31, 0.62])
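The boundary constraint can be made concrete with a NumPy sketch. It is an illustration under assumptions, not the library's code: it assumes the same neighbor-averaging update as free diffusion, followed by resetting the seed nodes to their fixed temperatures after each iteration, and it uses an assumed edge list for the house graph. For comparison, the sketch also computes the exact equilibrium by solving the linear system that a harmonic function satisfies on the non-seed nodes.

```python
import numpy as np

# Assumed edge list of the house graph (5 nodes, 6 undirected edges).
edges = [(0, 1), (0, 4), (1, 2), (1, 4), (2, 3), (3, 4)]
n_nodes = 5
adjacency = np.zeros((n_nodes, n_nodes))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1
degrees = adjacency.sum(axis=1)

seeds = {0: 1.0, 2: 0.0}
seed_nodes = np.array(list(seeds.keys()))
seed_values = np.array(list(seeds.values()))
free_nodes = np.setdiff1d(np.arange(n_nodes), seed_nodes)

# Iterative version: average over neighbors, then re-impose the boundary.
values = np.full(n_nodes, seed_values.mean())
values[seed_nodes] = seed_values
for _ in range(10):  # ten iterations, the default n_iter of Dirichlet
    values = adjacency.dot(values) / degrees
    values[seed_nodes] = seed_values  # seed temperatures stay fixed

# Exact equilibrium: on non-seed nodes, (D - A)[free, free] x = A[free, seed] s.
laplacian = np.diag(degrees) - adjacency
exact = np.zeros(n_nodes)
exact[seed_nodes] = seed_values
exact[free_nodes] = np.linalg.solve(
    laplacian[np.ix_(free_nodes, free_nodes)],
    adjacency[np.ix_(free_nodes, seed_nodes)].dot(seed_values),
)

print(np.round(values, 2))
print(np.round(exact, 2))
```

On this small graph the iterative values are already close to the exact solution after ten iterations; larger or poorly connected graphs may need a larger n_iter to get as close.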

References

Chung, F. (2007). The heat kernel as the pagerank of a graph. Proceedings of the National Academy of Sciences.

fit(input_matrix: Union[scipy.sparse._csr.csr_matrix, numpy.ndarray], seeds: Optional[Union[numpy.ndarray, dict]] = None, seeds_row: Optional[Union[numpy.ndarray, dict]] = None, seeds_col: Optional[Union[numpy.ndarray, dict]] = None, init: Union[None, float] = None, force_bipartite: bool = False) → sknetwork.regression.diffusion.Dirichlet

Compute the solution to the Dirichlet problem (temperatures at equilibrium).

Parameters
  • input_matrix – Adjacency matrix or biadjacency matrix of the graph.

  • seeds – Temperatures of seed nodes (dictionary or vector). Negative temperatures ignored.

  • seeds_row – Temperatures of seed rows, for bipartite graphs (dictionary or vector). Negative temperatures are ignored.

  • seeds_col – Temperatures of seed columns, for bipartite graphs (dictionary or vector). Negative temperatures are ignored.

  • init – Temperature of non-seed nodes in initial state. If None, use the average temperature of seed nodes (default).

  • force_bipartite – If True, consider the input matrix as a biadjacency matrix (default = False).

Returns

self

Return type

Dirichlet
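For bipartite graphs, the seeds_row and seeds_col parameters and the values_row_ and values_col_ attributes can be understood through the following sketch. It assumes, without claiming this is how the library proceeds internally, that a biadjacency matrix B is treated as the undirected graph with block adjacency [[0, B], [Bᵀ, 0]], with rows indexed first and columns second. The matrix and seeds are made up for illustration.

```python
import numpy as np

# Hypothetical biadjacency matrix: 2 rows and 3 columns.
biadjacency = np.array([[1, 1, 0],
                        [0, 1, 1]])
n_row, n_col = biadjacency.shape

# Block adjacency of the bipartite graph: rows first, then columns.
adjacency = np.block([
    [np.zeros((n_row, n_row)), biadjacency],
    [biadjacency.T, np.zeros((n_col, n_col))],
])

# Seeds as vectors; negative entries mark non-seed nodes.
seeds_row = np.array([1.0, -1.0])        # row 0 is held at temperature 1
seeds_col = np.array([-1.0, -1.0, 0.0])  # column 2 is held at temperature 0
seeds = np.concatenate([seeds_row, seeds_col])

# Dirichlet-style iteration on the combined graph.
is_seed = seeds >= 0
values = np.where(is_seed, seeds, seeds[is_seed].mean())
degrees = adjacency.sum(axis=1)
for _ in range(10):
    values = adjacency.dot(values) / degrees
    values[is_seed] = seeds[is_seed]  # seed temperatures stay fixed

# Split the result back into row values and column values.
values_row, values_col = values[:n_row], values[n_row:]
print(np.round(values_row, 2))
print(np.round(values_col, 2))
```

The split at `n_row` mirrors the distinction between values of rows and values of columns; the ordering of rows before columns is an assumption of this sketch.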

fit_predict(*args, **kwargs) → numpy.ndarray

Fit the algorithm to the data and return the values. Same parameters as the fit method.

Returns

values – Values of the nodes (temperatures).

Return type

np.ndarray

fit_transform(*args, **kwargs) → numpy.ndarray

Fit the algorithm to the data and return the values. Alias for fit_predict. Same parameters as the fit method.

Returns

values – Values of the nodes (temperatures).

Return type

np.ndarray