Regression

Regression algorithms.

The attribute values_ assigns a value to each node of the graph.

Diffusion

class sknetwork.regression.Diffusion(n_iter: int = 3, damping_factor: float = 0.5)

Regression by diffusion along the edges, given the temperatures of some seed nodes (heat equation).

The vector of temperatures \(T\) evolves like:

\(T \gets (1-\alpha) T + \alpha PT\)

where \(\alpha\) is the damping factor and \(P\) is the transition matrix of the random walk in the graph.

All values are updated, including those of seed nodes (free diffusion). See Dirichlet for diffusion with boundary constraints.
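As a minimal sketch of this update rule, the NumPy function below applies \(T \gets (1-\alpha) T + \alpha PT\) to a dense adjacency matrix. The function name `diffuse`, the edge list and the dense representation are illustrative assumptions, not the library's implementation. Non-seed nodes are assumed to start at the average temperature of the seed nodes.

```python
import numpy as np


def diffuse(adjacency, seeds, n_iter=3, damping_factor=0.5):
    """Free diffusion (illustrative sketch): T <- (1 - alpha) T + alpha P T."""
    adjacency = np.asarray(adjacency, dtype=float)
    # Transition matrix P of the random walk: rows of the adjacency matrix
    # divided by the node degrees (isolated nodes keep a zero row).
    degrees = adjacency.sum(axis=1)
    transition = adjacency / np.where(degrees > 0, degrees, 1)[:, None]
    # Assumption: non-seed nodes start at the average seed temperature.
    start = np.mean(list(seeds.values()))
    temperatures = np.full(adjacency.shape[0], start, dtype=float)
    for node, value in seeds.items():
        temperatures[node] = value
    # Every node is updated at each step, seed nodes included.
    for _ in range(n_iter):
        neighbor_average = transition @ temperatures
        temperatures = (1 - damping_factor) * temperatures + damping_factor * neighbor_average
    return temperatures


# Hypothetical 5-node undirected graph, given as an edge list.
edges = [(0, 1), (0, 4), (1, 2), (1, 4), (2, 3), (3, 4)]
adjacency = np.zeros((5, 5))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1

temperatures = diffuse(adjacency, seeds={0: 1, 2: 0}, n_iter=1)
```

With a single iteration and the default damping factor, seed node 0 moves from 1 to 0.75, halfway toward the average temperature (0.5) of its two neighbors.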

Parameters:
  • n_iter (int) – Number of iterations of the diffusion (must be positive).

  • damping_factor (float) – Damping factor \(\alpha\) of the update, i.e. the weight given to the neighbors' average at each iteration.

Variables:
  • values_ (np.ndarray) – Value of each node (= temperature).

  • values_row_ (np.ndarray) – Values of rows, for bipartite graphs.

  • values_col_ (np.ndarray) – Values of columns, for bipartite graphs.

Example

>>> import numpy as np
>>> from sknetwork.regression import Diffusion
>>> from sknetwork.data import house
>>> diffusion = Diffusion(n_iter=1)
>>> adjacency = house()
>>> values = {0: 1, 2: 0}
>>> values_pred = diffusion.fit_predict(adjacency, values)
>>> np.round(values_pred, 1)
array([0.8, 0.5, 0.2, 0.4, 0.6])

References

Chung, F. (2007). The heat kernel as the pagerank of a graph. Proceedings of the National Academy of Sciences.

fit(input_matrix: csr_matrix | ndarray, values: ndarray | dict | None = None, values_row: ndarray | dict | None = None, values_col: ndarray | dict | None = None, init: None | float = None, force_bipartite: bool = False) → Diffusion

Compute the diffusion (temperatures at equilibrium).

Parameters:
  • input_matrix – Adjacency matrix or biadjacency matrix of the graph.

  • values – Temperatures of nodes in initial state (dictionary or vector). Negative temperatures are ignored.

  • values_row – Temperatures of rows, for bipartite graphs. Negative temperatures are ignored.

  • values_col – Temperatures of columns, for bipartite graphs. Negative temperatures are ignored.

  • init – Temperature of non-seed nodes in initial state. If None, use the average temperature of seed nodes (default).

  • force_bipartite – If True, consider the input matrix as a biadjacency matrix (default = False).

Returns:

self

Return type:

Diffusion
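The way the values and init parameters combine can be sketched as follows. This is a hedged illustration, not the library's code: the helper name `initial_temperatures` and the use of -1 as the marker for unknown nodes are assumptions. Only the rules stated above (negative temperatures ignored, average seed temperature as default) come from the parameter descriptions.

```python
import numpy as np


def initial_temperatures(values, n_nodes, init=None):
    """Build the initial temperature vector (hypothetical helper, illustrative only)."""
    if isinstance(values, dict):
        # Dictionary form: node index -> temperature; other nodes marked unknown (-1).
        vector = -np.ones(n_nodes)
        for node, value in values.items():
            vector[node] = value
    else:
        # Vector form: one entry per node, negative entries meaning "unknown".
        vector = np.array(values, dtype=float)
    # Negative temperatures are ignored: only non-negative entries are seeds.
    is_seed = vector >= 0
    # Remaining nodes start at init, or at the average seed temperature if init is None.
    fill = vector[is_seed].mean() if init is None else init
    vector[~is_seed] = fill
    return vector, is_seed


from_dict, seeds_dict = initial_temperatures({0: 1, 2: 0}, n_nodes=5)
from_vector, seeds_vector = initial_temperatures([1, -1, 0, -1, -1], n_nodes=5, init=0.2)
```

Both calls mark nodes 0 and 2 as seeds. The first fills the other nodes with the seed average 0.5, the second with the explicit init value 0.2.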

fit_predict(*args, **kwargs) → ndarray

Fit algorithm to data and return the values. Same parameters as the fit method.

Returns:

values – Predicted value of each node (temperature).

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

predict(columns: bool = False) → ndarray

Return the values predicted by the algorithm.

Parameters:

columns (bool) – If True, return the prediction for columns.

Returns:

values – Predicted values (for columns if columns is True).

Return type:

np.ndarray

set_params(params: dict) → Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm

Dirichlet

class sknetwork.regression.Dirichlet(n_iter: int = 10)

Regression by the Dirichlet problem (heat diffusion with boundary constraints).

The temperatures of some seed nodes are fixed. The temperatures of other nodes are computed.
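A minimal sketch of the idea, assuming the solution is approached by repeated neighbor averaging with the seed temperatures restored after every step. The function name `dirichlet`, the edge list and this particular iteration scheme are illustrative assumptions; the library's actual solver may differ.

```python
import numpy as np


def dirichlet(adjacency, seeds, n_iter=10):
    """Heat diffusion with boundary constraints (illustrative sketch)."""
    adjacency = np.asarray(adjacency, dtype=float)
    # Transition matrix of the random walk (rows normalized by degree).
    degrees = adjacency.sum(axis=1)
    transition = adjacency / np.where(degrees > 0, degrees, 1)[:, None]
    seed_nodes = np.array(list(seeds.keys()))
    seed_values = np.array(list(seeds.values()), dtype=float)
    # Assumption: free nodes start at the average seed temperature.
    temperatures = np.full(adjacency.shape[0], seed_values.mean())
    temperatures[seed_nodes] = seed_values
    for _ in range(n_iter):
        # Each node takes the average temperature of its neighbors...
        temperatures = transition @ temperatures
        # ...then the boundary condition is enforced on the seed nodes.
        temperatures[seed_nodes] = seed_values
    return temperatures


# Hypothetical 5-node undirected graph, given as an edge list.
edges = [(0, 1), (0, 4), (1, 2), (1, 4), (2, 3), (3, 4)]
adjacency = np.zeros((5, 5))
for i, j in edges:
    adjacency[i, j] = adjacency[j, i] = 1

temperatures = dirichlet(adjacency, seeds={0: 1, 2: 0}, n_iter=100)
```

At equilibrium, each free node has the average temperature of its neighbors, while seed nodes 0 and 2 keep their fixed values 1 and 0.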

Parameters:

n_iter (int) – Number of iterations of the diffusion (must be positive).

Variables:
  • values_ (np.ndarray) – Value of each node (= temperature).

  • values_row_ (np.ndarray) – Values of rows, for bipartite graphs.

  • values_col_ (np.ndarray) – Values of columns, for bipartite graphs.

Example

>>> import numpy as np
>>> from sknetwork.regression import Dirichlet
>>> from sknetwork.data import house
>>> dirichlet = Dirichlet()
>>> adjacency = house()
>>> values = {0: 1, 2: 0}
>>> values_pred = dirichlet.fit_predict(adjacency, values)
>>> np.round(values_pred, 2)
array([1.  , 0.54, 0.  , 0.31, 0.62])

References

Chung, F. (2007). The heat kernel as the pagerank of a graph. Proceedings of the National Academy of Sciences.

fit(input_matrix: csr_matrix | ndarray, values: ndarray | dict | None = None, values_row: ndarray | dict | None = None, values_col: ndarray | dict | None = None, init: None | float = None, force_bipartite: bool = False) → Dirichlet

Compute the solution to the Dirichlet problem (temperatures at equilibrium).

Parameters:
  • input_matrix – Adjacency matrix or biadjacency matrix of the graph.

  • values – Temperatures of nodes (dictionary or vector). Negative temperatures are ignored.

  • values_row – Temperatures of rows, for bipartite graphs. Negative temperatures are ignored.

  • values_col – Temperatures of columns, for bipartite graphs. Negative temperatures are ignored.

  • init – Temperature of non-seed nodes in initial state. If None, use the average temperature of seed nodes (default).

  • force_bipartite – If True, consider the input matrix as a biadjacency matrix (default = False).

Returns:

self

Return type:

Dirichlet
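For bipartite graphs, one way to picture the role of the biadjacency matrix and of values_row and values_col is to treat rows and columns as the two node sets of a single undirected graph. The sketch below rests on that assumption and does not describe the library's internals; the helper name `bipartite_to_adjacency` and the use of -1 for unknown temperatures are hypothetical.

```python
import numpy as np


def bipartite_to_adjacency(biadjacency):
    """Adjacency matrix of the bipartite graph, rows first, then columns (sketch)."""
    biadjacency = np.asarray(biadjacency, dtype=float)
    n_row, n_col = biadjacency.shape
    return np.block([
        [np.zeros((n_row, n_row)), biadjacency],
        [biadjacency.T, np.zeros((n_col, n_col))],
    ])


# Hypothetical biadjacency matrix with 2 rows and 3 columns.
biadjacency = np.array([[1, 0, 1], [0, 1, 1]])
adjacency = bipartite_to_adjacency(biadjacency)
n_row = biadjacency.shape[0]

# Seed temperatures per side, with -1 marking unknown nodes (assumed convention).
values_row = np.array([1, -1])
values_col = np.array([-1, 0, -1])
# One vector over all nodes: rows first, then columns.
values = np.concatenate([values_row, values_col])
# A per-node result can be split back into row and column parts the same way.
row_part, col_part = values[:n_row], values[n_row:]
```

Under this layout, row i is node i and column j is node n_row + j of the combined graph, so a diffusion on the 5 x 5 adjacency matrix yields one value per row and one per column.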

fit_predict(*args, **kwargs) → ndarray

Fit algorithm to data and return the values. Same parameters as the fit method.

Returns:

values – Predicted value of each node (temperature).

Return type:

np.ndarray

get_params()

Get parameters as dictionary.

Returns:

params – Parameters of the algorithm.

Return type:

dict

predict(columns: bool = False) → ndarray

Return the values predicted by the algorithm.

Parameters:

columns (bool) – If True, return the prediction for columns.

Returns:

values – Predicted values (for columns if columns is True).

Return type:

np.ndarray

set_params(params: dict) → Algorithm

Set parameters of the algorithm.

Parameters:

params (dict) – Parameters of the algorithm.

Returns:

self

Return type:

Algorithm