{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# Load your data"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"In scikit-network, a graph is represented by its [adjacency matrix](https://en.wikipedia.org/wiki/Adjacency_matrix) (or biadjacency matrix for a bipartite graph) in the [Compressed Sparse Row](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html) format of SciPy.\n",
"\n",
"In this tutorial, we present a few methods to instantiate a graph in this format."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"from IPython.display import SVG\n",
"\n",
"import numpy as np\n",
"from scipy import sparse\n",
"import pandas as pd\n",
"\n",
"from sknetwork.data import from_edge_list, from_adjacency_list, from_graphml, from_csv\n",
"from sknetwork.visualization import visualize_graph, visualize_bigraph"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## From a NumPy array\n",
"For small graphs, you can instantiate the adjacency matrix as a dense NumPy array and convert it into a sparse matrix in CSR format."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"image/svg+xml": [
""
],
"text/plain": [
""
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"adjacency = np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 0]])\n",
"adjacency = sparse.csr_matrix(adjacency)\n",
"\n",
"image = visualize_graph(adjacency)\n",
"SVG(image)"
]
},
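{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"For larger graphs, a dense array may not fit in memory. As a minimal sketch (the names ``row``, ``col``, ``upper`` and ``adjacency_sparse`` are illustrative and not part of scikit-network), the same 4-node graph can be built directly in sparse form from the coordinates of its edges, using only SciPy."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"from scipy import sparse\n",
"\n",
"# Each undirected edge of the graph above is listed once, as (row[i], col[i]).\n",
"row = np.array([0, 0, 1, 1])\n",
"col = np.array([1, 2, 2, 3])\n",
"data = np.ones(len(row), dtype=int)\n",
"\n",
"# Build the upper-triangular part, then add its transpose to make the matrix symmetric.\n",
"upper = sparse.csr_matrix((data, (row, col)), shape=(4, 4))\n",
"adjacency_sparse = sparse.csr_matrix(upper + upper.T)\n",
"\n",
"# Check against the dense construction used above.\n",
"dense = np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 0]])\n",
"assert (adjacency_sparse.toarray() == dense).all()"
]
},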
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## From an edge list\n",
"Another natural way to build a graph is from a list of edges."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"image/svg+xml": [
""
],
"text/plain": [
""
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"edge_list = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]\n",
"adjacency = from_edge_list(edge_list)\n",
"\n",
"image = visualize_graph(adjacency)\n",
"SVG(image)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"By default, the graph is undirected, but you can easily make it directed."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"image/svg+xml": [
""
],
"text/plain": [
""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"adjacency = from_edge_list(edge_list, directed=True)\n",
"\n",
"image = visualize_graph(adjacency)\n",
"SVG(image)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"You might also want to add weights to your edges. Just use triplets instead of pairs!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"edge_list = [(0, 1, 1), (1, 2, 0.5), (2, 3, 1), (3, 0, 0.5), (0, 2, 2)]\n",
"adjacency = from_edge_list(edge_list)\n",
"\n",
"image = visualize_graph(adjacency)\n",
"SVG(image)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"You can instantiate a bipartite graph as well."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"edge_list = [(0, 0), (1, 0), (1, 1), (2, 1)]\n",
"biadjacency = from_edge_list(edge_list, bipartite=True)\n",
"\n",
"image = visualize_bigraph(biadjacency)\n",
"SVG(image)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"If nodes are not indexed, you get an object of type ``Bunch`` with graph attributes (node names)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"edge_list = [(\"Alice\", \"Bob\"), (\"Bob\", \"Carey\"), (\"Alice\", \"David\"), (\"Carey\", \"David\"), (\"Bob\", \"David\")]\n",
"graph = from_edge_list(edge_list)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"adjacency = graph.adjacency\n",
"names = graph.names"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"image = visualize_graph(adjacency, names=names)\n",
"SVG(image)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"By default, the weight of each edge is the number of occurrences of the corresponding link:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"edge_list_new = edge_list + [(\"Alice\", \"Bob\"), (\"Alice\", \"David\"), (\"Alice\", \"Bob\")]\n",
"graph = from_edge_list(edge_list_new)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"adjacency = graph.adjacency\n",
"names = graph.names"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"image = visualize_graph(adjacency, names=names)\n",
"SVG(image)"
]
},
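{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"To make the counting rule concrete, here is a sketch that reproduces it with SciPy alone. It relies on the fact that SciPy sums duplicate coordinates when building a CSR matrix; this is an analogy, not necessarily how ``from_edge_list`` is implemented. The names ``node_names``, ``index``, ``edges`` and ``counts`` are illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"from scipy import sparse\n",
"\n",
"node_names = ['Alice', 'Bob', 'Carey', 'David']\n",
"index = {name: i for i, name in enumerate(node_names)}\n",
"\n",
"# Same links as edge_list_new: Alice-Bob occurs 3 times, Alice-David twice.\n",
"edges = [('Alice', 'Bob'), ('Bob', 'Carey'), ('Alice', 'David'), ('Carey', 'David'),\n",
"         ('Bob', 'David'), ('Alice', 'Bob'), ('Alice', 'David'), ('Alice', 'Bob')]\n",
"row = np.array([index[u] for u, _ in edges])\n",
"col = np.array([index[v] for _, v in edges])\n",
"\n",
"# Duplicate (row, col) pairs are summed, so each entry counts the occurrences of a link.\n",
"counts = sparse.csr_matrix((np.ones(len(edges), dtype=int), (row, col)), shape=(4, 4))\n",
"# Symmetrize, since the graph is undirected.\n",
"counts = sparse.csr_matrix(counts + counts.T)\n",
"\n",
"counts.toarray()"
]
},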
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"You can make the graph unweighted."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph = from_edge_list(edge_list_new, weighted=False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"adjacency = graph.adjacency\n",
"names = graph.names"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"image = visualize_graph(adjacency, names=names)\n",
"SVG(image)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"Again, you can make the graph directed:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph = from_edge_list(edge_list, directed=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"adjacency = graph.adjacency\n",
"names = graph.names"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"image = visualize_graph(adjacency, names=names)\n",
"SVG(image)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"The graph can also have explicit weights:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"edge_list = [(\"Alice\", \"Bob\", 3), (\"Bob\", \"Carey\", 2), (\"Alice\", \"David\", 1), (\"Carey\", \"David\", 2), (\"Bob\", \"David\", 3)]\n",
"graph = from_edge_list(edge_list)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"adjacency = graph.adjacency\n",
"names = graph.names"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"image = visualize_graph(adjacency, names=names, display_edge_weight=True, display_node_weight=True)\n",
"SVG(image)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"For a bipartite graph:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"edge_list = [(\"Alice\", \"Football\"), (\"Bob\", \"Tennis\"), (\"David\", \"Football\"), (\"Carey\", \"Tennis\"), (\"Carey\", \"Football\")]\n",
"graph = from_edge_list(edge_list, bipartite=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"biadjacency = graph.biadjacency\n",
"names = graph.names\n",
"names_col = graph.names_col"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"image = visualize_bigraph(biadjacency, names_row=names, names_col=names_col)\n",
"SVG(image)"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## From an adjacency list\n",
"\n",
"You can also load a graph from an adjacency list, given as a list of lists or a dictionary of lists:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"adjacency_list =[[0, 1, 2], [2, 3]]\n",
"adjacency = from_adjacency_list(adjacency_list, directed=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"image = visualize_graph(adjacency)\n",
"SVG(image)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"adjacency_dict = {\"Alice\": [\"Bob\", \"David\"], \"Bob\": [\"Carey\", \"David\"]}\n",
"graph = from_adjacency_list(adjacency_dict, directed=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"adjacency = graph.adjacency\n",
"names = graph.names"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"image = visualize_graph(adjacency, names=names)\n",
"SVG(image)"
]
},
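{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"An adjacency list carries the same information as an edge list. The sketch below flattens the dictionary used above into pairs in plain Python (the names ``adjacency_dict_demo`` and ``edge_list_demo`` are illustrative). Passing the result to ``from_edge_list`` with ``directed=True`` is expected to describe the same graph, although this is not verified here."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"adjacency_dict_demo = {'Alice': ['Bob', 'David'], 'Bob': ['Carey', 'David']}\n",
"\n",
"# One (source, target) pair per neighbor of each source node.\n",
"edge_list_demo = [(source, target)\n",
"                  for source, targets in adjacency_dict_demo.items()\n",
"                  for target in targets]\n",
"edge_list_demo"
]
},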
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## From a dataframe"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"Your dataframe might consist of a list of edges."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"df = pd.read_csv('miserables.tsv', sep='\\t', names=['character_1', 'character_2'])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"edge_list = list(df.itertuples(index=False))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph = from_edge_list(edge_list)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"df = pd.read_csv('movie_actor.tsv', sep='\\t', names=['movie', 'actor'])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"edge_list = list(df.itertuples(index=False))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph = from_edge_list(edge_list, bipartite=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"For categorical data, you can use ``pandas`` to get a bipartite graph between samples and features. We show an example taken from the [Adult Income](https://archive.ics.uci.edu/ml/datasets/adult) dataset."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"df = pd.read_csv('adult-income.csv')"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" age | \n",
" workclass | \n",
" occupation | \n",
" relationship | \n",
" gender | \n",
" income | \n",
"
\n",
" \n",
" \n",
" \n",
" | 0 | \n",
" 40-49 | \n",
" State-gov | \n",
" Adm-clerical | \n",
" Not-in-family | \n",
" Male | \n",
" <=50K | \n",
"
\n",
" \n",
" | 1 | \n",
" 50-59 | \n",
" Self-emp-not-inc | \n",
" Exec-managerial | \n",
" Husband | \n",
" Male | \n",
" <=50K | \n",
"
\n",
" \n",
" | 2 | \n",
" 40-49 | \n",
" Private | \n",
" Handlers-cleaners | \n",
" Not-in-family | \n",
" Male | \n",
" <=50K | \n",
"
\n",
" \n",
" | 3 | \n",
" 50-59 | \n",
" Private | \n",
" Handlers-cleaners | \n",
" Husband | \n",
" Male | \n",
" <=50K | \n",
"
\n",
" \n",
" | 4 | \n",
" 30-39 | \n",
" Private | \n",
" Prof-specialty | \n",
" Wife | \n",
" Female | \n",
" <=50K | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" age workclass occupation relationship gender \\\n",
"0 40-49 State-gov Adm-clerical Not-in-family Male \n",
"1 50-59 Self-emp-not-inc Exec-managerial Husband Male \n",
"2 40-49 Private Handlers-cleaners Not-in-family Male \n",
"3 50-59 Private Handlers-cleaners Husband Male \n",
"4 30-39 Private Prof-specialty Wife Female \n",
"\n",
" income \n",
"0 <=50K \n",
"1 <=50K \n",
"2 <=50K \n",
"3 <=50K \n",
"4 <=50K "
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"df_binary = pd.get_dummies(df, sparse=True)"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" age_20-29 | \n",
" age_30-39 | \n",
" age_40-49 | \n",
" age_50-59 | \n",
" age_60-69 | \n",
" age_70-79 | \n",
" age_80-89 | \n",
" age_90-99 | \n",
" workclass_ ? | \n",
" workclass_ Federal-gov | \n",
" ... | \n",
" relationship_ Husband | \n",
" relationship_ Not-in-family | \n",
" relationship_ Other-relative | \n",
" relationship_ Own-child | \n",
" relationship_ Unmarried | \n",
" relationship_ Wife | \n",
" gender_ Female | \n",
" gender_ Male | \n",
" income_ <=50K | \n",
" income_ >50K | \n",
"
\n",
" \n",
" \n",
" \n",
" | 0 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" ... | \n",
" 0 | \n",
" 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 1 | \n",
" 0 | \n",
"
\n",
" \n",
" | 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" ... | \n",
" 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 1 | \n",
" 0 | \n",
"
\n",
" \n",
" | 2 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" ... | \n",
" 0 | \n",
" 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 1 | \n",
" 0 | \n",
"
\n",
" \n",
" | 3 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" ... | \n",
" 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 1 | \n",
" 0 | \n",
"
\n",
" \n",
" | 4 | \n",
" 0 | \n",
" 1 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" ... | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 1 | \n",
" 0 | \n",
" 1 | \n",
" 0 | \n",
"
\n",
" \n",
"
\n",
"
5 rows × 42 columns
\n",
"
"
],
"text/plain": [
" age_20-29 age_30-39 age_40-49 age_50-59 age_60-69 age_70-79 \\\n",
"0 0 0 1 0 0 0 \n",
"1 0 0 0 1 0 0 \n",
"2 0 0 1 0 0 0 \n",
"3 0 0 0 1 0 0 \n",
"4 0 1 0 0 0 0 \n",
"\n",
" age_80-89 age_90-99 workclass_ ? workclass_ Federal-gov ... \\\n",
"0 0 0 0 0 ... \n",
"1 0 0 0 0 ... \n",
"2 0 0 0 0 ... \n",
"3 0 0 0 0 ... \n",
"4 0 0 0 0 ... \n",
"\n",
" relationship_ Husband relationship_ Not-in-family \\\n",
"0 0 1 \n",
"1 1 0 \n",
"2 0 1 \n",
"3 1 0 \n",
"4 0 0 \n",
"\n",
" relationship_ Other-relative relationship_ Own-child \\\n",
"0 0 0 \n",
"1 0 0 \n",
"2 0 0 \n",
"3 0 0 \n",
"4 0 0 \n",
"\n",
" relationship_ Unmarried relationship_ Wife gender_ Female gender_ Male \\\n",
"0 0 0 0 1 \n",
"1 0 0 0 1 \n",
"2 0 0 0 1 \n",
"3 0 0 0 1 \n",
"4 0 1 1 0 \n",
"\n",
" income_ <=50K income_ >50K \n",
"0 1 0 \n",
"1 1 0 \n",
"2 1 0 \n",
"3 1 0 \n",
"4 1 0 \n",
"\n",
"[5 rows x 42 columns]"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df_binary.head()"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"biadjacency = df_binary.sparse.to_coo()"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"biadjacency = sparse.csr_matrix(biadjacency)"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<32561x42 sparse matrix of type ''\n",
"\twith 195366 stored elements in Compressed Sparse Row format>"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# biadjacency matrix of the bipartite graph\n",
"biadjacency"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"# names of columns\n",
"names_col = list(df_binary)"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/plain": [
"42"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(names_col)"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"data": {
"text/plain": [
"['age_20-29',\n",
" 'age_30-39',\n",
" 'age_40-49',\n",
" 'age_50-59',\n",
" 'age_60-69',\n",
" 'age_70-79',\n",
" 'age_80-89',\n",
" 'age_90-99']"
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"names_col[:8]"
]
},
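{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"If you do not have the CSV file at hand, the same steps can be tried on a small in-memory dataframe. This is a sketch on made-up data; the names ``toy``, ``toy_binary``, ``toy_biadjacency`` and ``toy_names_col`` are illustrative. ``dtype=int`` is passed to ``get_dummies`` so that the indicator columns hold integers regardless of the pandas version."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"import pandas as pd\n",
"from scipy import sparse\n",
"\n",
"# Three samples described by two categorical features (made-up values).\n",
"toy = pd.DataFrame({'age': ['40-49', '50-59', '40-49'],\n",
"                    'gender': ['Male', 'Male', 'Female']})\n",
"\n",
"# One sparse indicator column per (feature, value) pair.\n",
"toy_binary = pd.get_dummies(toy, sparse=True, dtype=int)\n",
"\n",
"# Rows are samples, columns are feature values: a biadjacency matrix in CSR format.\n",
"toy_biadjacency = sparse.csr_matrix(toy_binary.sparse.to_coo())\n",
"toy_names_col = list(toy_binary)\n",
"\n",
"toy_biadjacency.toarray(), toy_names_col"
]
},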
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## From a CSV file"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"You can directly load a graph from a CSV or TSV file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph = from_csv('miserables.tsv')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph = from_csv('movie_actor.tsv', bipartite=True)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"The graph can also be given in the form of adjacency lists (check the function ``from_csv``)."
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## From a GraphML file\n",
"\n",
"You can also load a graph stored in the [GraphML](https://en.wikipedia.org/wiki/GraphML) format."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"graph = from_graphml('miserables.graphml')\n",
"adjacency = graph.adjacency\n",
"names = graph.names"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"# Directed graph\n",
"graph = from_graphml('painters.graphml')\n",
"adjacency = graph.adjacency\n",
"names = graph.names"
]
},
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## From NetworkX\n",
"\n",
"NetworkX has [import](https://networkx.github.io/documentation/stable/reference/generated/networkx.convert_matrix.from_scipy_sparse_matrix.html#networkx.convert_matrix.from_scipy_sparse_matrix) and [export](https://networkx.github.io/documentation/stable/reference/generated/networkx.convert_matrix.to_scipy_sparse_matrix.html#networkx.convert_matrix.to_scipy_sparse_matrix) functions from and towards the CSR format."
]
},
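{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"As a minimal sketch of the round trip, assuming NetworkX 2.7 or later is installed (older versions expose ``from_scipy_sparse_matrix`` and ``to_scipy_sparse_matrix`` instead), and with illustrative names ``adjacency_nx``, ``nx_graph`` and ``adjacency_back``:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"import networkx as nx\n",
"import numpy as np\n",
"from scipy import sparse\n",
"\n",
"adjacency_nx = sparse.csr_matrix(np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 0]]))\n",
"\n",
"# Export to NetworkX: nodes are labeled 0..n-1 and entries become edge weights.\n",
"nx_graph = nx.from_scipy_sparse_array(adjacency_nx)\n",
"\n",
"# Import back: convert the returned sparse array to a CSR matrix.\n",
"adjacency_back = sparse.csr_matrix(nx.to_scipy_sparse_array(nx_graph))\n",
"\n",
"assert (adjacency_back.toarray() == adjacency_nx.toarray()).all()"
]
},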
{
"cell_type": "markdown",
"metadata": {
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## Other options\n",
"\n",
"* You want to test our toy graphs\n",
"* You want to generate a graph from a model\n",
"* You want to load a graph from existing repositories (see [NetSet](http://netset.telecom-paris.fr/) and [KONECT](http://konect.cc))\n",
"\n",
"Take a look at the other tutorials of the **data** section!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}