# NetworKit Distance Tutorial

NetworKit provides several graph traversal and pathfinding algorithms within the `distance` module. This notebook covers most of these algorithms, and shows how to use them.

In [None]:
import networkit as nk

Throughout this tutorial we use the same graph and the same source and target nodes. We index the edges of the graph because some algorithms require indexed edges.

In [None]:
# Read a graph
G = nk.readGraph("../input/foodweb-baydry.konect", nk.Format.KONECT)
GDir = G
G = nk.graphtools.toUndirected(G)
source = 0
target = 27
G.indexEdges()

## Algebraic Distance

Algebraic distance assigns a distance value to pairs of nodes according to their structural closeness in the graph. Algebraic distances will become small within dense subgraphs.

The [AlgebraicDistance(G, numberSystems=10, numberIterations=30, omega=0.5, norm=0, withEdgeScores=False)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=alg#networkit.distance.AlgebraicDistance) constructor expects a graph, followed by the number of systems to use for algebraic iteration and the number of iterations in each system. `omega` is the overrelaxation parameter, while `norm` is the norm factor of the extended algebraic distance. Set `withEdgeScores` to true to also compute an array of edge scores, where the score of each edge {u,v} equals ad(u,v).

In [None]:
# Initialize algorithm
ad = nk.distance.AlgebraicDistance(G, 10, 100, 1, 1, True)

In [None]:
# Run
ad.preprocess()

In [None]:
# The algebraic distance between the source and target node
ad.distance(source, target)

## All-Pairs Shortest-Paths (APSP)

The APSP algorithm computes all pairwise shortest-path distances in a given graph. It is implemented by running Dijkstra's algorithm from each node, or BFS if the graph is unweighted.
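
To make the "one search per node" idea concrete, here is a minimal pure-Python sketch for the unweighted case. It is not NetworKit's implementation; the toy graph and the names `bfs_distances`, `all_pairs_distances`, and `toy_adj` are made up for illustration.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node of an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def all_pairs_distances(adj):
    """Run one BFS per node; unreachable pairs are simply absent from the result."""
    return {u: bfs_distances(adj, u) for u in adj}

# Hypothetical toy graph: the path 0 - 1 - 2 - 3 plus the edge 1 - 3.
toy_adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
toy_apsp = all_pairs_distances(toy_adj)
print(toy_apsp[0][3])
```

Because every node starts its own search, the cost grows with the number of nodes times the cost of a single BFS, which is why the alternatives in the following sections can be attractive on larger graphs.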

The constructor [APSP(G)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=apsp#networkit.distance.APSP) expects a graph.

In [None]:
# Initialize algorithm
apsp = nk.distance.APSP(G)

In [None]:
# Run
apsp.run()

In [None]:
# The distance from source to target node
print(apsp.getDistance(source, target))

## Pruned Landmark Labeling

Pruned Landmark Labeling is an alternative to APSP. It computes distance labels by performing a *pruned* BFS from each node in the graph. Distance labels are then used to quickly compute shortest-path distances between node pairs. This algorithm only works for unweighted graphs.
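
The following pure-Python sketch illustrates the labeling idea described above on an undirected, unweighted toy graph. It is a simplified illustration under our own assumptions (node order by degree, dictionary-based labels), not NetworKit's implementation; `build_labels`, `pll_query`, and `ring_adj` are hypothetical names.

```python
from collections import deque

def build_labels(adj):
    """Pruned BFS from every node; labels[u] maps a hub node to its distance from u."""
    labels = {u: {} for u in adj}

    def query(u, v):
        # Smallest distance that can be certified through a hub present in both labels.
        common = labels[u].keys() & labels[v].keys()
        return min((labels[u][h] + labels[v][h] for h in common), default=float("inf"))

    # Assumption of this sketch: process high-degree nodes first.
    for root in sorted(adj, key=lambda u: len(adj[u]), reverse=True):
        dist = {root: 0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            if query(root, u) <= dist[u]:
                continue  # earlier labels already cover this pair: prune the search here
            labels[u][root] = dist[u]
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
    return labels, query

# Hypothetical toy graph: a 6-cycle 0-1-2-3-4-5-0 with the chord 1 - 4.
ring_adj = {0: [1, 5], 1: [0, 2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 5, 1], 5: [4, 0]}
labels, pll_query = build_labels(ring_adj)
print(pll_query(0, 3), pll_query(3, 5))
```

After the labels are built, a query only intersects two small dictionaries instead of traversing the graph again.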

In [None]:
# Initialize the algorithm - in case of weighted graphs, edge weights are ignored
pll = nk.distance.PrunedLandmarkLabeling(G)

In [None]:
# Run - this step computes the distance labels
pll.run()

In [None]:
# Retrieve the shortest-path distance
print(pll.query(source, target))

## Some-Pairs Shortest-Paths (SPSP)

SPSP is an alternative to APSP. It computes the shortest-path distances from a set of user-specified source nodes to all the other nodes of the graph.

The constructor `SPSP(G, sources)` takes as input a graph and a list of source nodes.

In [None]:
# Initialize the algorithm
sources = [0, 1, 2]

spsp = nk.distance.SPSP(G, sources)

# Run
spsp.run()

# Print the distances from the selected sources to the target
# (the loop variable is named `s` so that `source` defined above is not overwritten)
for s in sources:
    print("Distance from {:d} to {:d}: {:.3e}".format(s, target, spsp.getDistance(s, target)))

## A*

A* is an informed search algorithm: in addition to the path cost accumulated so far, it uses a heuristic to guide the search for the shortest path.

The [AStar(G, heu, source, target, storePred=True)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=astar#networkit.distance.AStar) constructor expects a graph, a heuristic, and the source and target nodes as mandatory parameters. If `storePred` is true, the algorithm also stores the predecessors and reconstructs a shortest path from the source to the target. `heu` is a list of lower bounds on the distance of each node to the target.

As we do not have any prior knowledge about the graph we choose all zeros as a heuristic because zero is always a lower bound of the distance between two nodes. In this case, the A* algorithm is equivalent to Dijkstra.
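
As a rough illustration of why an all-zero heuristic is always valid, the following pure-Python sketch runs A* on a small weighted toy graph, once with zeros and once with tighter lower bounds. It is not NetworKit's implementation; `a_star`, `weighted_adj`, and the heuristic values are hypothetical.

```python
import heapq

def a_star(adj, heuristic, source, target):
    """A* on a graph given as {u: [(v, weight), ...]} with non-negative weights.

    heuristic[u] must be a lower bound on the distance from u to target.
    Returns the shortest-path distance, or None if target is unreachable.
    """
    best = {source: 0.0}
    heap = [(heuristic[source], source)]  # priority = distance so far + heuristic
    while heap:
        priority, u = heapq.heappop(heap)
        if priority > best[u] + heuristic[u]:
            continue  # stale entry: a shorter way to u was found later
        if u == target:
            return best[u]
        for v, w in adj[u]:
            candidate = best[u] + w
            if candidate < best.get(v, float("inf")):
                best[v] = candidate
                heapq.heappush(heap, (candidate + heuristic[v], v))
    return None

# Hypothetical weighted toy graph (undirected, each edge listed in both directions).
weighted_adj = {
    0: [(1, 1.0), (2, 4.0)],
    1: [(0, 1.0), (2, 2.0)],
    2: [(1, 2.0), (0, 4.0), (3, 1.0)],
    3: [(2, 1.0)],
}
zero_heuristic = [0.0, 0.0, 0.0, 0.0]   # always a valid lower bound
tight_heuristic = [4.0, 3.0, 1.0, 0.0]  # exact distances to node 3 on this toy graph
print(a_star(weighted_adj, zero_heuristic, 0, 3))
print(a_star(weighted_adj, tight_heuristic, 0, 3))
```

Both calls return the same distance; a better heuristic only changes how many nodes are explored before the target is reached.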

In [None]:
# Initialize algorithm
heuristic = [0 for _ in range(G.upperNodeIdBound())]
astar = nk.distance.AStar(G, heuristic, source, target)

In [None]:
# Run
astar.run()

In [None]:
# The distance from source to target node
print(astar.getDistance())
# The path from source to target node
print(astar.getPath())

## Breadth-First Search (BFS) 

BFS is an algorithm for traversing a graph. It starts from a source node `u` and explores all neighbors of the nodes at the present depth before moving on to the nodes at the next depth level. BFS finds the shortest paths from a source to all the reachable nodes of an unweighted graph.
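
The level-by-level exploration can be sketched in a few lines of pure Python. This hypothetical example (not NetworKit's code; `bfs_with_path_counts` and `square_adj` are made-up names) also counts the shortest paths to each node.

```python
from collections import deque

def bfs_with_path_counts(adj, source):
    """Return (hop distance, number of shortest paths) dictionaries for reachable nodes."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                # First time v is seen: it lies one level below u.
                dist[v] = dist[u] + 1
                count[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                # Every shortest path to u extends to a shortest path to v.
                count[v] += count[u]
    return dist, count

# Hypothetical toy graph: the 4-cycle 0 - 1 - 3 - 2 - 0.
square_adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
square_dist, square_count = bfs_with_path_counts(square_adj, 0)
print(square_dist[3], square_count[3])
```

Node 3 is two hops away from node 0 and can be reached by two different shortest paths, one through node 1 and one through node 2.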

The [BFS(G, source, storePaths=True, storeNodesSortedByDistance=False, target=None)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=bfs#networkit.distance.BFS) constructor expects a graph and a source node as mandatory parameters. To store the shortest paths, set `storePaths` to true. If `storeNodesSortedByDistance` is set, a vector of nodes ordered by increasing distance from the source is stored. `target` is an optional target node; if it is given, the search terminates once the target has been reached.

In [None]:
# Initialize algorithm
bfs = nk.distance.BFS(G, source, True, False, target)

In [None]:
# Run
bfs.run()

In [None]:
# The distance from source to target node
print(bfs.distance(target))
# The number of shortest paths between the source and target nodes
print(bfs.numberOfPaths(target))
# Returns a shortest path from source to target
print(bfs.getPath(target))

## Bidirectional BFS

The Bidirectional BFS algorithm explores the graph from both the source and target nodes until the two explorations meet. When only the path to a single, known target node is needed, this is usually more efficient than a conventional BFS.
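
The meeting of the two explorations can be sketched as follows for an undirected, unweighted graph. This is a simplified pure-Python illustration, not NetworKit's implementation; the rule of always growing the smaller frontier is our own choice, and `bidirectional_bfs_hops` and `ring_adj` are hypothetical names.

```python
def bidirectional_bfs_hops(adj, source, target):
    """Number of hops between source and target, or None if they are not connected."""
    if source == target:
        return 0
    dist_s, dist_t = {source: 0}, {target: 0}
    frontier_s, frontier_t = [source], [target]
    while frontier_s and frontier_t:
        # Grow the smaller frontier by one full level.
        grow_source = len(frontier_s) <= len(frontier_t)
        if grow_source:
            frontier, dist_this, dist_other = frontier_s, dist_s, dist_t
        else:
            frontier, dist_this, dist_other = frontier_t, dist_t, dist_s
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v in dist_other:
                    # The two explorations meet on the edge (u, v).
                    return dist_this[u] + 1 + dist_other[v]
                if v not in dist_this:
                    dist_this[v] = dist_this[u] + 1
                    next_frontier.append(v)
        if grow_source:
            frontier_s = next_frontier
        else:
            frontier_t = next_frontier
    return None

# Hypothetical toy graph: a 6-cycle 0-1-2-3-4-5-0 with the chord 1 - 4.
ring_adj = {0: [1, 5], 1: [0, 2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 5, 1], 5: [4, 0]}
print(bidirectional_bfs_hops(ring_adj, 0, 3))
```

Each side only explores about half of the distance, so the two explored regions together are often much smaller than the region a single BFS from the source would cover.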

The [BidirectionalBFS(G, source, target, storePred=True)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=bidirec#networkit.distance.BidirectionalBFS) constructor expects a graph and the source and target nodes as mandatory parameters. If `storePred` is true, the algorithm also stores the predecessors and reconstructs a shortest path from the source to the target.

In [None]:
# Initialize algorithm
biBFS = nk.distance.BidirectionalBFS(G, source, target)

In [None]:
# Run
biBFS.run()

Unlike BFS, the `getPath` method of `BidirectionalBFS` includes neither the source at the beginning nor the target at the end of the returned list.

In [None]:
# The distance (number of hops) from source to target node
print(biBFS.getHops())
# The path from source to target node
print(biBFS.getPath())

## Dijkstra 

Dijkstra's algorithm finds the shortest paths from a source node to all the reachable nodes of a weighted graph with non-negative edge weights. It builds a tree of shortest paths rooted at the source, from which the shortest path to a particular target node can be read off.
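
The shortest-path tree can be illustrated with a small pure-Python sketch based on a binary heap. It is a hypothetical example, not NetworKit's implementation; `dijkstra`, `path_to`, and `weighted_adj` are made-up names.

```python
import heapq

def dijkstra(adj, source):
    """Distances and tree predecessors from source; adj is {u: [(v, weight), ...]}."""
    dist = {source: 0.0}
    pred = {source: None}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: u was already settled with a smaller distance
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                pred[v] = u
                heapq.heappush(heap, (d + w, v))
    return dist, pred

def path_to(pred, target):
    """Walk the shortest-path tree back from target to the source."""
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = pred[node]
    return path[::-1]

# Hypothetical weighted toy graph (undirected, each edge listed in both directions).
weighted_adj = {
    0: [(1, 1.0), (2, 4.0)],
    1: [(0, 1.0), (2, 2.0)],
    2: [(1, 2.0), (0, 4.0), (3, 1.0)],
    3: [(2, 1.0)],
}
toy_dist, toy_pred = dijkstra(weighted_adj, 0)
print(toy_dist[3], path_to(toy_pred, 3))
```

The direct edge from 0 to 2 has weight 4, so the cheaper route to node 3 goes through node 1.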

The [Dijkstra(G, source, storePaths=True, storeNodesSortedByDistance=False, target=None)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=dij#networkit.distance.Dijkstra) constructor expects a graph and a source node as mandatory parameters. To store the shortest paths, set `storePaths` to true. If `storeNodesSortedByDistance` is set, a vector of nodes ordered by increasing distance from the source is stored. `target` is an optional target node; if it is given, the search terminates once the target has been reached.

In [None]:
# Initialize algorithm
dijkstra = nk.distance.Dijkstra(G, source, True, False, target)

In [None]:
# Run
dijkstra.run()

In [None]:
# The distance from source to target node
print(dijkstra.distance(target))
# The number of shortest paths between the source and target nodes
print(dijkstra.numberOfPaths(target))
# Returns a shortest path from source to target
print(dijkstra.getPath(target))

## Bidirectional Dijkstra

The Bidirectional Dijkstra algorithm explores the graph from both the source and target nodes until the two explorations meet. When only the path to a single, known target node is needed, this is usually more efficient than the conventional Dijkstra algorithm.

The [BidirectionalDijkstra(G, source, target, storePred=True)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=bidirec#networkit.distance.BidirectionalDijkstra) constructor expects a graph and the source and target nodes as mandatory parameters. If `storePred` is true, the algorithm also stores the predecessors and reconstructs a shortest path from the source to the target.

In [None]:
# Initialize algorithm
biDij = nk.distance.BidirectionalDijkstra(G, source, target)

In [None]:
# Run
biDij.run()

Unlike Dijkstra, the `getPath` method of `BidirectionalDijkstra` includes neither the source at the beginning nor the target at the end of the returned list.

In [None]:
# The distance from source to target node
print(biDij.getDistance())
# The path from source to target node
print(biDij.getPath())

## Commute Time Distance

This class computes the Euclidean Commute Time Distance between each pair of nodes for an undirected unweighted graph.

The [CommuteTimeDistance(G, tol=0.1)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=commute#networkit.distance.CommuteTimeDistance) constructor expects a graph as a mandatory parameter. The optional parameter `tol` is the tolerance parameter used for approximation.

In [None]:
# Initialize algorithm
ctd = nk.distance.CommuteTimeDistance(G)

In [None]:
# Run
ctd.run()

In [None]:
# The distance from source to target node
print(ctd.distance(source, target))

To compute the commute time distance between a single pair of nodes only, use the [runSinglePair(u, v)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=runsingle#networkit.distance.CommuteTimeDistance.runSinglePair) method.

In [None]:
# The commute time distance for a single pair of nodes
ctd.runSinglePair(source, target)

## Diameter 

This algorithm computes or estimates the diameter of a given graph. It is based on the ExactSumSweep algorithm presented by Michele Borassi, Pierluigi Crescenzi, Michel Habib, Walter A. Kosters, Andrea Marino, and Frank W. Takes: http://www.sciencedirect.com/science/article/pii/S0304397515001644.

The [Diameter(G, algo=DiameterAlgo.AUTOMATIC, error=1.0, nSamples=0)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=diameter#networkit.distance.Diameter) constructor expects a graph as a mandatory parameter. `algo` specifies the choice of diameter algorithm, while `error` is the maximum allowed relative error; set it to 0 for the exact diameter. `nSamples` is the number of samples to be used. `algo` can be one of the following:

0. automatic
1. exact
2. estimatedRange
3. estimatedSamples
4. estimatedPedantic

Note that the input graph must be connected, otherwise the resulting diameter will be infinite. As the graph we are using is not connected, we shall extract the largest connected component from it and then compute the diameter of the resulting graph.
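
The role of connectivity can be illustrated with a small pure-Python sketch that extracts the largest connected component of a toy graph and computes its exact diameter by brute force. It is a hypothetical example that does not use the ExactSumSweep algorithm; `hop_distances`, `largest_component`, `exact_diameter`, and `split_adj` are made-up names.

```python
from collections import deque

def hop_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def largest_component(adj):
    """Nodes of the largest connected component of an undirected graph."""
    seen, best = set(), []
    for start in adj:
        if start not in seen:
            component = list(hop_distances(adj, start))
            seen.update(component)
            if len(component) > len(best):
                best = component
    return best

def exact_diameter(adj, nodes):
    """Largest hop distance between any two of the given (connected) nodes."""
    return max(max(hop_distances(adj, u).values()) for u in nodes)

# Hypothetical toy graph: the path 0 - 1 - 2 - 3 and a separate edge 4 - 5.
split_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
component = largest_component(split_adj)
print(sorted(component), exact_diameter(split_adj, component))
```

Node 4 is never reached from node 0 in this toy graph, which is the situation in which the distance, and hence the diameter of the whole graph, is considered infinite.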

In [None]:
# Extract the largest connected component
newGraph = nk.components.ConnectedComponents.extractLargestConnectedComponent(G, True)
newGraph.numberOfNodes()

In [None]:
# Initialize algorithm to compute the exact diameter of the input graph
diam = nk.distance.Diameter(newGraph, algo=1)

In [None]:
# Run
diam.run()

In [None]:
# Get diameter of graph
diam.getDiameter()

The return value of `getDiameter` is a pair of integers: the lower bound and the upper bound of the diameter. If the exact diameter was computed, as in this case, the diameter is the first value of the pair.

## Eccentricity

The eccentricity of a node `u` is defined as the distance from `u` to the node farthest from it. In other words, it is the length of the longest shortest path starting from `u`.

The eccentricity of a node can be computed by calling the `Eccentricity.getValue(G, v)` method with a graph and a node. The method returns the node farthest from `v` and the length of the shortest path between `v` and that node.
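
For an unweighted graph, the definition boils down to a single BFS: the last node that the search dequeues is one of the farthest nodes. The following pure-Python sketch is a hypothetical illustration (not NetworKit's code; `eccentricity` and `path_adj` are made-up names) that returns a (node, distance) pair.

```python
from collections import deque

def eccentricity(adj, u):
    """Return (a farthest node, its hop distance) for node u using one BFS."""
    dist = {u: 0}
    queue = deque([u])
    farthest = u
    while queue:
        x = queue.popleft()
        farthest = x  # BFS dequeues nodes in non-decreasing order of distance
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return farthest, dist[farthest]

# Hypothetical toy graph: the path 0 - 1 - 2 - 3.
path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(eccentricity(path_adj, 0))
print(eccentricity(path_adj, 1))
```

On a connected graph, the largest eccentricity over all nodes equals the diameter.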

In [None]:
# Run
nk.distance.Eccentricity.getValue(G, source)

## Effective Diameter

The effective diameter is defined as the number of edges on average to reach a given ratio of all other nodes.
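
One possible reading of this definition is sketched below in pure Python: for every node, find the smallest number of hops within which the given ratio of nodes is reached, then average over all nodes. The counting convention (the start node counts as reached after 0 hops) and the names `effective_diameter_sketch` and `path_adj` are assumptions of this sketch, so its values need not match NetworKit's exactly.

```python
from collections import deque
import math

def hop_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def effective_diameter_sketch(adj, ratio=0.9):
    """Average number of hops needed to reach `ratio` of the nodes (connected graph assumed)."""
    n = len(adj)
    needed = math.ceil(ratio * n)  # how many nodes must be within reach
    total = 0
    for s in adj:
        hops = sorted(hop_distances(adj, s).values())
        total += hops[needed - 1]  # hops required until `needed` nodes are covered
    return total / n

# Hypothetical toy graph: the path 0 - 1 - 2 - 3 - 4.
path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(effective_diameter_sketch(path_adj, ratio=0.9))
print(effective_diameter_sketch(path_adj, ratio=0.5))
```

Lowering the ratio ignores the few nodes that are far away, which makes the measure less sensitive to long, thin parts of a graph than the diameter.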

The [EffectiveDiameter(G, ratio=0.9)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=effective#networkit.distance.EffectiveDiameter) constructor expects an undirected graph and the ratio of nodes that should be connected. The ratio must be in the interval (0,1].

In [None]:
# Initialize algorithm
ed = nk.distance.EffectiveDiameter(G)

In [None]:
# Run
ed.run()

In [None]:
# Get effective diameter
ed.getEffectiveDiameter()

## Effective Diameter Approximation

This class approximates the effective diameter according to the algorithm presented in "A Fast and Scalable Tool for Data Mining in Massive Graphs" by [Palmer, Gibbons and Faloutsos](http://www.cs.cmu.edu/~christos/PUBLICATIONS/kdd02-anf.pdf).

The `EffectiveDiameterApproximation(G, ratio=0.9, k=64, r=7)` constructor expects an undirected graph, the ratio of nodes that should be connected, the number of parallel approximations `k` used to get a more robust result, and the number of bits `r` that should be added to the bitmask. The more bits are added to the bitmask, the higher the accuracy. The ratio must be in the interval (0,1].

In [None]:
# Initialize algorithm
eda = nk.distance.EffectiveDiameterApproximation(G)

In [None]:
# Run
eda.run()

In [None]:
# Get effective diameter
eda.getEffectiveDiameter()

## Reverse BFS

This class does a reverse breadth-first search (following the incoming edges of a node) on a directed graph from a given source node.
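
The difference from a regular BFS can be illustrated with a small pure-Python sketch that first collects the incoming edges of every node and then searches along them. It is a hypothetical example, not NetworKit's implementation; `reverse_bfs` and `directed_adj` are made-up names.

```python
from collections import deque

def reverse_bfs(out_adj, source):
    """Hop distances to source, found by following incoming instead of outgoing edges."""
    # Build the incoming-neighbor lists from the outgoing ones.
    in_adj = {u: [] for u in out_adj}
    for u, successors in out_adj.items():
        for v in successors:
            in_adj[v].append(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in in_adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical directed toy graph with edges 0 -> 1, 1 -> 2 and 3 -> 2.
directed_adj = {0: [1], 1: [2], 2: [], 3: [2]}
print(reverse_bfs(directed_adj, 2))
print(reverse_bfs(directed_adj, 0))
```

Starting from node 2, the reverse search finds every node that can reach node 2, whereas node 0 has no incoming edges and therefore only finds itself.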

The [ReverseBFS(G, source, storePaths=True, storeNodesSortedByDistance=False, target=None)](https://networkit.github.io/dev-docs/python_api/distance.html?highlight=bfs#networkit.distance.BFS) constructor expects a graph and a source node as mandatory parameters. To store the shortest paths, set `storePaths` to true. If `storeNodesSortedByDistance` is set, a vector of nodes ordered by increasing distance from the source is stored. `target` is an optional target node.

In [None]:
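# Initialize algorithm on the directed graph GDir; on the undirected G, a reverse BFS would not differ from a regular BFS
rbfs = nk.distance.ReverseBFS(GDir, source, True, False, target)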

In [None]:
# Run
rbfs.run()

In [None]:
# The distance from source to target node
print(rbfs.distance(target))
# The number of shortest paths between source and target
print(rbfs.numberOfPaths(target))