Skip to content

Shortest paths to/from nodes of a certain type

September 14, 2011

Elijah asked the following via SOCNET mailing list:

I was wondering if anyone knew of a script or tool which would give me the network distance of nodes to a particular class of nodes.  I think of this as an Erdos number, except instead of getting the distance to one node, I want the distance to the closest node of a particular class.  Let’s say I have a network of people and I know their professions.  Some are Students, some are Journalists and a small number are Engineers.  I’d like to be able to find out the network distance of each node to the closest Engineer node. It would be particularly useful if the script also had the option to total edge weight into the calculation.

If you get your network data into R it is fairly straightforward to do this using igraph package. Here is the function:

# shortest paths to nodes with a specified value on certain node attribute
spnt <- function(g, aname, avalue, weights=NULL, ...)
{
  require(igraph)
  stopifnot(inherits(g, "igraph"))
  a <- get.vertex.attribute(g, aname)
  m <- shortest.paths(g, v=V(g)[a==avalue], weights=weights, ...)
  apply(m, 2, min)
}

It assumes that ‘g’ is a network (object of class ‘igraph’), ‘aname’ is a name of the node attribute, ‘avalue’ is the value of the attribute ‘aname’ that designates the nodes to/from which we would like to calculate distances, finally ‘weights’ can be optionally used to include weights in the calculation (as a numeric vector).

The function will return a vector of distances in ‘g’ from all the nodes to the closest node that have a value ‘avalue’ on attribute ‘aname’.

As an example consider the network below. It is undirected and has 15 nodes. It has two attributes defined: a node attribute called “color” having values “orange”, “lightblue”, and “lightgreen”, and an edge attribute called “w” with values 1 or 2. Both attributes are shown in the picture as a node color and edge label. The numbers on the nodes are node ids.

Assuming that the network is called ‘g’ we can use the function above in the following way:

> # from lightblue nodes to all others
> spnt(g, "color", "lightblue")
 [1] 0 1 2 1 2 3 0 1 2 1 2 3 2 3 4

> # from orange nodes to all others
> spnt(g, "color", "orange")
 [1] 1 0 1 0 1 0 1 0 0 2 1 0 2 1 0

> # to lightblue, but using weights (shortest path = minimal weight)
> spnt(g, "color", "lightblue", weights=E(g)$w)
 [1] 0 2 3 1 2 3 0 2 4 2 3 4 3 5 5

A couple of end notes:

  • In the result vector you will get 0s for the nodes of specified type, i.e. in the last example there are 0s for the “lightblue” nodes.
  • If a certain node is not connected (directly or via other nodes) to any node of specified type the vector will contain ‘Inf’ (plus infinity).
  • The algorithm will not accept negative weights. But this limitation can be effectively dodged by transforming the weights so that they are all positive (for example adding some number), performing the computation, and then transforming back the results to the original scale.
  • You can exploit other features of ‘shortest.paths’ function, on which this function is based. Any extra arguments to ‘spnt’ are passed to ‘shortest.paths’. For example, if the network is directed you can calculate shortest paths that are either incoming, or outgoing (via ‘mode’ argument). See help page of ‘shortest.paths’.
2 Comments
  1. September 16, 2011 15:58

    Hi Michał,

    Thanks for doing this. When I saw the question on Socnet, I thought about doing something similar. Do note that the weights are often strengths and thus need to be inversed for use with Dijkstra’s algorithm. See http://toreopsahl.com/tnet/weighted-networks/shortest-paths/

    Best,
    Tore

  2. September 18, 2011 22:40

    Tore, thanks for the comment. How the weights are to be used obviously depends on the particular context. if they are something like “social distance” then they can be used as is, while when they signify relationship strength then indeed one can, as you wrote, inverse them (i.e., 1/x) or reverse them (i.e.: -x + max(x)).

    Thanks for the link. Interesting note you wrote there.

    ~michał

Comments are closed.

%d bloggers like this: