New algorithm for finding paths

In an earlier post I described an R-script I wrote for finding paths in directed, a-cyclic graphs. That script had some problems, including efficiency problems. I made a completely new script for finding paths in directed, a-cyclic graphs, and replaced the old script with the new one. The new script also appears to find paths that the old algorithm did not identify, so I must have missed something in the old script.

The logic of the new script is quite simple. It contains one function with a number of parameters (these are also described in the script itself). The function and its possible arguments are as follows:

FindPaths(arcs, origin = 0, originsOnly = F, endsOnly, F, writeToDisk = F)

The first argument (arcs) should be an edge list with two columns, where the first column contains the source nodes, and the second column contains the target nodes of arcs (see table below for an example). All nodes should be represented by numbers for the function to work.

Source Target
1 2
1 3
2 4
2 5

The first step in the function is to identify all origins in the list of arcs (nodes with 0 indegree) and all endpoints (nodes with 0 outdegree). The arguments originsOnly or endsOnly can be set to TRUE (T) to have the function return the list of origins or endpoints respectively. No paths are identified if one of these options is chosen.

The user may also choose a specific origin node (origin), which tells the function to only look for paths that flow from that specific node. This is a wise choice when the graph under investigation is very large and/or complex. If origin is left in its default value of 0, then the function will attempt to identify all paths in the graph.

The function first identifies an initial list of paths with only two nodes. Then this list of paths is fed into a while-loop, in which the paths are completed further (also see comments in script for more detailed descriptions).

The user can choose to write the files to the disk immediately, by setting the function argument writeToDisk to TRUE (T). This creates a file called Paths.csv. If such a file already exists, then new paths will be appended to the existing file. This allows you to easily add paths to the file by using the function in steps (choosing a different origin point at each run), without having to manually merge the files later.

If this argument is set to FALSE (F) (default setting), then the function returns a list of all paths.

The new function can be downloaded here.

See the earlier post here.