A hierarchical random graph is an ensemble of undirected
graphs with n
vertices. It is defined via a binary tree with n
leaf and n-1
internal vertices, where the
internal vertices are labeled with probabilities.
The probability that two vertices are connected in the random graph
is given by the probability label at their closest common
ancestor.
Please read the following two articles for more about hierarchical random graphs: A. Clauset, C. Moore, and M.E.J. Newman. Hierarchical structure and the prediction of missing links in networks. Nature 453, 98 - 101 (2008); and A. Clauset, C. Moore, and M.E.J. Newman. Structural Inference of Hierarchies in Networks. In E. M. Airoldi et al. (Eds.): ICML 2006 Ws, Lecture Notes in Computer Science 4503, 1-13. Springer-Verlag, Berlin Heidelberg (2007).
igraph contains functions for fitting HRG models to a given network
(igraph_hrg_fit
), for generating networks from a given HRG
ensemble (igraph_hrg_game
, igraph_hrg_sample
), converting
an igraph graph to a HRG and back (igraph_hrg_create
, igraph_hrg_dendrogram
), for calculating a consensus tree from a
set of sampled HRGs (igraph_hrg_consensus
) and for predicting
missing edges in a network based on its HRG models (igraph_hrg_predict
).
The igraph HRG implementation is heavily based on the code published by Aaron Clauset, at his website, http://tuvalu.santafe.edu/~aaronc/hierarchy/
typedef struct igraph_hrg_t { igraph_vector_t left, right, prob, edges, vertices; } igraph_hrg_t;
A hierarchical random graph (HRG) can be given as a binary tree, where the internal vertices are labeled with real numbers.
Note that you don't necessarily have to know this internal representation for using the HRG functions, just pass the HRG objects created by one igraph function, to another igraph function.
It has the following members:
Values:
|
Vector that contains the left children of the internal tree vertices. The first vertex is always the root vertex, so the first element of the vector is the left child of the root vertex. Internal vertices are denoted with negative numbers, starting from -1 and going down, i.e. the root vertex is -1. Leaf vertices are denoted by non-negative number, starting from zero and up. |
|
Vector that contains the right children of the
vertices, with the same encoding as the |
|
The connection probabilities attached to the internal vertices, the first number belongs to the root vertex (i.e. internal vertex -1), the second to internal vertex -2, etc. |
|
The number of edges in the subtree below the given internal vertex. |
|
The number of vertices in the subtree below the given internal vertex, including itself. |
int igraph_hrg_init(igraph_hrg_t *hrg, int n);
This function must be called before passing an igraph_hrg_t
to
an igraph function.
Arguments:
|
Pointer to the HRG data structure to initialize. |
|
The number of vertices in the graph that is modeled by this HRG. It can be zero, if this is not yet known. |
Returns:
Error code. |
Time complexity: O(n), the number of vertices in the graph.
void igraph_hrg_destroy(igraph_hrg_t *hrg);
The HRG data structure can be reinitialized again with an igraph_hrg_destroy
call.
Arguments:
|
Pointer to the HRG data structure to deallocate. |
Time complexity: operating system dependent.
int igraph_hrg_size(const igraph_hrg_t *hrg);
Arguments:
|
Pointer to the HRG. |
Returns:
The number of leaf nodes in the HRG. |
Time complexity: O(1).
int igraph_hrg_resize(igraph_hrg_t *hrg, int newsize);
Arguments:
|
Pointer to an initialized (see |
|
The new size, i.e. the number of leaf nodes. |
Returns:
Error code. |
Time complexity: O(n), n is the new size.
int igraph_hrg_fit(const igraph_t *graph, igraph_hrg_t *hrg, igraph_bool_t start, int steps);
Arguments:
|
The igraph graph to fit the model to. Edge directions are ignored in directed graphs. |
|
Pointer to an initialized HRG, the result of the fitting
is stored here. It can also be used to pass a HRG to the
function, that can be used as the starting point of the Markov
Chain Monte Carlo fitting, if the |
|
Logical, whether to start the fitting from the given HRG. |
|
Integer, the number of MCMC steps to take in the fitting procedure. If this is zero, then the fitting stop is a convergence criteria is fulfilled. |
Returns:
Error code. |
Time complexity: TODO.
int igraph_hrg_consensus(const igraph_t *graph, igraph_vector_t *parents, igraph_vector_t *weights, igraph_hrg_t *hrg, igraph_bool_t start, int num_samples);
The calculation can be started from the given HRG (hrg
), or (if
start
is false), a HRG is first fitted to the given graph.
Arguments:
|
The input graph. |
|
An initialized vector, the results are stored here. For each vertex, the id of its parent vertex is stored, or -1, if the vertex is the root vertex in the tree. The first n vertex ids (from 0) refer to the original vertices of the graph, the other ids refer to vertex groups. |
|
Numeric vector, counts the number of times a given
tree split occured in the generated network samples, for each
internal vertices. The order is the same as in |
|
A hierarchical random graph. It is used as a starting
point for the sampling, if the |
|
Logical, whether to use the supplied HRG (in |
|
The number of samples to generate for creating the consensus tree. |
Returns:
Error code. |
Time complexity: TODO.
int igraph_hrg_sample(const igraph_t *input_graph, igraph_t *sample, igraph_vector_ptr_t *samples, igraph_integer_t no_samples, igraph_hrg_t *hrg, igraph_bool_t start);
Sample from a hierarchical random graph ensemble. The ensemble can
be given as a graph (input_graph
), or as a HRG object (hrg
).
If a graph is given, then first an MCMC optimization is performed
to find the optimal fitting model; then the MCMC is used to sample
the graph(s).
Arguments:
|
An igraph graph, or a null pointer. If not a
null pointer, then a HRG is first fitted to the graph, possibly
starting from the given HRG, if the |
|
Pointer to an uninitialized graph, or a null pointer. If only one sample is requested, and it is not a null pointer, then the sample is stored here. |
|
An initialized vector of pointers. If more than one
samples are requested, then they are stored here. Note that to
free this data structure, you need to call |
|
The number of samples to generate. |
|
A HRG. It is modified during the sampling. |
|
Logical, whether to start the MCMC from the given HRG. |
Returns:
Error code. |
Time complexity: TODO.
int igraph_hrg_game(igraph_t *graph, const igraph_hrg_t *hrg);
This function is a simple shortcut to igraph_hrg_sample
.
It creates a single graph, from the given HRG.
Arguments:
|
Pointer to an uninitialized graph, the new graph is created here. |
|
The hierarchical random graph model to sample from. It is modified during the MCMC process. |
Returns:
Error code. |
Time complexity: TODO.
int igraph_hrg_dendrogram(igraph_t *graph, const igraph_hrg_t *hrg);
Creates the igraph graph equivalent of an igraph_hrg_t
data
structure.
Arguments:
|
Pointer to an uninitialized graph, the result is stored here. |
|
The hierarchical random graph to convert. |
Returns:
Error code. |
Time complexity: O(n), the number of vertices in the graph.
int igraph_hrg_create(igraph_hrg_t *hrg, const igraph_t *graph, const igraph_vector_t *prob);
Arguments:
|
Pointer to an initialized |
|
The igraph graph to convert. It must be a directed binary tree, with n-1 internal and n leaf vertices. The root vertex must have in-degree zero. |
|
The vector of probabilities, this is used to label the internal nodes of the hierarchical random graph. The values corresponding to the leaves are ignored. |
Returns:
Error code. |
Time complexity: O(n), the number of vertices in the tree.
int igraph_hrg_predict(const igraph_t *graph, igraph_vector_t *edges, igraph_vector_t *prob, igraph_hrg_t *hrg, igraph_bool_t start, int num_samples, int num_bins);
Samples HRG models for a network, and estimated the probability that an edge was falsely observed as non-existent in the network.
Arguments:
|
The input graph. |
|
The list of missing edges is stored here, the first two elements are the first edge, the next two the second edge, etc. |
|
Vector of probabilies for the existence of missing
edges, in the order corresponding to |
|
A HRG, it is used as a starting point if |
|
Logical, whether to start the MCMC from the given HRG. |
|
The number of samples to generate. |
|
Controls the resolution of the edge probabilities. Higher numbers result higher resolution. |
Returns:
Error code. |
Time complexity: TODO.
← Chapter 25. Graphlets | Chapter 27. Spectral coarse graining → |