Chapter 28. Embedding of graphs

1. Spectral embedding

1.1. igraph_adjacency_spectral_embedding — Adjacency spectral embedding

int igraph_adjacency_spectral_embedding(const igraph_t *graph,
                                        igraph_integer_t n,
                                        const igraph_vector_t *weights,
                                        igraph_eigen_which_position_t which,
                                        igraph_bool_t scaled,
                                        igraph_matrix_t *X,
                                        igraph_matrix_t *Y,
                                        igraph_vector_t *D,
                                        const igraph_vector_t *cvec,
                                        igraph_arpack_options_t *options);

Spectral decomposition of the adjacency matrix of a graph. This function computes an n-dimensional Euclidean representation of the graph based on its adjacency matrix, A. The representation is computed via the singular value decomposition of the adjacency matrix, A = U D V^T. If the graph is a random dot product graph generated using latent position vectors in R^n for each vertex, the embedding provides an estimate of these latent vectors.

For undirected graphs, the latent positions are calculated as X = U^n D^(1/2), where U^n consists of the first n columns of U, and D^(1/2) is a diagonal matrix containing the square roots of the selected singular values on the diagonal.

For directed graphs, the embedding is defined as the pair X = U^n D^(1/2), Y = V^n D^(1/2). (For undirected graphs U=V, so it is sufficient to keep one of them.)

Arguments: 

graph:

The input graph, can be directed or undirected.

n:

An integer scalar, the dimension of the spectral embedding. It should be smaller than the number of vertices. The n largest non-zero singular values are used for the spectral embedding.

weights:

Optional edge weights. Supply a null pointer for unweighted graphs.

which:

Which eigenvalues (or singular values, for directed graphs) to use, possible values:

IGRAPH_EIGEN_LM

the ones with the largest magnitude

IGRAPH_EIGEN_LA

the (algebraic) largest ones

IGRAPH_EIGEN_SA

the (algebraic) smallest ones.

For directed graphs, IGRAPH_EIGEN_LM and IGRAPH_EIGEN_LA are the same because singular values are used for the ordering instead of eigenvalues.

scaled:

Whether to return X and Y (if scaled is true), or U and V.

X:

Initialized matrix, the estimated latent positions are stored here.

Y:

Initialized matrix or a null pointer. If not a null pointer, then the second half of the latent positions is stored here. (For undirected graphs, this always equals X.)

D:

Initialized vector or a null pointer. If not a null pointer, then the eigenvalues (for undirected graphs) or the singular values (for directed graphs) are stored here.

cvec:

A numeric vector whose length is the number of vertices in the graph. This vector is added to the diagonal of the adjacency matrix before performing the SVD.

options:

Options to ARPACK. See igraph_arpack_options_t for details. Note that the function overwrites the n (number of vertices), nev and which parameters and it always starts the calculation from a random start vector.

Returns: 

Error code.
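To illustrate the role of cvec, the sketch below applies the augmented matrix A + diag(cvec) to a vector. Iterative eigensolvers of the ARPACK kind work from matrix-vector products like this one, so the diagonal term can be added without building a modified matrix. The function name augmented_adjacency_matvec and the dense row-by-row storage are assumptions made for the illustration. igraph's internal callback is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of igraph).
 * Computes y = (A + diag(c)) x for a dense n x n adjacency matrix A stored
 * row by row. c plays the role of cvec: one value per vertex, added to the
 * diagonal of A. */
void augmented_adjacency_matvec(const double *A, const double *c,
                                const double *x, size_t n, double *y) {
    assert(A != NULL && c != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < n; i++) {
        double sum = c[i] * x[i];          /* diagonal contribution */
        for (size_t j = 0; j < n; j++) {
            sum += A[i * n + j] * x[j];    /* adjacency contribution */
        }
        y[i] = sum;
    }
}
```

With a vector of zeros for c, the product reduces to the plain adjacency product A x.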

1.2. igraph_laplacian_spectral_embedding — Spectral embedding of the Laplacian of a graph

int igraph_laplacian_spectral_embedding(const igraph_t *graph,
                                        igraph_integer_t n,
                                        const igraph_vector_t *weights,
                                        igraph_eigen_which_position_t which,
                                        igraph_laplacian_spectral_embedding_type_t type,
                                        igraph_bool_t scaled,
                                        igraph_matrix_t *X,
                                        igraph_matrix_t *Y,
                                        igraph_vector_t *D,
                                        igraph_arpack_options_t *options);

This function does essentially the same as igraph_adjacency_spectral_embedding, but works on the Laplacian of the graph instead of the adjacency matrix.

Arguments: 

graph:

The input graph.

n:

The number of eigenvectors (or singular vectors if the graph is directed) to use for the embedding.

weights:

Optional edge weights. Supply a null pointer for unweighted graphs.

which:

Which eigenvalues (or singular values, for directed graphs) to use, possible values:

IGRAPH_EIGEN_LM

the ones with the largest magnitude

IGRAPH_EIGEN_LA

the (algebraic) largest ones

IGRAPH_EIGEN_SA

the (algebraic) smallest ones.

For directed graphs, IGRAPH_EIGEN_LM and IGRAPH_EIGEN_LA are the same because singular values are used for the ordering instead of eigenvalues.

type:

The type of the Laplacian to use. Various definitions exist for the Laplacian of a graph, and one can choose between them with this argument. Possible values:

IGRAPH_EMBEDDING_D_A

means D - A where D is the degree matrix and A is the adjacency matrix

IGRAPH_EMBEDDING_DAD

means Di times A times Di, where Di is the inverse of the square root of the degree matrix;

IGRAPH_EMBEDDING_I_DAD

means I - Di A Di, where I is the identity matrix.

scaled:

Whether to return X and Y (if scaled is true), or U and V.

X:

Initialized matrix, the estimated latent positions are stored here.

Y:

Initialized matrix or a null pointer. If not a null pointer, then the second half of the latent positions is stored here. (For undirected graphs, this always equals X.)

D:

Initialized vector or a null pointer. If not a null pointer, then the eigenvalues (for undirected graphs) or the singular values (for directed graphs) are stored here.

options:

Options to ARPACK. See igraph_arpack_options_t for details. Note that the function overwrites the n (number of vertices), nev and which parameters and it always starts the calculation from a random start vector.

Returns: 

Error code.

See also: 

igraph_adjacency_spectral_embedding to embed the adjacency matrix.

1.3. igraph_dim_select — Dimensionality selection

int igraph_dim_select(const igraph_vector_t *sv, igraph_integer_t *dim);

Dimensionality selection for singular values using profile likelihood.

The input of the function is a numeric vector which contains the measure of "importance" for each dimension.

For spectral embedding, these are the singular values of the adjacency matrix. The singular values are assumed to be generated from a Gaussian mixture distribution with two components that have different means and the same variance. The dimensionality d is chosen to maximize the likelihood when the d largest singular values are assigned to one component of the mixture and the remaining singular values are assigned to the other component.

This function can also be used for the general separation problem, where we assume that the left and right parts of the vector come from two normal distributions with different means, and we want to find the boundary between them.

Arguments: 

sv:

A numeric vector, the ordered singular values.

dim:

The result is stored here.

Returns: 

Error code.

Time complexity: O(n), where n is the number of values in sv.

See also: