\documentclass[12pt]{article} \batchmode \newcommand{\m}[1]{{\bf{#1}}} % for matrices and vectors \newcommand{\tr}{^{\sf T}} % transpose \topmargin 0in \textheight 9in \oddsidemargin 0pt \evensidemargin 0pt \textwidth 6.5in %------------------------------------------------------------------------------- \begin{document} %------------------------------------------------------------------------------- \title{User Guide for LDL, a concise sparse Cholesky package} \author{Timothy A. Davis\thanks{ email: DrTimothyAldenDavis@gmail.com, http://www.suitesparse.com. This work was supported by the National Science Foundation, under grant CCR-0203270. Portions of the work were done while on sabbatical at Stanford University and Lawrence Berkeley National Laboratory (with funding from Stanford University and the SciDAC program). }} \input{ldl_version.tex} \maketitle %------------------------------------------------------------------------------- \begin{abstract} The {\tt LDL} software package is a set of short, concise routines for factorizing symmetric positive-definite sparse matrices, with some applicability to symmetric indefinite matrices. Its primary purpose is to illustrate much of the basic theory of sparse matrix algorithms in as concise a code as possible, including an elegant method of sparse symmetric factorization that computes the factorization row-by-row but stores it column-by-column. The entire symbolic and numeric factorization consists of less than 50 lines of code. The package is written in C, and includes a MATLAB interface. \end{abstract} %------------------------------------------------------------------------------- %------------------------------------------------------------------------------- \section{Overview} %------------------------------------------------------------------------------- {\tt LDL} is a set of short, concise routines that compute the $\m{LDL}\tr$ factorization of a sparse symmetric matrix $\m{A}$. Its primary purpose is to illustrate much of the basic theory of sparse matrix algorithms in as compact a code as possible, including an elegant method of sparse symmetric factorization (related to \cite{Liu86c,Liu91}). The lower triangular factor $\m{L}$ is computed row-by-row, in contrast to the conventional column-by-column method. Although it does not achieve the same level of performance as methods based on dense matrix kernels (such as \cite{NgPeyton93,RothbergGupta91}), its performance is competitive with column-by-column methods that do not use dense kernels \cite{GeorgeLiu79, GeorgeLiu, GilbertMolerSchreiber}. Section~\ref{Algorithm} gives a brief description of the algorithm used in the symbolic and numeric factorization. A more detailed tutorial-level discussion may be found in \cite{Stewart03}. Details of the concise implementation of this method are given in Section~\ref{Implementation}. Sections~\ref{MATLAB}~and~\ref{C} give an overview of how to use the package in MATLAB and in a stand-alone C program. %------------------------------------------------------------------------------- \section{Algorithm} \label{Algorithm} %------------------------------------------------------------------------------- The underlying numerical algorithm is described below. The $k$th step solves a lower triangular system of dimension $k-1$ to compute the $k$th row of $\m{L}$ and the $d_{kk}$ entry of the diagonal matrix $\m{D}$. Colon notation is used for submatrices. For example, $\m{L}_{k,1:k-1}$ refers to the first $k-1$ columns of the $k$th row of $\m{L}$. Similarly, $\m{L}_{1:k-1,1:k-1}$ refers to the leading $(k-1)$-by-$(k-1)$ submatrix of $\m{L}$. %--------------- \vspace{-0.2in} \begin{tabbing} \hspace{2em} \= \hspace{2em} \= \hspace{2em} \= \\ {\bf Algorithm~1 ($\m{LDL}\tr$ factorization of a $n$-by-$n$ symmetric matrix $\m{A}$)} \\ \> {\bf for} $k = 1$ {\bf to} $n$ \\ \>\> (step 1) Solve $\m{L}_{1:k-1,1:k-1}\m{y} = \m{A}_{1:k-1,k}$ for $\m{y}$ \\ \>\> (step 2) $\m{L}_{k,1:k-1} = (\m{D}_{1:k-1,1:k-1}^{-1} \m{y})\tr$ \\ \>\> (step 3) $l_{kk} = 1$ \\ \>\> (step 4) $d_{kk} = a_{kk} - \m{L}_{k,1:k-1}\m{y}$ \\ \> {\bf end for} \end{tabbing} %--------------- The algorithm computes an $\m{LDL}\tr$ factorization without numerical pivoting. It can thus factorize any symmetric positive definite matrix, and any symmetric indefinite matrix whose leading minors are all well-conditioned. When $\m{A}$ and $\m{L}$ are sparse, step 1 of Algorithm~1 requires a triangular solve of the form $\m{Lx}=\m{b}$, where all three terms in the equation are sparse. This is the most costly step of the Algorithm. Steps 2 through 4 are fairly straightforward. Let ${\cal X}$ and ${\cal B}$ refer to the set of indices of nonzero entries in $\m{x}$ and $\m{b}$, respectively, in the lower triangular system $\m{Lx}=\m{b}$. To compute $\m{x}$ efficiently the nonzero pattern ${\cal X}$ must be found first. In the general case when $\m{L}$ is arbitrary \cite{GilbertPeierls88}, the nonzero pattern ${\cal X}$ is the set of nodes reachable via paths in the graph $G_L$ from all nodes in the set ${\cal B}$, and where the graph $G_L$ has $n$ nodes and a directed edge $(j,i)$ if and only if $l_{ij}$ is nonzero. To compute the numerical solution to $\m{Lx}=\m{b}$ by accessing the columns of $\m{L}$ one at a time, ${\cal X}$ can be traversed in any topological order of the subgraph of $G_L$ consisting of nodes in ${\cal X}$. That is, $x_j$ must be computed before $x_i$ if there is a path from $j$ to $i$ in $G_L$. The natural order ($1, 2, \ldots, n$) is one such ordering, but that requires a costly sort of ${\cal X}$. With a graph traversal and topological sort, the solution of $\m{Lx}=\m{b}$ can be computed using Algorithm~2 below. The computation of ${\cal X}$ and $\m{x}$ both take time proportional to the floating-point operation count. %--------------- \vspace{-0.2in} \begin{tabbing} \hspace{2em} \= \hspace{2em} \= \hspace{2em} \= \\ {\bf Algorithm~2 (Solve $\m{Lx}=\m{b}$, where $\m{L}$ is lower triangular with unit diagonal)} \\ \> ${\cal X} = \mbox{Reach}_{G_L} ({\cal B})$ \\ \> $\m{x} = \m{b}$ \\ \> {\bf for} $i \in {\cal X}$ in any topological order \\ \>\> $\m{x}_{i+1:n} = \m{x}_{i+1:n} - \m{L}_{i+1:n,i} x_i$ \\ \> {\bf end for} \end{tabbing} %--------------- The general result also governs the pattern of $\m{y}$ in Algorithm~1. However, in this case $\m{L}$ arises from a sparse Cholesky factorization, and is governed by the elimination tree \cite{Liu90a}. A general graph traversal is not required. In the elimination tree, the parent of node $i$ is the smallest $j > i$ such that $l_{ji}$ is nonzero. Node $i$ has no parent if column $i$ of $\m{L}$ is completely zero below the diagonal; $i$ is a root of the elimination tree in this case. The nonzero pattern of $\m{x}$ is the union of all the nodes on the paths from any node $i$ (where $b_i$ is nonzero) to the root of the elimination tree \cite[Thm 2.4]{Liu86c}. It is referred to here as a tree, but in general it can be a forest. Rather than a general topological sort of the subgraph of $G_L$ consisting nodes reachable from nodes in ${\cal B}$, a simpler tree traversal can be used. First, select any nonzero entry $b_i$ and follow the path from $i$ to the root of tree. Nodes along this path are marked and placed in a stack, with $i$ at the top of the stack and the root at the bottom. Repeat for every other nonzero entry in $b_i$, in arbitrary order, but stop just before reaching a marked node (the result can be empty if $i$ is already in the stack). The stack now contains ${\cal X}$, a topological ordering of the nonzero pattern of $\m{x}$, which can be used in Algorithm~2 to solve $\m{Lx}=\m{b}$. The time to compute ${\cal X}$ using an elimination tree traversal is much faster than the general graph traversal, taking time proportional to the size of ${\cal X}$ rather than the number of floating-point operations required to compute $\m{x}$. In the $k$th step of the factorization, the set ${\cal X}$ becomes the nonzero pattern of row $k$ of $\m{L}$. This step requires the elimination tree of $\m{L}_{1:k-1,1:k-1}$, and must construct the elimination tree of $\m{L}_{1:k,1:k}$ for step $k+1$. Recall that the parent of $i$ in the tree is the smallest $j$ such that $i < j$ and $l_{ji} \ne 0$. Thus, if any node $i$ already has a parent $j$, then $j$ will remain the parent of $i$ in the elimination trees of all other larger leading submatrices of $\m{L}$, and in the elimination tree of $\m{L}$ itself. If $l_{ki} \ne 0$ and $i$ does not have a parent in the elimination tree of $\m{L}_{1:k-1,1:k-1}$, then the parent of $i$ is $k$ in the elimination tree of $\m{L}_{1:k,1:k}$. Node $k$ becomes the parent of any node $i \in {\cal X}$ that does not yet have a parent. Since Algorithm~2 traverses $\m{L}$ in column order, $\m{L}$ is stored in a conventional sparse column representation. Each column $j$ is stored as a list of nonzero values and their corresponding row indices. When row $k$ is computed, the new entries can be placed at the end of each list. As a by-product of computing $\m{L}$ one row at a time, the columns of $\m{L}$ are computed in a sorted manner. This is a convenient form of the output. MATLAB requires the columns of its sparse matrices to be sorted, for example. Sorted columns improve the speed of Algorithm~2, since the memory access pattern is more regular. The conventional column-by-column algorithm \cite{GeorgeLiu79,GeorgeLiu} does not produce columns of $\m{L}$ with sorted row indices. A simple symbolic pre-analysis can be obtained by repeating the subtree traversals. All that is required to compute the nonzero pattern of the $k$th row of $\m{L}$ is the partially constructed elimination tree and the nonzero pattern of the $k$th column of $\m{A}$. This is computed in time proportional to the size of this set, using the elimination tree traversal. Once constructed, the number of nonzeros in each column of $\m{L}$ is incremented, for each entry in ${\cal X}$, and then ${\cal X}$ is discarded. The set ${\cal X}$ need not be constructed in topological order, so no stack is required. The run time of the symbolic analysis algorithm is thus proportional to the number of nonzeros in $\m{L}$. This is more costly than the optimal algorithm \cite{GilbertNgPeyton94}, which takes time essentially proportional to the number of nonzeros in $\m{A}$. The memory requirements are just the matrix $\m{A}$ and a few size-$n$ integer arrays. The result of the algorithm is the elimination tree, a count of the number of nonzeros in each column of $\m{L}$, and the cumulative sum of the column counts. %------------------------------------------------------------------------------- \section{Implementation} \label{Implementation} %------------------------------------------------------------------------------- Because of its simplicity, the implementation of this algorithm leads to a very short, concise code. The symbolic analysis routine {\tt ldl\_symbolic} shown in Figure~\ref{ldlsymbolic} consists of only 18 lines of executable C code. This includes 5 lines of code to allow for a sparsity-preserving ordering $\m{P}$ so that either $\m{A}$ or $\m{PAP}\tr$ can be analyzed, 3 lines of code to compute the cumulative sum of the column counts, and one line of code to speed up a {\tt for} loop. An additional line of code allows for a more general form of the input sparse matrix $\m{A}$. The {\tt n}-by-{\tt n} sparse matrix $\m{A}$ is provided in compressed column form as an {\tt int} array {\tt Ap} of length {\tt n+1}, an {\tt int} array {\tt Ai} of length {\tt nz}, and a {\tt double} array {\tt Ax} also of length {\tt nz}, where {\tt nz} is the number of entries in the matrix. The numerical values of entries in column $j$ are stored in {\tt Ax[Ap[j]} $\ldots$ {\tt Ap[j+1]-1]} and the corresponding row indices are in {\tt Ai[Ap[j]} $\ldots$ {\tt Ap[j+1]-1]}. With {\tt Ap[0] = 0}, the number of entries in the matrix is {\tt nz = Ap[n]}. If no fill-reducing ordering {\tt P} is provided, only entries in the upper triangular part of $\m{A}$ are considered. If {\tt P} is provided and row/column {\tt i} of the matrix $\m{A}$ is the {\tt k}-th row/column of $\m{PAP}\tr$, then {\tt P[k]=i}. Only entries in the upper triangular part of $\m{PAP}\tr$ are considered. These entries may be in the lower triangular part of $\m{A}$, so to ensure that the correct matrix is factorized, all entries of $\m{A}$ should be provided when using the permutation input {\tt P}. The outputs of {\tt ldl\_symbolic} are three size-{\tt n} arrays: {\tt Parent} holds the elimination tree, {\tt Lnz} holds the counts of the number of entries in each column of $\m{L}$, and {\tt Lp} holds the cumulative sum of {\tt Lnz}. The size-{\tt n} array {\tt Flag} is used as workspace. None of the output or workspace arrays need to be initialized. \begin{figure} \caption{{\tt ldl\_symbolic:} finding the elimination tree and column counts} \label{ldlsymbolic} {\scriptsize \begin{verbatim} void ldl_symbolic ( int n, /* A and L are n-by-n, where n >= 0 */ int Ap [ ], /* input of size n+1, not modified */ int Ai [ ], /* input of size nz=Ap[n], not modified */ int Lp [ ], /* output of size n+1, not defined on input */ int Parent [ ], /* output of size n, not defined on input */ int Lnz [ ], /* output of size n, not defined on input */ int Flag [ ], /* workspace of size n, not defn. on input or output */ int P [ ], /* optional input of size n */ int Pinv [ ] /* optional output of size n (used if P is not NULL) */ ) { int i, k, p, kk, p2 ; if (P) { /* If P is present then compute Pinv, the inverse of P */ for (k = 0 ; k < n ; k++) { Pinv [P [k]] = k ; } } for (k = 0 ; k < n ; k++) { /* L(k,:) pattern: all nodes reachable in etree from nz in A(0:k-1,k) */ Parent [k] = -1 ; /* parent of k is not yet known */ Flag [k] = k ; /* mark node k as visited */ Lnz [k] = 0 ; /* count of nonzeros in column k of L */ kk = (P) ? (P [k]) : (k) ; /* kth original, or permuted, column */ p2 = Ap [kk+1] ; for (p = Ap [kk] ; p < p2 ; p++) { /* A (i,k) is nonzero (original or permuted A) */ i = (Pinv) ? (Pinv [Ai [p]]) : (Ai [p]) ; if (i < k) { /* follow path from i to root of etree, stop at flagged node */ for ( ; Flag [i] != k ; i = Parent [i]) { /* find parent of i if not yet determined */ if (Parent [i] == -1) Parent [i] = k ; Lnz [i]++ ; /* L (k,i) is nonzero */ Flag [i] = k ; /* mark i as visited */ } } } } /* construct Lp index array from Lnz column counts */ Lp [0] = 0 ; for (k = 0 ; k < n ; k++) { Lp [k+1] = Lp [k] + Lnz [k] ; } } \end{verbatim} } \end{figure} The {\tt ldl\_numeric} numeric factorization routine shown in Figure~\ref{ldlnumeric} consists of only 31 lines of executable code. It includes this same subtree traversal algorithm as {\tt ldl\_symbolic}, except that each path is placed on a stack that holds nonzero pattern of the $k$th row of $\m{L}$. This traversal is followed by a sparse forward solve using this pattern, and all of the nonzero entries in the resulting $k$th row of $\m{L}$ are appended to their respective columns in the data structure of $\m{L}$. \begin{figure} \caption{{\tt ldl\_numeric:} numeric factorization} \label{ldlnumeric} {\scriptsize \begin{verbatim} int ldl_numeric /* returns n if successful, k if D (k,k) is zero */ ( int n, /* A and L are n-by-n, where n >= 0 */ int Ap [ ], /* input of size n+1, not modified */ int Ai [ ], /* input of size nz=Ap[n], not modified */ double Ax [ ], /* input of size nz=Ap[n], not modified */ int Lp [ ], /* input of size n+1, not modified */ int Parent [ ], /* input of size n, not modified */ int Lnz [ ], /* output of size n, not defn. on input */ int Li [ ], /* output of size lnz=Lp[n], not defined on input */ double Lx [ ], /* output of size lnz=Lp[n], not defined on input */ double D [ ], /* output of size n, not defined on input */ double Y [ ], /* workspace of size n, not defn. on input or output */ int Pattern [ ], /* workspace of size n, not defn. on input or output */ int Flag [ ], /* workspace of size n, not defn. on input or output */ int P [ ], /* optional input of size n */ int Pinv [ ] /* optional input of size n */ ) { double yi, l_ki ; int i, k, p, kk, p2, len, top ; for (k = 0 ; k < n ; k++) { /* compute nonzero Pattern of kth row of L, in topological order */ Y [k] = 0.0 ; /* Y(0:k) is now all zero */ top = n ; /* stack for pattern is empty */ Flag [k] = k ; /* mark node k as visited */ Lnz [k] = 0 ; /* count of nonzeros in column k of L */ kk = (P) ? (P [k]) : (k) ; /* kth original, or permuted, column */ p2 = Ap [kk+1] ; for (p = Ap [kk] ; p < p2 ; p++) { i = (Pinv) ? (Pinv [Ai [p]]) : (Ai [p]) ; /* get A(i,k) */ if (i <= k) { Y [i] += Ax [p] ; /* scatter A(i,k) into Y (sum duplicates) */ for (len = 0 ; Flag [i] != k ; i = Parent [i]) { Pattern [len++] = i ; /* L(k,i) is nonzero */ Flag [i] = k ; /* mark i as visited */ } while (len > 0) Pattern [--top] = Pattern [--len] ; } } /* compute numerical values kth row of L (a sparse triangular solve) */ D [k] = Y [k] ; /* get D(k,k) and clear Y(k) */ Y [k] = 0.0 ; for ( ; top < n ; top++) { i = Pattern [top] ; /* Pattern [top:n-1] is pattern of L(:,k) */ yi = Y [i] ; /* get and clear Y(i) */ Y [i] = 0.0 ; p2 = Lp [i] + Lnz [i] ; for (p = Lp [i] ; p < p2 ; p++) { Y [Li [p]] -= Lx [p] * yi ; } l_ki = yi / D [i] ; /* the nonzero entry L(k,i) */ D [k] -= l_ki * yi ; Li [p] = k ; /* store L(k,i) in column form of L */ Lx [p] = l_ki ; Lnz [i]++ ; /* increment count of nonzeros in col i */ } if (D [k] == 0.0) return (k) ; /* failure, D(k,k) is zero */ } return (n) ; /* success, diagonal of D is all nonzero */ } \end{verbatim} } \end{figure} After the matrix is factorized, the {\tt ldl\_lsolve}, {\tt ldl\_dsolve}, and {\tt ldl\_ltsolve} routines shown in Figure~\ref{ldlsolve} are provided to solve $\m{Lx}=\m{b}$, $\m{Dx}=\m{b}$, and $\m{L}\tr\m{x}=\m{b}$, respectively. Together, they solve $\m{Ax}=\m{b}$, and consist of only 10 lines of executable code. If a fill-reducing permutation is used, {\tt ldl\_perm} and {\tt ldl\_permt} must be used to permute $\m{b}$ and $\m{x}$ accordingly. \begin{figure} \caption{Solve routines} \label{ldlsolve} {\scriptsize \begin{verbatim} void ldl_lsolve ( int n, /* L is n-by-n, where n >= 0 */ double X [ ], /* size n. right-hand-side on input, soln. on output */ int Lp [ ], /* input of size n+1, not modified */ int Li [ ], /* input of size lnz=Lp[n], not modified */ double Lx [ ] /* input of size lnz=Lp[n], not modified */ ) { int j, p, p2 ; for (j = 0 ; j < n ; j++) { p2 = Lp [j+1] ; for (p = Lp [j] ; p < p2 ; p++) { X [Li [p]] -= Lx [p] * X [j] ; } } } void ldl_dsolve ( int n, /* D is n-by-n, where n >= 0 */ double X [ ], /* size n. right-hand-side on input, soln. on output */ double D [ ] /* input of size n, not modified */ ) { int j ; for (j = 0 ; j < n ; j++) { X [j] /= D [j] ; } } void ldl_ltsolve ( int n, /* L is n-by-n, where n >= 0 */ double X [ ], /* size n. right-hand-side on input, soln. on output */ int Lp [ ], /* input of size n+1, not modified */ int Li [ ], /* input of size lnz=Lp[n], not modified */ double Lx [ ] /* input of size lnz=Lp[n], not modified */ ) { int j, p, p2 ; for (j = n-1 ; j >= 0 ; j--) { p2 = Lp [j+1] ; for (p = Lp [j] ; p < p2 ; p++) { X [j] -= Lx [p] * X [Li [p]] ; } } } \end{verbatim} } \end{figure} In addition to appearing as a Collected Algorithm of the ACM \cite{Davis05}, {\tt LDL} is available at http://www.suitesparse.com. %------------------------------------------------------------------------------- \section{Using LDL in MATLAB} \label{MATLAB} %------------------------------------------------------------------------------- The simplest way to use {\tt LDL} is within MATLAB. Once the {\tt ldlsparse} mexFunction is compiled and installed, the MATLAB statement {\tt [L, D, Parent, fl] = ldlsparse (A)} returns the sparse factorization {\tt A = (L+I)*D*(L+I)'}, where {\tt L} is lower triangular, {\tt D} is a diagonal matrix, and {\tt I} is the {\tt n}-by-{\tt n} identity matrix ({\tt ldlsparse} does not return the unit diagonal of {\tt L}). The elimination tree is returned in {\tt Parent}. If no zero on the diagonal of {\tt D} is encountered, {\tt fl} is the floating-point operation count. Otherwise, {\tt D(-fl,-fl)} is the first zero entry encountered. Let {\tt d=-fl}. The function returns the factorization of {\tt A (1:d,1:d)}, where rows {\tt d+1} to {\tt n} of {\tt L} and {\tt D} are all zero. If a sparsity preserving permutation {\tt P} is passed, {\tt [L, D, Parent, fl] = ldlsparse (A,P)} operates on {\tt A(P,P)} without forming it explicitly. The statement {\tt x = ldlsparse (A, [ ], b)} is roughly equivalent to {\tt x = A}$\backslash${\tt b}, when {\tt A} is sparse, real, and symmetric. The $\m{LDL}\tr$ factorization of {\tt A} is performed. If {\tt P} is provided, {\tt x = ldlsparse (A, P, b)} still performs {\tt x = A}$\backslash${\tt b}, except that {\tt A(P,P)} is factorized instead. %------------------------------------------------------------------------------- \section{Using LDL in a C program} \label{C} %------------------------------------------------------------------------------- To compile the library, a {\tt cmake} build system is provided. See the instructions in {\tt SuiteSparse/README.txt}. The C-callable {\tt LDL} library consists of nine user-callable routines and one include file. \begin{itemize} \item {\tt ldl\_symbolic}: given the nonzero pattern of a sparse symmetric matrix $\m{A}$ and an optional permutation $\m{P}$, analyzes either $\m{A}$ or $\m{PAP}\tr$, and returns the elimination tree, the number of nonzeros in each column of $\m{L}$, and the {\tt Lp} array for the sparse matrix data structure for $\m{L}$. Duplicate entries are allowed in the columns of $\m{A}$, and the row indices in each column need not be sorted. Providing a sparsity-preserving ordering is critical for obtaining good performance. A minimum degree ordering (such as AMD \cite{AmestoyDavisDuff96,AmestoyDavisDuff03}) or a graph-partitioning based ordering are appropriate. \item {\tt ldl\_numeric}: given {\tt Lp} and the elimination tree computed by {\tt ldl\_symbolic}, and an optional permutation $\m{P}$, returns the numerical factorization of $\m{A}$ or $\m{PAP}\tr$. Duplicate entries are allowed in the columns of $\m{A}$ (any duplicate entries are summed), and the row indices in each column need not be sorted. The data structure for $\m{L}$ is the same as $\m{A}$, except that no duplicates appear, and each column has sorted row indices. \item {\tt ldl\_lsolve}: given the factor $\m{L}$ computed by {\tt ldl\_numeric}, solves the linear system $\m{Lx}=\m{b}$, where $\m{x}$ and $\m{b}$ are full $n$-by-1 vectors. \item {\tt ldl\_dsolve}: given the factor $\m{D}$ computed by {\tt ldl\_numeric}, solves the linear system $\m{Dx}=\m{b}$. \item {\tt ldl\_ltsolve}: given the factor $\m{L}$ computed by {\tt ldl\_numeric}, solves the linear system $\m{L}\tr\m{x}=\m{b}$. \item {\tt ldl\_perm}: given a vector $\m{b}$ and a permutation $\m{P}$, returns $\m{x}=\m{Pb}$. \item {\tt ldl\_permt}: given a vector $\m{b}$ and a permutation $\m{P}$, returns $\m{x}=\m{P}\tr\m{b}$. \item {\tt ldl\_valid\_perm}: Except for checking if the diagonal of $\m{D}$ is zero, none of the above routines check their inputs for errors. This routine checks the validity of a permutation $\m{P}$. \item {\tt ldl\_valid\_matrix}: checks if a matrix $\m{A}$ is valid as input to {\tt ldl\_symbolic} and {\tt ldl\_numeric}. \end{itemize} Note that the primary input to the {\tt ldl\_symbolic} and {\tt ldl\_numeric} is the sparse matrix $\m{A}$. It is provided in column-oriented form, and only the upper triangular part is accessed. This is slightly different than the primary output: the matrix $\m{L}$, which is lower triangular in column-oriented form. If you wish to factorize a symmetric matrix $\m{A}$ for which only the lower triangular part is supplied, you would need to transpose $\m{A}$ before passing it {\tt ldl\_symbolic} and {\tt ldl\_numeric}. An additional set of routines is available for use in a 64-bit environment. Each routine name changes uniformly; {\tt ldl\_symbolic} becomes {\tt ldl\_l\_symbolic}, and each {\tt int} parameter becomes type {\tt SuiteSparse\_long}. The {\tt SuiteSparse\_long} type is {\tt long}, except for Microsoft Windows 64, where it becomes {\tt \_\_int64}. \begin{figure} \caption{Example of use} \label{ldlsimple} {\scriptsize \begin{verbatim} #include #include "ldl.h" #define N 10 /* A is 10-by-10 */ #define ANZ 19 /* # of nonzeros on diagonal and upper triangular part of A */ #define LNZ 13 /* # of nonzeros below the diagonal of L */ int main (void) { /* only the upper triangular part of A is required */ int Ap [N+1] = {0, 1, 2, 3, 4, 6, 7, 9, 11, 15, ANZ}, Ai [ANZ] = {0, 1, 2, 3, 1,4, 5, 4,6, 4,7, 0,4,7,8, 1,4,6,9 } ; double Ax [ANZ] = {1.7, 1., 1.5, 1.1, .02,2.6, 1.2, .16,1.3, .09,1.6, .13,.52,.11,1.4, .01,.53,.56,3.1}, b [N] = {.287, .22, .45, .44, 2.486, .72, 1.55, 1.424, 1.621, 3.759}; double Lx [LNZ], D [N], Y [N] ; int Li [LNZ], Lp [N+1], Parent [N], Lnz [N], Flag [N], Pattern [N], d, i ; /* factorize A into LDL' (P and Pinv not used) */ ldl_symbolic (N, Ap, Ai, Lp, Parent, Lnz, Flag, NULL, NULL) ; printf ("Nonzeros in L, excluding diagonal: %d\n", Lp [N]) ; d = ldl_numeric (N, Ap, Ai, Ax, Lp, Parent, Lnz, Li, Lx, D, Y, Pattern, Flag, NULL, NULL) ; if (d == N) { /* solve Ax=b, overwriting b with the solution x */ ldl_lsolve (N, b, Lp, Li, Lx) ; ldl_dsolve (N, b, D) ; ldl_ltsolve (N, b, Lp, Li, Lx) ; for (i = 0 ; i < N ; i++) printf ("x [%d] = %g\n", i, b [i]) ; } else { printf ("ldl_numeric failed, D (%d,%d) is zero\n", d, d) ; } return (0) ; } \end{verbatim} } \end{figure} The program in Figure~\ref{ldlsimple} illustrates the basic usage of the {\tt LDL} routines. It analyzes and factorizes the sparse symmetric positive-definite matrix {\small \[ \m{A} = \left[ \begin{array}{cccccccccc} 1.7 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & .13 & 0 \\ 0 & 1. & 0 & 0 & .02 & 0 & 0 & 0 & 0 & .01 \\ 0 & 0 & 1.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1.1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & .02 & 0 & 0 & 2.6 & 0 & .16 & .09 & .52 & .53 \\ 0 & 0 & 0 & 0 & 0 & 1.2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & .16 & 0 & 1.3 & 0 & 0 & .56 \\ 0 & 0 & 0 & 0 & .09 & 0 & 0 & 1.6 & .11 & 0 \\ .13 & 0 & 0 & 0 & .52 & 0 & 0 & .11 & 1.4 & 0 \\ 0 & .01 & 0 & 0 & .53 & 0 & .56 & 0 & 0 & 3.1 \\ \end{array} \right] \] } and then solves a system $\m{Ax}=\m{b}$ whose true solution is $x_i = i/10$. Note that {\tt Li} and {\tt Lx} are statically allocated. Normally they would be allocated after their size, {\tt Lp[n]}, is determined by {\tt ldl\_symbolic}. More example programs are included with the {\tt LDL} package. \section{Acknowledgments} I would like to thank Pete Stewart for his comments on an earlier draft of this software and its accompanying paper. \newpage \bibliographystyle{plain} \bibliography{ldl} \end{document}