@node FFTW Reference, Multi-threaded FFTW, Other Important Topics, Top @chapter FFTW Reference This chapter provides a complete reference for all sequential (i.e., one-processor) FFTW functions. Parallel transforms are described in later chapters. @menu * Data Types and Files:: * Using Plans:: * Basic Interface:: * Advanced Interface:: * Guru Interface:: * New-array Execute Functions:: * Wisdom:: * What FFTW Really Computes:: @end menu @c ------------------------------------------------------------ @node Data Types and Files, Using Plans, FFTW Reference, FFTW Reference @section Data Types and Files All programs using FFTW should include its header file: @example #include @end example You must also link to the FFTW library. On Unix, this means adding @code{-lfftw3 -lm} at the @emph{end} of the link command. @menu * Complex numbers:: * Precision:: * Memory Allocation:: @end menu @c =========> @node Complex numbers, Precision, Data Types and Files, Data Types and Files @subsection Complex numbers The default FFTW interface uses @code{double} precision for all floating-point numbers, and defines a @code{fftw_complex} type to hold complex numbers as: @example typedef double fftw_complex[2]; @end example @tindex fftw_complex Here, the @code{[0]} element holds the real part and the @code{[1]} element holds the imaginary part. Alternatively, if you have a C compiler (such as @code{gcc}) that supports the C99 revision of the ANSI C standard, you can use C's new native complex type (which is binary-compatible with the typedef above). In particular, if you @code{#include } @emph{before} @code{}, then @code{fftw_complex} is defined to be the native complex type and you can manipulate it with ordinary arithmetic (e.g. @code{x = y * (3+4*I)}, where @code{x} and @code{y} are @code{fftw_complex} and @code{I} is the standard symbol for the imaginary unit); @cindex C99 C++ has its own @code{complex} template class, defined in the standard @code{} header file. Reportedly, the C++ standards committee has recently agreed to mandate that the storage format used for this type be binary-compatible with the C99 type, i.e. an array @code{T[2]} with consecutive real @code{[0]} and imaginary @code{[1]} parts. (See report @uref{http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2002/n1388.pdf WG21/N1388}.) Although not part of the official standard as of this writing, the proposal stated that: ``This solution has been tested with all current major implementations of the standard library and shown to be working.'' To the extent that this is true, if you have a variable @code{complex *x}, you can pass it directly to FFTW via @code{reinterpret_cast(x)}. @cindex C++ @cindex portability @c =========> @node Precision, Memory Allocation, Complex numbers, Data Types and Files @subsection Precision @cindex precision You can install single and long-double precision versions of FFTW, which replace @code{double} with @code{float} and @code{long double}, respectively (@pxref{Installation and Customization}). To use these interfaces, you: @itemize @bullet @item Link to the single/long-double libraries; on Unix, @code{-lfftw3f} or @code{-lfftw3l} instead of (or in addition to) @code{-lfftw3}. (You can link to the different-precision libraries simultaneously.) @item Include the @emph{same} @code{} header file. @item Replace all lowercase instances of @samp{fftw_} with @samp{fftwf_} or @samp{fftwl_} for single or long-double precision, respectively. (@code{fftw_complex} becomes @code{fftwf_complex}, @code{fftw_execute} becomes @code{fftwf_execute}, etcetera.) @item Uppercase names, i.e. names beginning with @samp{FFTW_}, remain the same. @item Replace @code{double} with @code{float} or @code{long double} for subroutine parameters. @end itemize Depending upon your compiler and/or hardware, @code{long double} may not be any more precise than @code{double} (or may not be supported at all, although it is standard in C99). @cindex C99 We also support using the nonstandard @code{__float128} quadruple-precision type provided by recent versions of @code{gcc} on 32- and 64-bit x86 hardware (@pxref{Installation and Customization}). To use this type, link with @code{-lfftw3q -lquadmath -lm} (the @code{libquadmath} library provided by @code{gcc} is needed for quadruple-precision trigonometric functions) and use @samp{fftwq_} identifiers. @c =========> @node Memory Allocation, , Precision, Data Types and Files @subsection Memory Allocation @example void *fftw_malloc(size_t n); void fftw_free(void *p); @end example @findex fftw_malloc @findex fftw_free These are functions that behave identically to @code{malloc} and @code{free}, except that they guarantee that the returned pointer obeys any special alignment restrictions imposed by any algorithm in FFTW (e.g. for SIMD acceleration). @xref{SIMD alignment and fftw_malloc}. @cindex alignment Data allocated by @code{fftw_malloc} @emph{must} be deallocated by @code{fftw_free} and not by the ordinary @code{free}. These routines simply call through to your operating system's @code{malloc} or, if necessary, its aligned equivalent (e.g. @code{memalign}), so you normally need not worry about any significant time or space overhead. You are @emph{not required} to use them to allocate your data, but we strongly recommend it. Note: in C++, just as with ordinary @code{malloc}, you must typecast the output of @code{fftw_malloc} to whatever pointer type you are allocating. @cindex C++ We also provide the following two convenience functions to allocate real and complex arrays with @code{n} elements, which are equivalent to @code{(double *) fftw_malloc(sizeof(double) * n)} and @code{(fftw_complex *) fftw_malloc(sizeof(fftw_complex) * n)}, respectively: @example double *fftw_alloc_real(size_t n); fftw_complex *fftw_alloc_complex(size_t n); @end example @findex fftw_alloc_real @findex fftw_alloc_complex The equivalent functions in other precisions allocate arrays of @code{n} elements in that precision. e.g. @code{fftwf_alloc_real(n)} is equivalent to @code{(float *) fftwf_malloc(sizeof(float) * n)}. @cindex precision @c ------------------------------------------------------------ @node Using Plans, Basic Interface, Data Types and Files, FFTW Reference @section Using Plans Plans for all transform types in FFTW are stored as type @code{fftw_plan} (an opaque pointer type), and are created by one of the various planning routines described in the following sections. @tindex fftw_plan An @code{fftw_plan} contains all information necessary to compute the transform, including the pointers to the input and output arrays. @example void fftw_execute(const fftw_plan plan); @end example @findex fftw_execute This executes the @code{plan}, to compute the corresponding transform on the arrays for which it was planned (which must still exist). The plan is not modified, and @code{fftw_execute} can be called as many times as desired. To apply a given plan to a different array, you can use the new-array execute interface. @xref{New-array Execute Functions}. @code{fftw_execute} (and equivalents) is the only function in FFTW guaranteed to be thread-safe; see @ref{Thread safety}. This function: @example void fftw_destroy_plan(fftw_plan plan); @end example @findex fftw_destroy_plan deallocates the @code{plan} and all its associated data. FFTW's planner saves some other persistent data, such as the accumulated wisdom and a list of algorithms available in the current configuration. If you want to deallocate all of that and reset FFTW to the pristine state it was in when you started your program, you can call: @example void fftw_cleanup(void); @end example @findex fftw_cleanup After calling @code{fftw_cleanup}, all existing plans become undefined, and you should not attempt to execute them nor to destroy them. You can however create and execute/destroy new plans, in which case FFTW starts accumulating wisdom information again. @code{fftw_cleanup} does not deallocate your plans, however. To prevent memory leaks, you must still call @code{fftw_destroy_plan} before executing @code{fftw_cleanup}. Occasionally, it may useful to know FFTW's internal ``cost'' metric that it uses to compare plans to one another; this cost is proportional to an execution time of the plan, in undocumented units, if the plan was created with the @code{FFTW_MEASURE} or other timing-based options, or alternatively is a heuristic cost function for @code{FFTW_ESTIMATE} plans. (The cost values of measured and estimated plans are not comparable, being in different units. Also, costs from different FFTW versions or the same version compiled differently may not be in the same units. Plans created from wisdom have a cost of 0 since no timing measurement is performed for them. Finally, certain problems for which only one top-level algorithm was possible may have required no measurements of the cost of the whole plan, in which case @code{fftw_cost} will also return 0.) The cost metric for a given plan is returned by: @example double fftw_cost(const fftw_plan plan); @end example @findex fftw_cost The following two routines are provided purely for academic purposes (that is, for entertainment). @example void fftw_flops(const fftw_plan plan, double *add, double *mul, double *fma); @end example @findex fftw_flops Given a @code{plan}, set @code{add}, @code{mul}, and @code{fma} to an exact count of the number of floating-point additions, multiplications, and fused multiply-add operations involved in the plan's execution. The total number of floating-point operations (flops) is @code{add + mul + 2*fma}, or @code{add + mul + fma} if the hardware supports fused multiply-add instructions (although the number of FMA operations is only approximate because of compiler voodoo). (The number of operations should be an integer, but we use @code{double} to avoid overflowing @code{int} for large transforms; the arguments are of type @code{double} even for single and long-double precision versions of FFTW.) @example void fftw_fprint_plan(const fftw_plan plan, FILE *output_file); void fftw_print_plan(const fftw_plan plan); char *fftw_sprint_plan(const fftw_plan plan); @end example @findex fftw_fprint_plan @findex fftw_print_plan This outputs a ``nerd-readable'' representation of the @code{plan} to the given file, to @code{stdout}, or two a newly allocated NUL-terminated string (which the caller is responsible for deallocating with @code{free}), respectively. @c ------------------------------------------------------------ @node Basic Interface, Advanced Interface, Using Plans, FFTW Reference @section Basic Interface @cindex basic interface Recall that the FFTW API is divided into three parts@footnote{@i{Gallia est omnis divisa in partes tres} (Julius Caesar).}: the @dfn{basic interface} computes a single transform of contiguous data, the @dfn{advanced interface} computes transforms of multiple or strided arrays, and the @dfn{guru interface} supports the most general data layouts, multiplicities, and strides. This section describes the the basic interface, which we expect to satisfy the needs of most users. @menu * Complex DFTs:: * Planner Flags:: * Real-data DFTs:: * Real-data DFT Array Format:: * Real-to-Real Transforms:: * Real-to-Real Transform Kinds:: @end menu @c =========> @node Complex DFTs, Planner Flags, Basic Interface, Basic Interface @subsection Complex DFTs @example fftw_plan fftw_plan_dft_1d(int n0, fftw_complex *in, fftw_complex *out, int sign, unsigned flags); fftw_plan fftw_plan_dft_2d(int n0, int n1, fftw_complex *in, fftw_complex *out, int sign, unsigned flags); fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2, fftw_complex *in, fftw_complex *out, int sign, unsigned flags); fftw_plan fftw_plan_dft(int rank, const int *n, fftw_complex *in, fftw_complex *out, int sign, unsigned flags); @end example @findex fftw_plan_dft_1d @findex fftw_plan_dft_2d @findex fftw_plan_dft_3d @findex fftw_plan_dft Plan a complex input/output discrete Fourier transform (DFT) in zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}). Once you have created a plan for a certain transform type and parameters, then creating another plan of the same type and parameters, but for different arrays, is fast and shares constant data with the first plan (if it still exists). The planner returns @code{NULL} if the plan cannot be created. In the standard FFTW distribution, the basic interface is guaranteed to return a non-@code{NULL} plan. A plan may be @code{NULL}, however, if you are using a customized FFTW configuration supporting a restricted set of transforms. @subsubheading Arguments @itemize @bullet @item @code{rank} is the rank of the transform (it should be the size of the array @code{*n}), and can be any non-negative integer. (@xref{Complex Multi-Dimensional DFTs}, for the definition of ``rank''.) The @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a copy of one number from input to output. @item @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]} (as appropriate for each routine) specify the size of the transform dimensions. They can be any positive integer. @itemize @minus @item @cindex row-major Multi-dimensional arrays are stored in row-major order with dimensions: @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}. @xref{Multi-dimensional Array Format}. @item FFTW is best at handling sizes of the form @ifinfo @math{2^a 3^b 5^c 7^d 11^e 13^f}, @end ifinfo @tex $2^a 3^b 5^c 7^d 11^e 13^f$, @end tex @html 2a 3b 5c 7d 11e 13f, @end html where @math{e+f} is either @math{0} or @math{1}, and the other exponents are arbitrary. Other sizes are computed by means of a slow, general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). It is possible to customize FFTW for different array sizes; see @ref{Installation and Customization}. Transforms whose sizes are powers of @math{2} are especially fast. @end itemize @item @code{in} and @code{out} point to the input and output arrays of the transform, which may be the same (yielding an in-place transform). @cindex in-place These arrays are overwritten during planning, unless @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be initialized, but they must be allocated.) If @code{in == out}, the transform is @dfn{in-place} and the input array is overwritten. If @code{in != out}, the two arrays must not overlap (but FFTW does not check for this condition). @item @ctindex FFTW_FORWARD @ctindex FFTW_BACKWARD @code{sign} is the sign of the exponent in the formula that defines the Fourier transform. It can be @math{-1} (= @code{FFTW_FORWARD}) or @math{+1} (= @code{FFTW_BACKWARD}). @item @cindex flags @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, as defined in @ref{Planner Flags}. @end itemize FFTW computes an unnormalized transform: computing a forward followed by a backward transform (or vice versa) will result in the original data multiplied by the size of the transform (the product of the dimensions). @cindex normalization For more information, see @ref{What FFTW Really Computes}. @c =========> @node Planner Flags, Real-data DFTs, Complex DFTs, Basic Interface @subsection Planner Flags All of the planner routines in FFTW accept an integer @code{flags} argument, which is a bitwise OR (@samp{|}) of zero or more of the flag constants defined below. These flags control the rigor (and time) of the planning process, and can also impose (or lift) restrictions on the type of transform algorithm that is employed. @emph{Important:} the planner overwrites the input array during planning unless a saved plan (@pxref{Wisdom}) is available for that problem, so you should initialize your input data after creating the plan. The only exceptions to this are the @code{FFTW_ESTIMATE} and @code{FFTW_WISDOM_ONLY} flags, as mentioned below. In all cases, if wisdom is available for the given problem that was created with equal-or-greater planning rigor, then the more rigorous wisdom is used. For example, in @code{FFTW_ESTIMATE} mode any available wisdom is used, whereas in @code{FFTW_PATIENT} mode only wisdom created in patient or exhaustive mode can be used. @xref{Words of Wisdom-Saving Plans}. @subsubheading Planning-rigor flags @itemize @bullet @item @ctindex FFTW_ESTIMATE @code{FFTW_ESTIMATE} specifies that, instead of actual measurements of different algorithms, a simple heuristic is used to pick a (probably sub-optimal) plan quickly. With this flag, the input/output arrays are not overwritten during planning. @item @ctindex FFTW_MEASURE @code{FFTW_MEASURE} tells FFTW to find an optimized plan by actually @emph{computing} several FFTs and measuring their execution time. Depending on your machine, this can take some time (often a few seconds). @code{FFTW_MEASURE} is the default planning option. @item @ctindex FFTW_PATIENT @code{FFTW_PATIENT} is like @code{FFTW_MEASURE}, but considers a wider range of algorithms and often produces a ``more optimal'' plan (especially for large transforms), but at the expense of several times longer planning time (especially for large transforms). @item @ctindex FFTW_EXHAUSTIVE @code{FFTW_EXHAUSTIVE} is like @code{FFTW_PATIENT}, but considers an even wider range of algorithms, including many that we think are unlikely to be fast, to produce the most optimal plan but with a substantially increased planning time. @item @ctindex FFTW_WISDOM_ONLY @code{FFTW_WISDOM_ONLY} is a special planning mode in which the plan is only created if wisdom is available for the given problem, and otherwise a @code{NULL} plan is returned. This can be combined with other flags, e.g. @samp{FFTW_WISDOM_ONLY | FFTW_PATIENT} creates a plan only if wisdom is available that was created in @code{FFTW_PATIENT} or @code{FFTW_EXHAUSTIVE} mode. The @code{FFTW_WISDOM_ONLY} flag is intended for users who need to detect whether wisdom is available; for example, if wisdom is not available one may wish to allocate new arrays for planning so that user data is not overwritten. @end itemize @subsubheading Algorithm-restriction flags @itemize @bullet @item @ctindex FFTW_DESTROY_INPUT @code{FFTW_DESTROY_INPUT} specifies that an out-of-place transform is allowed to @emph{overwrite its input} array with arbitrary data; this can sometimes allow more efficient algorithms to be employed. @cindex out-of-place @item @ctindex FFTW_PRESERVE_INPUT @code{FFTW_PRESERVE_INPUT} specifies that an out-of-place transform must @emph{not change its input} array. This is ordinarily the @emph{default}, except for c2r and hc2r (i.e. complex-to-real) transforms for which @code{FFTW_DESTROY_INPUT} is the default. In the latter cases, passing @code{FFTW_PRESERVE_INPUT} will attempt to use algorithms that do not destroy the input, at the expense of worse performance; for multi-dimensional c2r transforms, however, no input-preserving algorithms are implemented and the planner will return @code{NULL} if one is requested. @cindex c2r @cindex hc2r @item @ctindex FFTW_UNALIGNED @cindex alignment @findex fftw_malloc @findex fftw_alignment_of @code{FFTW_UNALIGNED} specifies that the algorithm may not impose any unusual alignment requirements on the input/output arrays (i.e. no SIMD may be used). This flag is normally @emph{not necessary}, since the planner automatically detects misaligned arrays. The only use for this flag is if you want to use the new-array execute interface to execute a given plan on a different array that may not be aligned like the original. (Using @code{fftw_malloc} makes this flag unnecessary even then. You can also use @code{fftw_alignment_of} to detect whether two arrays are equivalently aligned.) @end itemize @subsubheading Limiting planning time @example extern void fftw_set_timelimit(double seconds); @end example @findex fftw_set_timelimit This function instructs FFTW to spend at most @code{seconds} seconds (approximately) in the planner. If @code{seconds == FFTW_NO_TIMELIMIT} (the default value, which is negative), then planning time is unbounded. Otherwise, FFTW plans with a progressively wider range of algorithms until the the given time limit is reached or the given range of algorithms is explored, returning the best available plan. @ctindex FFTW_NO_TIMELIMIT For example, specifying @code{FFTW_PATIENT} first plans in @code{FFTW_ESTIMATE} mode, then in @code{FFTW_MEASURE} mode, then finally (time permitting) in @code{FFTW_PATIENT}. If @code{FFTW_EXHAUSTIVE} is specified instead, the planner will further progress to @code{FFTW_EXHAUSTIVE} mode. Note that the @code{seconds} argument specifies only a rough limit; in practice, the planner may use somewhat more time if the time limit is reached when the planner is in the middle of an operation that cannot be interrupted. At the very least, the planner will complete planning in @code{FFTW_ESTIMATE} mode (which is thus equivalent to a time limit of 0). @c =========> @node Real-data DFTs, Real-data DFT Array Format, Planner Flags, Basic Interface @subsection Real-data DFTs @example fftw_plan fftw_plan_dft_r2c_1d(int n0, double *in, fftw_complex *out, unsigned flags); fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1, double *in, fftw_complex *out, unsigned flags); fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2, double *in, fftw_complex *out, unsigned flags); fftw_plan fftw_plan_dft_r2c(int rank, const int *n, double *in, fftw_complex *out, unsigned flags); @end example @findex fftw_plan_dft_r2c_1d @findex fftw_plan_dft_r2c_2d @findex fftw_plan_dft_r2c_3d @findex fftw_plan_dft_r2c @cindex r2c Plan a real-input/complex-output discrete Fourier transform (DFT) in zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}). Once you have created a plan for a certain transform type and parameters, then creating another plan of the same type and parameters, but for different arrays, is fast and shares constant data with the first plan (if it still exists). The planner returns @code{NULL} if the plan cannot be created. A non-@code{NULL} plan is always returned by the basic interface unless you are using a customized FFTW configuration supporting a restricted set of transforms, or if you use the @code{FFTW_PRESERVE_INPUT} flag with a multi-dimensional out-of-place c2r transform (see below). @subsubheading Arguments @itemize @bullet @item @code{rank} is the rank of the transform (it should be the size of the array @code{*n}), and can be any non-negative integer. (@xref{Complex Multi-Dimensional DFTs}, for the definition of ``rank''.) The @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. The rank may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a copy of one real number (with zero imaginary part) from input to output. @item @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]}, (as appropriate for each routine) specify the size of the transform dimensions. They can be any positive integer. This is different in general from the @emph{physical} array dimensions, which are described in @ref{Real-data DFT Array Format}. @itemize @minus @item FFTW is best at handling sizes of the form @ifinfo @math{2^a 3^b 5^c 7^d 11^e 13^f}, @end ifinfo @tex $2^a 3^b 5^c 7^d 11^e 13^f$, @end tex @html 2a 3b 5c 7d 11e 13f, @end html where @math{e+f} is either @math{0} or @math{1}, and the other exponents are arbitrary. Other sizes are computed by means of a slow, general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW for different array sizes; see @ref{Installation and Customization}.) Transforms whose sizes are powers of @math{2} are especially fast, and it is generally beneficial for the @emph{last} dimension of an r2c/c2r transform to be @emph{even}. @end itemize @item @code{in} and @code{out} point to the input and output arrays of the transform, which may be the same (yielding an in-place transform). @cindex in-place These arrays are overwritten during planning, unless @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be initialized, but they must be allocated.) For an in-place transform, it is important to remember that the real array will require padding, described in @ref{Real-data DFT Array Format}. @cindex padding @item @cindex flags @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, as defined in @ref{Planner Flags}. @end itemize The inverse transforms, taking complex input (storing the non-redundant half of a logically Hermitian array) to real output, are given by: @example fftw_plan fftw_plan_dft_c2r_1d(int n0, fftw_complex *in, double *out, unsigned flags); fftw_plan fftw_plan_dft_c2r_2d(int n0, int n1, fftw_complex *in, double *out, unsigned flags); fftw_plan fftw_plan_dft_c2r_3d(int n0, int n1, int n2, fftw_complex *in, double *out, unsigned flags); fftw_plan fftw_plan_dft_c2r(int rank, const int *n, fftw_complex *in, double *out, unsigned flags); @end example @findex fftw_plan_dft_c2r_1d @findex fftw_plan_dft_c2r_2d @findex fftw_plan_dft_c2r_3d @findex fftw_plan_dft_c2r @cindex c2r The arguments are the same as for the r2c transforms, except that the input and output data formats are reversed. FFTW computes an unnormalized transform: computing an r2c followed by a c2r transform (or vice versa) will result in the original data multiplied by the size of the transform (the product of the logical dimensions). @cindex normalization An r2c transform produces the same output as a @code{FFTW_FORWARD} complex DFT of the same input, and a c2r transform is correspondingly equivalent to @code{FFTW_BACKWARD}. For more information, see @ref{What FFTW Really Computes}. @c =========> @node Real-data DFT Array Format, Real-to-Real Transforms, Real-data DFTs, Basic Interface @subsection Real-data DFT Array Format @cindex r2c/c2r multi-dimensional array format The output of a DFT of real data (r2c) contains symmetries that, in principle, make half of the outputs redundant (@pxref{What FFTW Really Computes}). (Similarly for the input of an inverse c2r transform.) In practice, it is not possible to entirely realize these savings in an efficient and understandable format that generalizes to multi-dimensional transforms. Instead, the output of the r2c transforms is @emph{slightly} over half of the output of the corresponding complex transform. We do not ``pack'' the data in any way, but store it as an ordinary array of @code{fftw_complex} values. In fact, this data is simply a subsection of what would be the array in the corresponding complex transform. Specifically, for a real transform of @math{d} (= @code{rank}) dimensions @ndims{}, the complex data is an @ndimshalf array of @code{fftw_complex} values in row-major order (with the division rounded down). That is, we only store the @emph{lower} half (non-negative frequencies), plus one element, of the last dimension of the data from the ordinary complex transform. (We could have instead taken half of any other dimension, but implementation turns out to be simpler if the last, contiguous, dimension is used.) @cindex out-of-place For an out-of-place transform, the real data is simply an array with physical dimensions @ndims in row-major order. @cindex in-place @cindex padding For an in-place transform, some complications arise since the complex data is slightly larger than the real data. In this case, the final dimension of the real data must be @emph{padded} with extra values to accommodate the size of the complex data---two extra if the last dimension is even and one if it is odd. That is, the last dimension of the real data must physically contain @tex $2 (n_{d-1}/2+1)$ @end tex @ifinfo 2 * (n[d-1]/2+1) @end ifinfo @html 2 * (nd-1/2+1) @end html @code{double} values (exactly enough to hold the complex data). This physical array size does not, however, change the @emph{logical} array size---only @tex $n_{d-1}$ @end tex @ifinfo n[d-1] @end ifinfo @html nd-1 @end html values are actually stored in the last dimension, and @tex $n_{d-1}$ @end tex @ifinfo n[d-1] @end ifinfo @html nd-1 @end html is the last dimension passed to the planner. @c =========> @node Real-to-Real Transforms, Real-to-Real Transform Kinds, Real-data DFT Array Format, Basic Interface @subsection Real-to-Real Transforms @cindex r2r @example fftw_plan fftw_plan_r2r_1d(int n, double *in, double *out, fftw_r2r_kind kind, unsigned flags); fftw_plan fftw_plan_r2r_2d(int n0, int n1, double *in, double *out, fftw_r2r_kind kind0, fftw_r2r_kind kind1, unsigned flags); fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2, double *in, double *out, fftw_r2r_kind kind0, fftw_r2r_kind kind1, fftw_r2r_kind kind2, unsigned flags); fftw_plan fftw_plan_r2r(int rank, const int *n, double *in, double *out, const fftw_r2r_kind *kind, unsigned flags); @end example @findex fftw_plan_r2r_1d @findex fftw_plan_r2r_2d @findex fftw_plan_r2r_3d @findex fftw_plan_r2r Plan a real input/output (r2r) transform of various kinds in zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}). Once you have created a plan for a certain transform type and parameters, then creating another plan of the same type and parameters, but for different arrays, is fast and shares constant data with the first plan (if it still exists). The planner returns @code{NULL} if the plan cannot be created. A non-@code{NULL} plan is always returned by the basic interface unless you are using a customized FFTW configuration supporting a restricted set of transforms, or for size-1 @code{FFTW_REDFT00} kinds (which are not defined). @ctindex FFTW_REDFT00 @subsubheading Arguments @itemize @bullet @item @code{rank} is the dimensionality of the transform (it should be the size of the arrays @code{*n} and @code{*kind}), and can be any non-negative integer. The @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a @code{rank} of @code{1}, @code{2}, and @code{3}, respectively. A @code{rank} of zero is equivalent to a copy of one number from input to output. @item @code{n}, or @code{n0}/@code{n1}/@code{n2}, or @code{n[rank]}, respectively, gives the (physical) size of the transform dimensions. They can be any positive integer. @itemize @minus @item @cindex row-major Multi-dimensional arrays are stored in row-major order with dimensions: @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}. @xref{Multi-dimensional Array Format}. @item FFTW is generally best at handling sizes of the form @ifinfo @math{2^a 3^b 5^c 7^d 11^e 13^f}, @end ifinfo @tex $2^a 3^b 5^c 7^d 11^e 13^f$, @end tex @html 2a 3b 5c 7d 11e 13f, @end html where @math{e+f} is either @math{0} or @math{1}, and the other exponents are arbitrary. Other sizes are computed by means of a slow, general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes). (It is possible to customize FFTW for different array sizes; see @ref{Installation and Customization}.) Transforms whose sizes are powers of @math{2} are especially fast. @item For a @code{REDFT00} or @code{RODFT00} transform kind in a dimension of size @math{n}, it is @math{n-1} or @math{n+1}, respectively, that should be factorizable in the above form. @end itemize @item @code{in} and @code{out} point to the input and output arrays of the transform, which may be the same (yielding an in-place transform). @cindex in-place These arrays are overwritten during planning, unless @code{FFTW_ESTIMATE} is used in the flags. (The arrays need not be initialized, but they must be allocated.) @item @code{kind}, or @code{kind0}/@code{kind1}/@code{kind2}, or @code{kind[rank]}, is the kind of r2r transform used for the corresponding dimension. The valid kind constants are described in @ref{Real-to-Real Transform Kinds}. In a multi-dimensional transform, what is computed is the separable product formed by taking each transform kind along the corresponding dimension, one dimension after another. @item @cindex flags @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, as defined in @ref{Planner Flags}. @end itemize @c =========> @node Real-to-Real Transform Kinds, , Real-to-Real Transforms, Basic Interface @subsection Real-to-Real Transform Kinds @cindex kind (r2r) FFTW currently supports 11 different r2r transform kinds, specified by one of the constants below. For the precise definitions of these transforms, see @ref{What FFTW Really Computes}. For a more colloquial introduction to these transform kinds, see @ref{More DFTs of Real Data}. For dimension of size @code{n}, there is a corresponding ``logical'' dimension @code{N} that determines the normalization (and the optimal factorization); the formula for @code{N} is given for each kind below. Also, with each transform kind is listed its corrsponding inverse transform. FFTW computes unnormalized transforms: a transform followed by its inverse will result in the original data multiplied by @code{N} (or the product of the @code{N}'s for each dimension, in multi-dimensions). @cindex normalization @itemize @bullet @item @ctindex FFTW_R2HC @code{FFTW_R2HC} computes a real-input DFT with output in ``halfcomplex'' format, i.e. real and imaginary parts for a transform of size @code{n} stored as: @tex $$ r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1 $$ @end tex @ifinfo r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1 @end ifinfo @html

r0, r1, r2, ..., rn/2, i(n+1)/2-1, ..., i2, i1

@end html (Logical @code{N=n}, inverse is @code{FFTW_HC2R}.) @item @ctindex FFTW_HC2R @code{FFTW_HC2R} computes the reverse of @code{FFTW_R2HC}, above. (Logical @code{N=n}, inverse is @code{FFTW_R2HC}.) @item @ctindex FFTW_DHT @code{FFTW_DHT} computes a discrete Hartley transform. (Logical @code{N=n}, inverse is @code{FFTW_DHT}.) @cindex discrete Hartley transform @item @ctindex FFTW_REDFT00 @code{FFTW_REDFT00} computes an REDFT00 transform, i.e. a DCT-I. (Logical @code{N=2*(n-1)}, inverse is @code{FFTW_REDFT00}.) @cindex discrete cosine transform @cindex DCT @item @ctindex FFTW_REDFT10 @code{FFTW_REDFT10} computes an REDFT10 transform, i.e. a DCT-II (sometimes called ``the'' DCT). (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT01}.) @item @ctindex FFTW_REDFT01 @code{FFTW_REDFT01} computes an REDFT01 transform, i.e. a DCT-III (sometimes called ``the'' IDCT, being the inverse of DCT-II). (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT=10}.) @cindex IDCT @item @ctindex FFTW_REDFT11 @code{FFTW_REDFT11} computes an REDFT11 transform, i.e. a DCT-IV. (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT11}.) @item @ctindex FFTW_RODFT00 @code{FFTW_RODFT00} computes an RODFT00 transform, i.e. a DST-I. (Logical @code{N=2*(n+1)}, inverse is @code{FFTW_RODFT00}.) @cindex discrete sine transform @cindex DST @item @ctindex FFTW_RODFT10 @code{FFTW_RODFT10} computes an RODFT10 transform, i.e. a DST-II. (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT01}.) @item @ctindex FFTW_RODFT01 @code{FFTW_RODFT01} computes an RODFT01 transform, i.e. a DST-III. (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT=10}.) @item @ctindex FFTW_RODFT11 @code{FFTW_RODFT11} computes an RODFT11 transform, i.e. a DST-IV. (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT11}.) @end itemize @c ------------------------------------------------------------ @node Advanced Interface, Guru Interface, Basic Interface, FFTW Reference @section Advanced Interface @cindex advanced interface FFTW's ``advanced'' interface supplements the basic interface with four new planner routines, providing a new level of flexibility: you can plan a transform of multiple arrays simultaneously, operate on non-contiguous (strided) data, and transform a subset of a larger multi-dimensional array. Other than these additional features, the planner operates in the same fashion as in the basic interface, and the resulting @code{fftw_plan} is used in the same way (@pxref{Using Plans}). @menu * Advanced Complex DFTs:: * Advanced Real-data DFTs:: * Advanced Real-to-real Transforms:: @end menu @c =========> @node Advanced Complex DFTs, Advanced Real-data DFTs, Advanced Interface, Advanced Interface @subsection Advanced Complex DFTs @example fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany, fftw_complex *in, const int *inembed, int istride, int idist, fftw_complex *out, const int *onembed, int ostride, int odist, int sign, unsigned flags); @end example @findex fftw_plan_many_dft This routine plans multiple multidimensional complex DFTs, and it extends the @code{fftw_plan_dft} routine (@pxref{Complex DFTs}) to compute @code{howmany} transforms, each having rank @code{rank} and size @code{n}. In addition, the transform data need not be contiguous, but it may be laid out in memory with an arbitrary stride. To account for these possibilities, @code{fftw_plan_many_dft} adds the new parameters @code{howmany}, @{@code{i},@code{o}@}@code{nembed}, @{@code{i},@code{o}@}@code{stride}, and @{@code{i},@code{o}@}@code{dist}. The FFTW basic interface (@pxref{Complex DFTs}) provides routines specialized for ranks 1, 2, and@tie{}3, but the advanced interface handles only the general-rank case. @code{howmany} is the (nonnegative) number of transforms to compute. The resulting plan computes @code{howmany} transforms, where the input of the @code{k}-th transform is at location @code{in+k*idist} (in C pointer arithmetic), and its output is at location @code{out+k*odist}. Plans obtained in this way can often be faster than calling FFTW multiple times for the individual transforms. The basic @code{fftw_plan_dft} interface corresponds to @code{howmany=1} (in which case the @code{dist} parameters are ignored). @cindex howmany parameter @cindex dist Each of the @code{howmany} transforms has rank @code{rank} and size @code{n}, as in the basic interface. In addition, the advanced interface allows the input and output arrays of each transform to be row-major subarrays of larger rank-@code{rank} arrays, described by @code{inembed} and @code{onembed} parameters, respectively. @{@code{i},@code{o}@}@code{nembed} must be arrays of length @code{rank}, and @code{n} should be elementwise less than or equal to @{@code{i},@code{o}@}@code{nembed}. Passing @code{NULL} for an @code{nembed} parameter is equivalent to passing @code{n} (i.e. same physical and logical dimensions, as in the basic interface.) The @code{stride} parameters indicate that the @code{j}-th element of the input or output arrays is located at @code{j*istride} or @code{j*ostride}, respectively. (For a multi-dimensional array, @code{j} is the ordinary row-major index.) When combined with the @code{k}-th transform in a @code{howmany} loop, from above, this means that the (@code{j},@code{k})-th element is at @code{j*stride+k*dist}. (The basic @code{fftw_plan_dft} interface corresponds to a stride of 1.) @cindex stride For in-place transforms, the input and output @code{stride} and @code{dist} parameters should be the same; otherwise, the planner may return @code{NULL}. Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after this function returns. You can safely free or reuse them. @strong{Examples}: One transform of one 5 by 6 array contiguous in memory: @example int rank = 2; int n[] = @{5, 6@}; int howmany = 1; int idist = odist = 0; /* unused because howmany = 1 */ int istride = ostride = 1; /* array is contiguous in memory */ int *inembed = n, *onembed = n; @end example Transform of three 5 by 6 arrays, each contiguous in memory, stored in memory one after another: @example int rank = 2; int n[] = @{5, 6@}; int howmany = 3; int idist = odist = n[0]*n[1]; /* = 30, the distance in memory between the first element of the first array and the first element of the second array */ int istride = ostride = 1; /* array is contiguous in memory */ int *inembed = n, *onembed = n; @end example Transform each column of a 2d array with 10 rows and 3 columns: @example int rank = 1; /* not 2: we are computing 1d transforms */ int n[] = @{10@}; /* 1d transforms of length 10 */ int howmany = 3; int idist = odist = 1; int istride = ostride = 3; /* distance between two elements in the same column */ int *inembed = n, *onembed = n; @end example @c =========> @node Advanced Real-data DFTs, Advanced Real-to-real Transforms, Advanced Complex DFTs, Advanced Interface @subsection Advanced Real-data DFTs @example fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany, double *in, const int *inembed, int istride, int idist, fftw_complex *out, const int *onembed, int ostride, int odist, unsigned flags); fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany, fftw_complex *in, const int *inembed, int istride, int idist, double *out, const int *onembed, int ostride, int odist, unsigned flags); @end example @findex fftw_plan_many_dft_r2c @findex fftw_plan_many_dft_c2r Like @code{fftw_plan_many_dft}, these two functions add @code{howmany}, @code{nembed}, @code{stride}, and @code{dist} parameters to the @code{fftw_plan_dft_r2c} and @code{fftw_plan_dft_c2r} functions, but otherwise behave the same as the basic interface. The interpretation of @code{howmany}, @code{stride}, and @code{dist} are the same as for @code{fftw_plan_many_dft}, above. Note that the @code{stride} and @code{dist} for the real array are in units of @code{double}, and for the complex array are in units of @code{fftw_complex}. If an @code{nembed} parameter is @code{NULL}, it is interpreted as what it would be in the basic interface, as described in @ref{Real-data DFT Array Format}. That is, for the complex array the size is assumed to be the same as @code{n}, but with the last dimension cut roughly in half. For the real array, the size is assumed to be @code{n} if the transform is out-of-place, or @code{n} with the last dimension ``padded'' if the transform is in-place. If an @code{nembed} parameter is non-@code{NULL}, it is interpreted as the physical size of the corresponding array, in row-major order, just as for @code{fftw_plan_many_dft}. In this case, each dimension of @code{nembed} should be @code{>=} what it would be in the basic interface (e.g. the halved or padded @code{n}). Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after this function returns. You can safely free or reuse them. @c =========> @node Advanced Real-to-real Transforms, , Advanced Real-data DFTs, Advanced Interface @subsection Advanced Real-to-real Transforms @example fftw_plan fftw_plan_many_r2r(int rank, const int *n, int howmany, double *in, const int *inembed, int istride, int idist, double *out, const int *onembed, int ostride, int odist, const fftw_r2r_kind *kind, unsigned flags); @end example @findex fftw_plan_many_r2r Like @code{fftw_plan_many_dft}, this functions adds @code{howmany}, @code{nembed}, @code{stride}, and @code{dist} parameters to the @code{fftw_plan_r2r} function, but otherwise behave the same as the basic interface. The interpretation of those additional parameters are the same as for @code{fftw_plan_many_dft}. (Of course, the @code{stride} and @code{dist} parameters are now in units of @code{double}, not @code{fftw_complex}.) Arrays @code{n}, @code{inembed}, @code{onembed}, and @code{kind} are not used after this function returns. You can safely free or reuse them. @c ------------------------------------------------------------ @node Guru Interface, New-array Execute Functions, Advanced Interface, FFTW Reference @section Guru Interface @cindex guru interface The ``guru'' interface to FFTW is intended to expose as much as possible of the flexibility in the underlying FFTW architecture. It allows one to compute multi-dimensional ``vectors'' (loops) of multi-dimensional transforms, where each vector/transform dimension has an independent size and stride. @cindex vector One can also use more general complex-number formats, e.g. separate real and imaginary arrays. For those users who require the flexibility of the guru interface, it is important that they pay special attention to the documentation lest they shoot themselves in the foot. @menu * Interleaved and split arrays:: * Guru vector and transform sizes:: * Guru Complex DFTs:: * Guru Real-data DFTs:: * Guru Real-to-real Transforms:: * 64-bit Guru Interface:: @end menu @c =========> @node Interleaved and split arrays, Guru vector and transform sizes, Guru Interface, Guru Interface @subsection Interleaved and split arrays The guru interface supports two representations of complex numbers, which we call the interleaved and the split format. The @dfn{interleaved} format is the same one used by the basic and advanced interfaces, and it is documented in @ref{Complex numbers}. In the interleaved format, you provide pointers to the real part of a complex number, and the imaginary part understood to be stored in the next memory location. @cindex interleaved format The @dfn{split} format allows separate pointers to the real and imaginary parts of a complex array. @cindex split format Technically, the interleaved format is redundant, because you can always express an interleaved array in terms of a split array with appropriate pointers and strides. On the other hand, the interleaved format is simpler to use, and it is common in practice. Hence, FFTW supports it as a special case. @c =========> @node Guru vector and transform sizes, Guru Complex DFTs, Interleaved and split arrays, Guru Interface @subsection Guru vector and transform sizes The guru interface introduces one basic new data structure, @code{fftw_iodim}, that is used to specify sizes and strides for multi-dimensional transforms and vectors: @example typedef struct @{ int n; int is; int os; @} fftw_iodim; @end example @tindex fftw_iodim Here, @code{n} is the size of the dimension, and @code{is} and @code{os} are the strides of that dimension for the input and output arrays. (The stride is the separation of consecutive elements along this dimension.) The meaning of the stride parameter depends on the type of the array that the stride refers to. @emph{If the array is interleaved complex, strides are expressed in units of complex numbers (@code{fftw_complex}). If the array is split complex or real, strides are expressed in units of real numbers (@code{double}).} This convention is consistent with the usual pointer arithmetic in the C language. An interleaved array is denoted by a pointer @code{p} to @code{fftw_complex}, so that @code{p+1} points to the next complex number. Split arrays are denoted by pointers to @code{double}, in which case pointer arithmetic operates in units of @code{sizeof(double)}. @cindex stride The guru planner interfaces all take a (@code{rank}, @code{dims[rank]}) pair describing the transform size, and a (@code{howmany_rank}, @code{howmany_dims[howmany_rank]}) pair describing the ``vector'' size (a multi-dimensional loop of transforms to perform), where @code{dims} and @code{howmany_dims} are arrays of @code{fftw_iodim}. Each @code{n} field must be positive for @code{dims} and nonnegative for @code{howmany_dims}, while both @code{rank} and @code{howmany_rank} must be nonnegative. For example, the @code{howmany} parameter in the advanced complex-DFT interface corresponds to @code{howmany_rank} = 1, @code{howmany_dims[0].n} = @code{howmany}, @code{howmany_dims[0].is} = @code{idist}, and @code{howmany_dims[0].os} = @code{odist}. @cindex howmany loop @cindex dist (To compute a single transform, you can just use @code{howmany_rank} = 0.) A row-major multidimensional array with dimensions @code{n[rank]} (@pxref{Row-major Format}) corresponds to @code{dims[i].n} = @code{n[i]} and the recurrence @code{dims[i].is} = @code{n[i+1] * dims[i+1].is} (similarly for @code{os}). The stride of the last (@code{i=rank-1}) dimension is the overall stride of the array. e.g. to be equivalent to the advanced complex-DFT interface, you would have @code{dims[rank-1].is} = @code{istride} and @code{dims[rank-1].os} = @code{ostride}. @cindex row-major In general, we only guarantee FFTW to return a non-@code{NULL} plan if the vector and transform dimensions correspond to a set of distinct indices, and for in-place transforms the input/output strides should be the same. @c =========> @node Guru Complex DFTs, Guru Real-data DFTs, Guru vector and transform sizes, Guru Interface @subsection Guru Complex DFTs @example fftw_plan fftw_plan_guru_dft( int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, fftw_complex *in, fftw_complex *out, int sign, unsigned flags); fftw_plan fftw_plan_guru_split_dft( int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, double *ri, double *ii, double *ro, double *io, unsigned flags); @end example @findex fftw_plan_guru_dft @findex fftw_plan_guru_split_dft These two functions plan a complex-data, multi-dimensional DFT for the interleaved and split format, respectively. Transform dimensions are given by (@code{rank}, @code{dims}) over a multi-dimensional vector (loop) of dimensions (@code{howmany_rank}, @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point to @code{fftw_iodim} arrays of length @code{rank} and @code{howmany_rank}, respectively. @cindex flags @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, as defined in @ref{Planner Flags}. In the @code{fftw_plan_guru_dft} function, the pointers @code{in} and @code{out} point to the interleaved input and output arrays, respectively. The sign can be either @math{-1} (= @code{FFTW_FORWARD}) or @math{+1} (= @code{FFTW_BACKWARD}). If the pointers are equal, the transform is in-place. In the @code{fftw_plan_guru_split_dft} function, @code{ri} and @code{ii} point to the real and imaginary input arrays, and @code{ro} and @code{io} point to the real and imaginary output arrays. The input and output pointers may be the same, indicating an in-place transform. For example, for @code{fftw_complex} pointers @code{in} and @code{out}, the corresponding parameters are: @example ri = (double *) in; ii = (double *) in + 1; ro = (double *) out; io = (double *) out + 1; @end example Because @code{fftw_plan_guru_split_dft} accepts split arrays, strides are expressed in units of @code{double}. For a contiguous @code{fftw_complex} array, the overall stride of the transform should be 2, the distance between consecutive real parts or between consecutive imaginary parts; see @ref{Guru vector and transform sizes}. Note that the dimension strides are applied equally to the real and imaginary parts; real and imaginary arrays with different strides are not supported. There is no @code{sign} parameter in @code{fftw_plan_guru_split_dft}. This function always plans for an @code{FFTW_FORWARD} transform. To plan for an @code{FFTW_BACKWARD} transform, you can exploit the identity that the backwards DFT is equal to the forwards DFT with the real and imaginary parts swapped. For example, in the case of the @code{fftw_complex} arrays above, the @code{FFTW_BACKWARD} transform is computed by the parameters: @example ri = (double *) in + 1; ii = (double *) in; ro = (double *) out + 1; io = (double *) out; @end example @c =========> @node Guru Real-data DFTs, Guru Real-to-real Transforms, Guru Complex DFTs, Guru Interface @subsection Guru Real-data DFTs @example fftw_plan fftw_plan_guru_dft_r2c( int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, double *in, fftw_complex *out, unsigned flags); fftw_plan fftw_plan_guru_split_dft_r2c( int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, double *in, double *ro, double *io, unsigned flags); fftw_plan fftw_plan_guru_dft_c2r( int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, fftw_complex *in, double *out, unsigned flags); fftw_plan fftw_plan_guru_split_dft_c2r( int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, double *ri, double *ii, double *out, unsigned flags); @end example @findex fftw_plan_guru_dft_r2c @findex fftw_plan_guru_split_dft_r2c @findex fftw_plan_guru_dft_c2r @findex fftw_plan_guru_split_dft_c2r Plan a real-input (r2c) or real-output (c2r), multi-dimensional DFT with transform dimensions given by (@code{rank}, @code{dims}) over a multi-dimensional vector (loop) of dimensions (@code{howmany_rank}, @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point to @code{fftw_iodim} arrays of length @code{rank} and @code{howmany_rank}, respectively. As for the basic and advanced interfaces, an r2c transform is @code{FFTW_FORWARD} and a c2r transform is @code{FFTW_BACKWARD}. The @emph{last} dimension of @code{dims} is interpreted specially: that dimension of the real array has size @code{dims[rank-1].n}, but that dimension of the complex array has size @code{dims[rank-1].n/2+1} (division rounded down). The strides, on the other hand, are taken to be exactly as specified. It is up to the user to specify the strides appropriately for the peculiar dimensions of the data, and we do not guarantee that the planner will succeed (return non-@code{NULL}) for any dimensions other than those described in @ref{Real-data DFT Array Format} and generalized in @ref{Advanced Real-data DFTs}. (That is, for an in-place transform, each individual dimension should be able to operate in place.) @cindex in-place @code{in} and @code{out} point to the input and output arrays for r2c and c2r transforms, respectively. For split arrays, @code{ri} and @code{ii} point to the real and imaginary input arrays for a c2r transform, and @code{ro} and @code{io} point to the real and imaginary output arrays for an r2c transform. @code{in} and @code{ro} or @code{ri} and @code{out} may be the same, indicating an in-place transform. (In-place transforms where @code{in} and @code{io} or @code{ii} and @code{out} are the same are not currently supported.) @cindex flags @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, as defined in @ref{Planner Flags}. In-place transforms of rank greater than 1 are currently only supported for interleaved arrays. For split arrays, the planner will return @code{NULL}. @cindex in-place @c =========> @node Guru Real-to-real Transforms, 64-bit Guru Interface, Guru Real-data DFTs, Guru Interface @subsection Guru Real-to-real Transforms @example fftw_plan fftw_plan_guru_r2r(int rank, const fftw_iodim *dims, int howmany_rank, const fftw_iodim *howmany_dims, double *in, double *out, const fftw_r2r_kind *kind, unsigned flags); @end example @findex fftw_plan_guru_r2r Plan a real-to-real (r2r) multi-dimensional @code{FFTW_FORWARD} transform with transform dimensions given by (@code{rank}, @code{dims}) over a multi-dimensional vector (loop) of dimensions (@code{howmany_rank}, @code{howmany_dims}). @code{dims} and @code{howmany_dims} should point to @code{fftw_iodim} arrays of length @code{rank} and @code{howmany_rank}, respectively. The transform kind of each dimension is given by the @code{kind} parameter, which should point to an array of length @code{rank}. Valid @code{fftw_r2r_kind} constants are given in @ref{Real-to-Real Transform Kinds}. @code{in} and @code{out} point to the real input and output arrays; they may be the same, indicating an in-place transform. @cindex flags @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, as defined in @ref{Planner Flags}. @c =========> @node 64-bit Guru Interface, , Guru Real-to-real Transforms, Guru Interface @subsection 64-bit Guru Interface @cindex 64-bit architecture When compiled in 64-bit mode on a 64-bit architecture (where addresses are 64 bits wide), FFTW uses 64-bit quantities internally for all transform sizes, strides, and so on---you don't have to do anything special to exploit this. However, in the ordinary FFTW interfaces, you specify the transform size by an @code{int} quantity, which is normally only 32 bits wide. This means that, even though FFTW is using 64-bit sizes internally, you cannot specify a single transform dimension larger than @ifinfo 2^31-1 @end ifinfo @html 231−1 @end html @tex $2^{31}-1$ @end tex numbers. We expect that few users will require transforms larger than this, but, for those who do, we provide a 64-bit version of the guru interface in which all sizes are specified as integers of type @code{ptrdiff_t} instead of @code{int}. (@code{ptrdiff_t} is a signed integer type defined by the C standard to be wide enough to represent address differences, and thus must be at least 64 bits wide on a 64-bit machine.) We stress that there is @emph{no performance advantage} to using this interface---the same internal FFTW code is employed regardless---and it is only necessary if you want to specify very large transform sizes. @tindex ptrdiff_t In particular, the 64-bit guru interface is a set of planner routines that are exactly the same as the guru planner routines, except that they are named with @samp{guru64} instead of @samp{guru} and they take arguments of type @code{fftw_iodim64} instead of @code{fftw_iodim}. For example, instead of @code{fftw_plan_guru_dft}, we have @code{fftw_plan_guru64_dft}. @example fftw_plan fftw_plan_guru64_dft( int rank, const fftw_iodim64 *dims, int howmany_rank, const fftw_iodim64 *howmany_dims, fftw_complex *in, fftw_complex *out, int sign, unsigned flags); @end example @findex fftw_plan_guru64_dft The @code{fftw_iodim64} type is similar to @code{fftw_iodim}, with the same interpretation, except that it uses type @code{ptrdiff_t} instead of type @code{int}. @example typedef struct @{ ptrdiff_t n; ptrdiff_t is; ptrdiff_t os; @} fftw_iodim64; @end example @tindex fftw_iodim64 Every other @samp{fftw_plan_guru} function also has a @samp{fftw_plan_guru64} equivalent, but we do not repeat their documentation here since they are identical to the 32-bit versions except as noted above. @c ----------------------------------------------------------- @node New-array Execute Functions, Wisdom, Guru Interface, FFTW Reference @section New-array Execute Functions @cindex execute @cindex new-array execution Normally, one executes a plan for the arrays with which the plan was created, by calling @code{fftw_execute(plan)} as described in @ref{Using Plans}. @findex fftw_execute However, it is possible for sophisticated users to apply a given plan to a @emph{different} array using the ``new-array execute'' functions detailed below, provided that the following conditions are met: @itemize @bullet @item The array size, strides, etcetera are the same (since those are set by the plan). @item The input and output arrays are the same (in-place) or different (out-of-place) if the plan was originally created to be in-place or out-of-place, respectively. @item For split arrays, the separations between the real and imaginary parts, @code{ii-ri} and @code{io-ro}, are the same as they were for the input and output arrays when the plan was created. (This condition is automatically satisfied for interleaved arrays.) @item The @dfn{alignment} of the new input/output arrays is the same as that of the input/output arrays when the plan was created, unless the plan was created with the @code{FFTW_UNALIGNED} flag. @ctindex FFTW_UNALIGNED Here, the alignment is a platform-dependent quantity (for example, it is the address modulo 16 if SSE SIMD instructions are used, but the address modulo 4 for non-SIMD single-precision FFTW on the same machine). In general, only arrays allocated with @code{fftw_malloc} are guaranteed to be equally aligned (@pxref{SIMD alignment and fftw_malloc}). @end itemize @cindex alignment The alignment issue is especially critical, because if you don't use @code{fftw_malloc} then you may have little control over the alignment of arrays in memory. For example, neither the C++ @code{new} function nor the Fortran @code{allocate} statement provide strong enough guarantees about data alignment. If you don't use @code{fftw_malloc}, therefore, you probably have to use @code{FFTW_UNALIGNED} (which disables most SIMD support). If possible, it is probably better for you to simply create multiple plans (creating a new plan is quick once one exists for a given size), or better yet re-use the same array for your transforms. @findex fftw_alignment_of For rare circumstances in which you cannot control the alignment of allocated memory, but wish to determine where a given array is aligned like the original array for which a plan was created, you can use the @code{fftw_alignment_of} function: @example int fftw_alignment_of(double *p); @end example Two arrays have equivalent alignment (for the purposes of applying a plan) if and only if @code{fftw_alignment_of} returns the same value for the corresponding pointers to their data (typecast to @code{double*} if necessary). If you are tempted to use the new-array execute interface because you want to transform a known bunch of arrays of the same size, you should probably go use the advanced interface instead (@pxref{Advanced Interface})). The new-array execute functions are: @example void fftw_execute_dft( const fftw_plan p, fftw_complex *in, fftw_complex *out); void fftw_execute_split_dft( const fftw_plan p, double *ri, double *ii, double *ro, double *io); void fftw_execute_dft_r2c( const fftw_plan p, double *in, fftw_complex *out); void fftw_execute_split_dft_r2c( const fftw_plan p, double *in, double *ro, double *io); void fftw_execute_dft_c2r( const fftw_plan p, fftw_complex *in, double *out); void fftw_execute_split_dft_c2r( const fftw_plan p, double *ri, double *ii, double *out); void fftw_execute_r2r( const fftw_plan p, double *in, double *out); @end example @findex fftw_execute_dft @findex fftw_execute_split_dft @findex fftw_execute_dft_r2c @findex fftw_execute_split_dft_r2c @findex fftw_execute_dft_c2r @findex fftw_execute_split_dft_c2r @findex fftw_execute_r2r These execute the @code{plan} to compute the corresponding transform on the input/output arrays specified by the subsequent arguments. The input/output array arguments have the same meanings as the ones passed to the guru planner routines in the preceding sections. The @code{plan} is not modified, and these routines can be called as many times as desired, or intermixed with calls to the ordinary @code{fftw_execute}. The @code{plan} @emph{must} have been created for the transform type corresponding to the execute function, e.g. it must be a complex-DFT plan for @code{fftw_execute_dft}. Any of the planner routines for that transform type, from the basic to the guru interface, could have been used to create the plan, however. @c ------------------------------------------------------------ @node Wisdom, What FFTW Really Computes, New-array Execute Functions, FFTW Reference @section Wisdom @cindex wisdom @cindex saving plans to disk This section documents the FFTW mechanism for saving and restoring plans from disk. This mechanism is called @dfn{wisdom}. @menu * Wisdom Export:: * Wisdom Import:: * Forgetting Wisdom:: * Wisdom Utilities:: @end menu @c =========> @node Wisdom Export, Wisdom Import, Wisdom, Wisdom @subsection Wisdom Export @example int fftw_export_wisdom_to_filename(const char *filename); void fftw_export_wisdom_to_file(FILE *output_file); char *fftw_export_wisdom_to_string(void); void fftw_export_wisdom(void (*write_char)(char c, void *), void *data); @end example @findex fftw_export_wisdom @findex fftw_export_wisdom_to_filename @findex fftw_export_wisdom_to_file @findex fftw_export_wisdom_to_string These functions allow you to export all currently accumulated wisdom in a form from which it can be later imported and restored, even during a separate run of the program. (@xref{Words of Wisdom-Saving Plans}.) The current store of wisdom is not affected by calling any of these routines. @code{fftw_export_wisdom} exports the wisdom to any output medium, as specified by the callback function @code{write_char}. @code{write_char} is a @code{putc}-like function that writes the character @code{c} to some output; its second parameter is the @code{data} pointer passed to @code{fftw_export_wisdom}. For convenience, the following three ``wrapper'' routines are provided: @code{fftw_export_wisdom_to_filename} writes wisdom to a file named @code{filename} (which is created or overwritten), returning @code{1} on success and @code{0} on failure. A lower-level function, which requires you to open and close the file yourself (e.g. if you want to write wisdom to a portion of a larger file) is @code{fftw_export_wisdom_to_file}. This writes the wisdom to the current position in @code{output_file}, which should be open with write permission; upon exit, the file remains open and is positioned at the end of the wisdom data. @code{fftw_export_wisdom_to_string} returns a pointer to a @code{NULL}-terminated string holding the wisdom data. This string is dynamically allocated, and it is the responsibility of the caller to deallocate it with @code{free} when it is no longer needed. All of these routines export the wisdom in the same format, which we will not document here except to say that it is LISP-like ASCII text that is insensitive to white space. @c =========> @node Wisdom Import, Forgetting Wisdom, Wisdom Export, Wisdom @subsection Wisdom Import @example int fftw_import_system_wisdom(void); int fftw_import_wisdom_from_filename(const char *filename); int fftw_import_wisdom_from_string(const char *input_string); int fftw_import_wisdom(int (*read_char)(void *), void *data); @end example @findex fftw_import_wisdom @findex fftw_import_system_wisdom @findex fftw_import_wisdom_from_filename @findex fftw_import_wisdom_from_file @findex fftw_import_wisdom_from_string These functions import wisdom into a program from data stored by the @code{fftw_export_wisdom} functions above. (@xref{Words of Wisdom-Saving Plans}.) The imported wisdom replaces any wisdom already accumulated by the running program. @code{fftw_import_wisdom} imports wisdom from any input medium, as specified by the callback function @code{read_char}. @code{read_char} is a @code{getc}-like function that returns the next character in the input; its parameter is the @code{data} pointer passed to @code{fftw_import_wisdom}. If the end of the input data is reached (which should never happen for valid data), @code{read_char} should return @code{EOF} (as defined in @code{}). For convenience, the following three ``wrapper'' routines are provided: @code{fftw_import_wisdom_from_filename} reads wisdom from a file named @code{filename}. A lower-level function, which requires you to open and close the file yourself (e.g. if you want to read wisdom from a portion of a larger file) is @code{fftw_import_wisdom_from_file}. This reads wisdom from the current position in @code{input_file} (which should be open with read permission); upon exit, the file remains open, but the position of the read pointer is unspecified. @code{fftw_import_wisdom_from_string} reads wisdom from the @code{NULL}-terminated string @code{input_string}. @code{fftw_import_system_wisdom} reads wisdom from an implementation-defined standard file (@code{/etc/fftw/wisdom} on Unix and GNU systems). @cindex wisdom, system-wide The return value of these import routines is @code{1} if the wisdom was read successfully and @code{0} otherwise. Note that, in all of these functions, any data in the input stream past the end of the wisdom data is simply ignored. @c =========> @node Forgetting Wisdom, Wisdom Utilities, Wisdom Import, Wisdom @subsection Forgetting Wisdom @example void fftw_forget_wisdom(void); @end example @findex fftw_forget_wisdom Calling @code{fftw_forget_wisdom} causes all accumulated @code{wisdom} to be discarded and its associated memory to be freed. (New @code{wisdom} can still be gathered subsequently, however.) @c =========> @node Wisdom Utilities, , Forgetting Wisdom, Wisdom @subsection Wisdom Utilities FFTW includes two standalone utility programs that deal with wisdom. We merely summarize them here, since they come with their own @code{man} pages for Unix and GNU systems (with HTML versions on our web site). The first program is @code{fftw-wisdom} (or @code{fftwf-wisdom} in single precision, etcetera), which can be used to create a wisdom file containing plans for any of the transform sizes and types supported by FFTW. It is preferable to create wisdom directly from your executable (@pxref{Caveats in Using Wisdom}), but this program is useful for creating global wisdom files for @code{fftw_import_system_wisdom}. @cindex fftw-wisdom utility The second program is @code{fftw-wisdom-to-conf}, which takes a wisdom file as input and produces a @dfn{configuration routine} as output. The latter is a C subroutine that you can compile and link into your program, replacing a routine of the same name in the FFTW library, that determines which parts of FFTW are callable by your program. @code{fftw-wisdom-to-conf} produces a configuration routine that links to only those parts of FFTW needed by the saved plans in the wisdom, greatly reducing the size of statically linked executables (which should only attempt to create plans corresponding to those in the wisdom, however). @cindex fftw-wisdom-to-conf utility @cindex configuration routines @c ------------------------------------------------------------ @node What FFTW Really Computes, , Wisdom, FFTW Reference @section What FFTW Really Computes In this section, we provide precise mathematical definitions for the transforms that FFTW computes. These transform definitions are fairly standard, but some authors follow slightly different conventions for the normalization of the transform (the constant factor in front) and the sign of the complex exponent. We begin by presenting the one-dimensional (1d) transform definitions, and then give the straightforward extension to multi-dimensional transforms. @menu * The 1d Discrete Fourier Transform (DFT):: * The 1d Real-data DFT:: * 1d Real-even DFTs (DCTs):: * 1d Real-odd DFTs (DSTs):: * 1d Discrete Hartley Transforms (DHTs):: * Multi-dimensional Transforms:: @end menu @c =========> @node The 1d Discrete Fourier Transform (DFT), The 1d Real-data DFT, What FFTW Really Computes, What FFTW Really Computes @subsection The 1d Discrete Fourier Transform (DFT) @cindex discrete Fourier transform @cindex DFT The forward (@code{FFTW_FORWARD}) discrete Fourier transform (DFT) of a 1d complex array @math{X} of size @math{n} computes an array @math{Y}, where: @tex $$ Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ . $$ @end tex @ifinfo @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) . @end ifinfo @html
.
@end html The backward (@code{FFTW_BACKWARD}) DFT computes: @tex $$ Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ . $$ @end tex @ifinfo @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) . @end ifinfo @html
.
@end html @cindex normalization FFTW computes an unnormalized transform, in that there is no coefficient in front of the summation in the DFT. In other words, applying the forward and then the backward transform will multiply the input by @math{n}. @cindex frequency From above, an @code{FFTW_FORWARD} transform corresponds to a sign of @math{-1} in the exponent of the DFT. Note also that we use the standard ``in-order'' output ordering---the @math{k}-th output corresponds to the frequency @math{k/n} (or @math{k/T}, where @math{T} is your total sampling period). For those who like to think in terms of positive and negative frequencies, this means that the positive frequencies are stored in the first half of the output and the negative frequencies are stored in backwards order in the second half of the output. (The frequency @math{-k/n} is the same as the frequency @math{(n-k)/n}.) @c =========> @node The 1d Real-data DFT, 1d Real-even DFTs (DCTs), The 1d Discrete Fourier Transform (DFT), What FFTW Really Computes @subsection The 1d Real-data DFT The real-input (r2c) DFT in FFTW computes the @emph{forward} transform @math{Y} of the size @code{n} real array @math{X}, exactly as defined above, i.e. @tex $$ Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ . $$ @end tex @ifinfo @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) . @end ifinfo @html
.
@end html This output array @math{Y} can easily be shown to possess the ``Hermitian'' symmetry @cindex Hermitian @tex $Y_k = Y_{n-k}^*$, @end tex @ifinfo Y[k] = Y[n-k]*, @end ifinfo @html Yk = Yn-k*, @end html where we take @math{Y} to be periodic so that @tex $Y_n = Y_0$. @end tex @ifinfo Y[n] = Y[0]. @end ifinfo @html Yn = Y0. @end html As a result of this symmetry, half of the output @math{Y} is redundant (being the complex conjugate of the other half), and so the 1d r2c transforms only output elements @math{0}@dots{}@math{n/2} of @math{Y} (@math{n/2+1} complex numbers), where the division by @math{2} is rounded down. Moreover, the Hermitian symmetry implies that @tex $Y_0$ @end tex @ifinfo Y[0] @end ifinfo @html Y0 @end html and, if @math{n} is even, the @tex $Y_{n/2}$ @end tex @ifinfo Y[n/2] @end ifinfo @html Yn/2 @end html element, are purely real. So, for the @code{R2HC} r2r transform, the halfcomplex format does not store the imaginary parts of these elements. @cindex r2r @ctindex R2HC @cindex halfcomplex format The c2r and @code{H2RC} r2r transforms compute the backward DFT of the @emph{complex} array @math{X} with Hermitian symmetry, stored in the r2c/@code{R2HC} output formats, respectively, where the backward transform is defined exactly as for the complex case: @tex $$ Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ . $$ @end tex @ifinfo @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) . @end ifinfo @html
.
@end html The outputs @code{Y} of this transform can easily be seen to be purely real, and are stored as an array of real numbers. @cindex normalization Like FFTW's complex DFT, these transforms are unnormalized. In other words, applying the real-to-complex (forward) and then the complex-to-real (backward) transform will multiply the input by @math{n}. @c =========> @node 1d Real-even DFTs (DCTs), 1d Real-odd DFTs (DSTs), The 1d Real-data DFT, What FFTW Really Computes @subsection 1d Real-even DFTs (DCTs) The Real-even symmetry DFTs in FFTW are exactly equivalent to the unnormalized forward (and backward) DFTs as defined above, where the input array @math{X} of length @math{N} is purely real and is also @dfn{even} symmetry. In this case, the output array is likewise real and even symmetry. @cindex real-even DFT @cindex REDFT @ctindex REDFT00 For the case of @code{REDFT00}, this even symmetry means that @tex $X_j = X_{N-j}$, @end tex @ifinfo X[j] = X[N-j], @end ifinfo @html Xj = XN-j, @end html where we take @math{X} to be periodic so that @tex $X_N = X_0$. @end tex @ifinfo X[N] = X[0]. @end ifinfo @html XN = X0. @end html Because of this redundancy, only the first @math{n} real numbers are actually stored, where @math{N = 2(n-1)}. The proper definition of even symmetry for @code{REDFT10}, @code{REDFT01}, and @code{REDFT11} transforms is somewhat more intricate because of the shifts by @math{1/2} of the input and/or output, although the corresponding boundary conditions are given in @ref{Real even/odd DFTs (cosine/sine transforms)}. Because of the even symmetry, however, the sine terms in the DFT all cancel and the remaining cosine terms are written explicitly below. This formulation often leads people to call such a transform a @dfn{discrete cosine transform} (DCT), although it is really just a special case of the DFT. @cindex discrete cosine transform @cindex DCT In each of the definitions below, we transform a real array @math{X} of length @math{n} to a real array @math{Y} of length @math{n}: @subsubheading REDFT00 (DCT-I) @ctindex REDFT00 An @code{REDFT00} transform (type-I DCT) in FFTW is defined by: @tex $$ Y_k = X_0 + (-1)^k X_{n-1} + 2 \sum_{j=1}^{n-2} X_j \cos [ \pi j k / (n-1)]. $$ @end tex @ifinfo Y[k] = X[0] + (-1)^k X[n-1] + 2 (sum for j = 1 to n-2 of X[j] cos(pi jk /(n-1))). @end ifinfo @html
.
@end html Note that this transform is not defined for @math{n=1}. For @math{n=2}, the summation term above is dropped as you might expect. @subsubheading REDFT10 (DCT-II) @ctindex REDFT10 An @code{REDFT10} transform (type-II DCT, sometimes called ``the'' DCT) in FFTW is defined by: @tex $$ Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) k / n]. $$ @end tex @ifinfo Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) k / n)). @end ifinfo @html
.
@end html @subsubheading REDFT01 (DCT-III) @ctindex REDFT01 An @code{REDFT01} transform (type-III DCT) in FFTW is defined by: @tex $$ Y_k = X_0 + 2 \sum_{j=1}^{n-1} X_j \cos [\pi j (k+1/2) / n]. $$ @end tex @ifinfo Y[k] = X[0] + 2 (sum for j = 1 to n-1 of X[j] cos(pi j (k+1/2) / n)). @end ifinfo @html
.
@end html In the case of @math{n=1}, this reduces to @tex $Y_0 = X_0$. @end tex @ifinfo Y[0] = X[0]. @end ifinfo @html Y0 = X0. @end html Up to a scale factor (see below), this is the inverse of @code{REDFT10} (``the'' DCT), and so the @code{REDFT01} (DCT-III) is sometimes called the ``IDCT''. @cindex IDCT @subsubheading REDFT11 (DCT-IV) @ctindex REDFT11 An @code{REDFT11} transform (type-IV DCT) in FFTW is defined by: @tex $$ Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) (k+1/2) / n]. $$ @end tex @ifinfo Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) (k+1/2) / n)). @end ifinfo @html
.
@end html @subsubheading Inverses and Normalization These definitions correspond directly to the unnormalized DFTs used elsewhere in FFTW (hence the factors of @math{2} in front of the summations). The unnormalized inverse of @code{REDFT00} is @code{REDFT00}, of @code{REDFT10} is @code{REDFT01} and vice versa, and of @code{REDFT11} is @code{REDFT11}. Each unnormalized inverse results in the original array multiplied by @math{N}, where @math{N} is the @emph{logical} DFT size. For @code{REDFT00}, @math{N=2(n-1)} (note that @math{n=1} is not defined); otherwise, @math{N=2n}. @cindex normalization In defining the discrete cosine transform, some authors also include additional factors of @ifinfo sqrt(2) @end ifinfo @html √2 @end html @tex $\sqrt{2}$ @end tex (or its inverse) multiplying selected inputs and/or outputs. This is a mostly cosmetic change that makes the transform orthogonal, but sacrifices the direct equivalence to a symmetric DFT. @c =========> @node 1d Real-odd DFTs (DSTs), 1d Discrete Hartley Transforms (DHTs), 1d Real-even DFTs (DCTs), What FFTW Really Computes @subsection 1d Real-odd DFTs (DSTs) The Real-odd symmetry DFTs in FFTW are exactly equivalent to the unnormalized forward (and backward) DFTs as defined above, where the input array @math{X} of length @math{N} is purely real and is also @dfn{odd} symmetry. In this case, the output is odd symmetry and purely imaginary. @cindex real-odd DFT @cindex RODFT @ctindex RODFT00 For the case of @code{RODFT00}, this odd symmetry means that @tex $X_j = -X_{N-j}$, @end tex @ifinfo X[j] = -X[N-j], @end ifinfo @html Xj = -XN-j, @end html where we take @math{X} to be periodic so that @tex $X_N = X_0$. @end tex @ifinfo X[N] = X[0]. @end ifinfo @html XN = X0. @end html Because of this redundancy, only the first @math{n} real numbers starting at @math{j=1} are actually stored (the @math{j=0} element is zero), where @math{N = 2(n+1)}. The proper definition of odd symmetry for @code{RODFT10}, @code{RODFT01}, and @code{RODFT11} transforms is somewhat more intricate because of the shifts by @math{1/2} of the input and/or output, although the corresponding boundary conditions are given in @ref{Real even/odd DFTs (cosine/sine transforms)}. Because of the odd symmetry, however, the cosine terms in the DFT all cancel and the remaining sine terms are written explicitly below. This formulation often leads people to call such a transform a @dfn{discrete sine transform} (DST), although it is really just a special case of the DFT. @cindex discrete sine transform @cindex DST In each of the definitions below, we transform a real array @math{X} of length @math{n} to a real array @math{Y} of length @math{n}: @subsubheading RODFT00 (DST-I) @ctindex RODFT00 An @code{RODFT00} transform (type-I DST) in FFTW is defined by: @tex $$ Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [ \pi (j+1) (k+1) / (n+1)]. $$ @end tex @ifinfo Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1)(k+1) / (n+1))). @end ifinfo @html
.
@end html @subsubheading RODFT10 (DST-II) @ctindex RODFT10 An @code{RODFT10} transform (type-II DST) in FFTW is defined by: @tex $$ Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1) / n]. $$ @end tex @ifinfo Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1) / n)). @end ifinfo @html
.
@end html @subsubheading RODFT01 (DST-III) @ctindex RODFT01 An @code{RODFT01} transform (type-III DST) in FFTW is defined by: @tex $$ Y_k = (-1)^k X_{n-1} + 2 \sum_{j=0}^{n-2} X_j \sin [\pi (j+1) (k+1/2) / n]. $$ @end tex @ifinfo Y[k] = (-1)^k X[n-1] + 2 (sum for j = 0 to n-2 of X[j] sin(pi (j+1) (k+1/2) / n)). @end ifinfo @html
.
@end html In the case of @math{n=1}, this reduces to @tex $Y_0 = X_0$. @end tex @ifinfo Y[0] = X[0]. @end ifinfo @html Y0 = X0. @end html @subsubheading RODFT11 (DST-IV) @ctindex RODFT11 An @code{RODFT11} transform (type-IV DST) in FFTW is defined by: @tex $$ Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1/2) / n]. $$ @end tex @ifinfo Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1/2) / n)). @end ifinfo @html
.
@end html @subsubheading Inverses and Normalization These definitions correspond directly to the unnormalized DFTs used elsewhere in FFTW (hence the factors of @math{2} in front of the summations). The unnormalized inverse of @code{RODFT00} is @code{RODFT00}, of @code{RODFT10} is @code{RODFT01} and vice versa, and of @code{RODFT11} is @code{RODFT11}. Each unnormalized inverse results in the original array multiplied by @math{N}, where @math{N} is the @emph{logical} DFT size. For @code{RODFT00}, @math{N=2(n+1)}; otherwise, @math{N=2n}. @cindex normalization In defining the discrete sine transform, some authors also include additional factors of @ifinfo sqrt(2) @end ifinfo @html √2 @end html @tex $\sqrt{2}$ @end tex (or its inverse) multiplying selected inputs and/or outputs. This is a mostly cosmetic change that makes the transform orthogonal, but sacrifices the direct equivalence to an antisymmetric DFT. @c =========> @node 1d Discrete Hartley Transforms (DHTs), Multi-dimensional Transforms, 1d Real-odd DFTs (DSTs), What FFTW Really Computes @subsection 1d Discrete Hartley Transforms (DHTs) @cindex discrete Hartley transform @cindex DHT The discrete Hartley transform (DHT) of a 1d real array @math{X} of size @math{n} computes a real array @math{Y} of the same size, where: @tex $$ Y_k = \sum_{j = 0}^{n - 1} X_j [ \cos(2\pi j k / n) + \sin(2\pi j k / n)]. $$ @end tex @ifinfo @center Y[k] = sum for j = 0 to (n - 1) of X[j] * [cos(2 pi j k / n) + sin(2 pi j k / n)]. @end ifinfo @html
.
@end html @cindex normalization FFTW computes an unnormalized transform, in that there is no coefficient in front of the summation in the DHT. In other words, applying the transform twice (the DHT is its own inverse) will multiply the input by @math{n}. @c =========> @node Multi-dimensional Transforms, , 1d Discrete Hartley Transforms (DHTs), What FFTW Really Computes @subsection Multi-dimensional Transforms The multi-dimensional transforms of FFTW, in general, compute simply the separable product of the given 1d transform along each dimension of the array. Since each of these transforms is unnormalized, computing the forward followed by the backward/inverse multi-dimensional transform will result in the original array scaled by the product of the normalization factors for each dimension (e.g. the product of the dimension sizes, for a multi-dimensional DFT). @tex As an explicit example, consider the following exact mathematical definition of our multi-dimensional DFT. Let $X$ be a $d$-dimensional complex array whose elements are $X[j_1, j_2, \ldots, j_d]$, where $0 \leq j_s < n_s$ for all~$s \in \{ 1, 2, \ldots, d \}$. Let also $\omega_s = e^{2\pi \sqrt{-1}/n_s}$, for all ~$s \in \{ 1, 2, \ldots, d \}$. The forward transform computes a complex array~$Y$, whose structure is the same as that of~$X$, defined by $$ Y[k_1, k_2, \ldots, k_d] = \sum_{j_1 = 0}^{n_1 - 1} \sum_{j_2 = 0}^{n_2 - 1} \cdots \sum_{j_d = 0}^{n_d - 1} X[j_1, j_2, \ldots, j_d] \omega_1^{-j_1 k_1} \omega_2^{-j_2 k_2} \cdots \omega_d^{-j_d k_d} \ . $$ The backward transform computes $$ Y[k_1, k_2, \ldots, k_d] = \sum_{j_1 = 0}^{n_1 - 1} \sum_{j_2 = 0}^{n_2 - 1} \cdots \sum_{j_d = 0}^{n_d - 1} X[j_1, j_2, \ldots, j_d] \omega_1^{j_1 k_1} \omega_2^{j_2 k_2} \cdots \omega_d^{j_d k_d} \ . $$ Computing the forward transform followed by the backward transform will multiply the array by $\prod_{s=1}^{d} n_d$. @end tex @cindex r2c The definition of FFTW's multi-dimensional DFT of real data (r2c) deserves special attention. In this case, we logically compute the full multi-dimensional DFT of the input data; since the input data are purely real, the output data have the Hermitian symmetry and therefore only one non-redundant half need be stored. More specifically, for an @ndims multi-dimensional real-input DFT, the full (logical) complex output array @tex $Y[k_0, k_1, \ldots, k_{d-1}]$ @end tex @html Y[k0, k1, ..., kd-1] @end html @ifinfo Y[k[0], k[1], ..., k[d-1]] @end ifinfo has the symmetry: @tex $$ Y[k_0, k_1, \ldots, k_{d-1}] = Y[n_0 - k_0, n_1 - k_1, \ldots, n_{d-1} - k_{d-1}]^* $$ @end tex @html Y[k0, k1, ..., kd-1] = Y[n0 - k0, n1 - k1, ..., nd-1 - kd-1]* @end html @ifinfo Y[k[0], k[1], ..., k[d-1]] = Y[n[0] - k[0], n[1] - k[1], ..., n[d-1] - k[d-1]]* @end ifinfo (where each dimension is periodic). Because of this symmetry, we only store the @tex $k_{d-1} = 0 \cdots n_{d-1}/2$ @end tex @html kd-1 = 0...nd-1/2+1 @end html @ifinfo k[d-1] = 0...n[d-1]/2 @end ifinfo elements of the @emph{last} dimension (division by @math{2} is rounded down). (We could instead have cut any other dimension in half, but the last dimension proved computationally convenient.) This results in the peculiar array format described in more detail by @ref{Real-data DFT Array Format}. The multi-dimensional c2r transform is simply the unnormalized inverse of the r2c transform. i.e. it is the same as FFTW's complex backward multi-dimensional DFT, operating on a Hermitian input array in the peculiar format mentioned above and outputting a real array (since the DFT output is purely real). We should remind the user that the separable product of 1d transforms along each dimension, as computed by FFTW, is not always the same thing as the usual multi-dimensional transform. A multi-dimensional @code{R2HC} (or @code{HC2R}) transform is not identical to the multi-dimensional DFT, requiring some post-processing to combine the requisite real and imaginary parts, as was described in @ref{The Halfcomplex-format DFT}. Likewise, FFTW's multidimensional @code{FFTW_DHT} r2r transform is not the same thing as the logical multi-dimensional discrete Hartley transform defined in the literature, as discussed in @ref{The Discrete Hartley Transform}.