This is the README file for the GPU version of GMP-ECM.
The GPU code only works with NVIDIA GPUs of compute capability 3.5 or greater.

Table of contents of this file

  1. How to enable GPU code in GMP-ECM
  2. Basic Usage
  3. Advanced Usage
  4. Known issues

##############################################################################

1. How to enable GPU code in GMP-ECM

By default the GPU code is not enabled; to enable it, follow the
instructions of INSTALL-ecm until the 'configure' step, then add
"--enable-gpu" and "--with-cgbn-include=...".

CGBN headers are required for GPU builds after f81ddf3b.
"--with-cgbn-include" should point to the CGBN include directory, generally
"../CGBN/include/cgbn". CGBN is a "CUDA Accelerated Multiple Precision
Arithmetic (Big Num)" library available at https://github.com/NVlabs/CGBN

$ ./configure --enable-gpu --with-cgbn-include=/PATH/DIR/CGBN/include/cgbn

This will configure the code for NVIDIA GPUs for all compute capabilities
between 3.5 and 9.0 known to the nvcc compiler. To enable only a single
compute capability you can set '--enable-gpu=XX':

$ ./configure --enable-gpu=61 [other options]

By default, GMP-ECM will look for cuda.h in the default header directories,
but you can specify another directory, such as /opt/cuda, with:

$ ./configure --enable-gpu --with-cuda=/opt/cuda

By default, GMP-ECM will look for the nvcc compiler in $PATH, but you can
specify another directory:

$ ./configure --enable-gpu --with-cuda-bin=/PATH/DIR

For finer control you can specify the location of cuda.h as follows:

$ ./configure --enable-gpu --with-cuda-include=/PATH/DIR

By default, GMP-ECM will look for the CUDA libraries in the default library
directories, but you can specify another directory:

$ ./configure --enable-gpu --with-cuda-lib=/PATH/DIR

Some versions of CUDA are not compatible with recent versions of gcc.
To specify which C compiler is called by the CUDA compiler nvcc, type:

$ ./configure --enable-gpu --with-cuda-compiler=/PATH/DIR

The value of this parameter is passed directly to nvcc via the option
"--compiler-bindir". By default, GMP-ECM lets nvcc choose which C compiler
it uses.

If you get errors such as "cuda.h: present but cannot be compiled", try
setting CC to a known good gcc; you may also need to use
--with-cuda-compiler:

$ ./configure --enable-gpu CC=gcc-8

Then, to compile the code, type:

$ make

And to check that the program works correctly, type:

$ make check

Additional randomized checks can be run with:

$ sage check_gpuecm.sage ./ecm

For failing kernels, some additional information may be available from
cuda-memcheck:

$ echo "(2^997-1)" | cuda-memcheck ./ecm -gpu -gpucurves 4096 -v 16000 0

##############################################################################

2. Basic Usage

To use your GPU for step 1, just add the -gpu option:

$ echo "(2^835+1)/33" | ./ecm -gpu 1e4

It will compute step 1 on the GPU, and then perform step 2 on the CPU (not
in parallel).

The only parametrization compatible with the GPU code is "-param 3".

You can save the end of step 1 with "-save" and then load the file to
execute step 2, but you cannot resume to continue step 1 with a bigger B1
(see the example at the end of this section).

The options "-mpzmod", "-modmuln", "-redc", "-nobase2" and "-base2" have no
effect on step 1 if the "-gpu" option is activated, but will apply to step 2.
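As a minimal sketch of this workflow (the file name "stage1_gpu.save" is
just an illustrative example), you could save the GPU step 1 residues and
later run step 2 on the CPU with the standard "-resume" option:

$ echo "(2^835+1)/33" | ./ecm -gpu -save stage1_gpu.save 1e4
$ ./ecm -resume stage1_gpu.save 1e4

Using the same B1 when resuming means step 1 is not redone and only step 2
is performed.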
##############################################################################

3. Advanced Usage

The option "-gpudevice n" forces the GPU code to be executed on device n.
The NVIDIA tool "nvidia-smi" can be used to find out which number is
associated with which GPU. Moreover, you can use the GMP-ECM option "-v"
(verbose) to see the properties of the GPU on which the code is run.

The option "-gpucurves n" forces GMP-ECM to compute n curves in parallel on
the GPU. By default, the number of curves is chosen to completely fill the
GPU. The number of curves must be a multiple of the number of curves per
multiprocessor (which depends on the GPU), or else it is rounded up to the
next multiple.

Throughput for determining the CGBN kernel size and "-gpucurves" can be
tested using the provided "gpu_throughput_test.sh" script. It takes an
optional ecm command and number of curves:

$ ./gpu_throughput_test.sh [ECM_CMD] [GPUCURVES]

The CGBN-based GPU code can easily be changed to support inputs from 256 to
32768 bits. Several different-sized CUDA kernels are defined in
cgbn_stage1.cu. These kernels are fixed at compile time. A log message is
printed if recompiling with a different-sized kernel would likely speed up
execution. See the comment "Compiling custom kernel" in cgbn_stage1.cu near
line 680. Each additional kernel increases compile time and binary size, so
only two are included in development mode.

##############################################################################

4. Known issues

If you get "Error msg: forward compatibility was attempted on non supported
HW" or "error: 'cuda.h' and 'cudart' library have different versions", then
you can look at
https://stackoverflow.com/questions/43022843/nvidia-nvml-driver-library-version-mismatch/45319156#45319156.
In general the best solution is to restart the machine.

##############################################################################

Please report any problems, bugs, or observations to sethtroisi (by email or
on mersenneforum).