Environment Modules and Compiling Programs
This tutorial covers the basics of compiling programs for the HPC environment on the Arya, Hodor, and Talon clusters. It is meant for user-developed programs written in C, C++, or FORTRAN.
If you wish to install third-party programs, libraries, or programming environments, it is suggested that you contact the CRC to discuss your software requirements.
Serial vs Parallel Computing
Traditional computer programs are written as serial programs. High Performance Computing cluster systems increase performance through parallel computing.
Serial computation means that a program executes one operation at a time, in sequential order, whereas parallel computation executes two or more operations simultaneously.
An in-depth discussion of serial vs parallel computing is beyond the scope of this tutorial. For a more comprehensive explanation, see the Wikipedia article on the subject: http://en.wikipedia.org/wiki/Parallel_computing
Basic Serial Compiling
The following GNU compilers ( http://gcc.gnu.org ) are automatically available to users:
For the C language: gcc
For the C++ language: g++
For the FORTRAN language: gfortran
The default RHEL (Red Hat Enterprise Linux) compilers are also available:
For the C language: cc
For the C++ language: c++
Any of the following commands will create a helloworld.x executable. Source files can be found in the /share/apps/examples folder. The format for compiling is:
<compiler (gcc, cc, g++, etc.)> <output flag (-o)> <name of executable (something.x)> <source file>
Compiler Commands - C Language (source usually has a .c file extension):
gcc -o helloworld.x helloworld.c
cc -o helloworld.x helloworld.c
Compiler Commands - C++ Language (source usually has a .cxx or .cpp file extension):
g++ -o helloworld.x helloworld.cxx
c++ -o helloworld.x helloworld.cxx
Compiler Commands - FORTRAN Language (source usually has a .f, .f90, or similar file extension):
gfortran -o helloworld.x helloworld.f
If you are new to programming/compiling, we suggest that you copy one or more of the source files to your Hodor home directory, and then run one of the compiler commands above. For instance, if using the C Language:
cd ~/
cp /share/apps/examples/helloworld.c helloworld.c
gcc -o helloworld.x helloworld.c
To run the program you just compiled, you type the following at the command prompt and then press return:
./helloworld.x
Doing so should cause your program to print the following to your screen:
Hello World!
The above information only scratches the surface when it comes to compiling programs. Further information about compiling serial programs is beyond the scope of this tutorial. It is expected that the user will seek out more information elsewhere through books or online tutorials should it be required.
Note:
If you want to run a program on the cluster, it must be compiled on the cluster.
Contact the CRC should any further assistance in this area be required.
Environment Variables and Modulefiles
To compile and run parallel programs, you generally need specific libraries that enable this. Because these can be difficult to get working properly, we have implemented environment modules to make the process easier.
- We have a separate tutorial page that goes into more detail about environment modules.
Example: In addition to the default GNU and RHEL compilers, we have also installed the Intel Studio 2013 compiler suite on Arya, Hodor, and Talon. Unlike the default compilers, however, the Intel compilers are only accessible after users update their environment variables.
The easiest way to do this is to load the “intel64-compiler-2013” modulefile, as shown below:
module load intel64-compiler-2013
Loading this modulefile updates a number of your account’s environment (env) variables, including adding the Intel MPI libraries and Intel MPI compilers to your PATH and LD_LIBRARY_PATH env variables, and a number of other variables not discussed here. Most importantly, it adds the Intel C, C++, and FORTRAN compilers to your PATH variable.
Once this modulefile is loaded, the Intel compilers are available on the command line just as the default compilers are:
Intel Compiler Commands - C Language (source usually has a .c file extension):
icc -o helloworld.x helloworld.c
Intel Compiler Commands - C++ Language (source usually has a .cxx or .cpp file extension):
icpc -o helloworld.x helloworld.cxx
Intel Compiler Commands - FORTRAN Language (source usually has a .f, .f90, or similar file extension):
ifort -o helloworld.x helloworld.f
As before, we suggest that you try compiling one or more of the example source files using the Intel compiler commands above. For instance, if using the C Language:
cd ~/
cp /share/apps/examples/helloworld.c helloworld.c
icc -o helloworld.x helloworld.c
To run the program you just compiled, you type the following at the command prompt and then press return:
./helloworld.x
Doing so should cause your program to print the following to your screen:
Hello World!
When you are done using the Intel compilers, you can revert to your pre-Intel environment variable settings by unloading the Intel modulefile. Type the following on the command line and press return/enter:
module unload intel64-compiler-2013
Compiling For Parallel Programs
There are a variety of technologies available that can be used to write a parallel program: GPU, GPU Direct, UPC, Intel MIC, MVAPICH MPI, Intel MPI, OpenMPI, or some combination of two or more of these technologies. While the clusters are capable of running programs developed using any of these technologies, this tutorial will focus on the three versions of MPI (Intel MPI, MVAPICH, and OpenMPI) available on this system.
MPI (Message Passing Interface)
MPI is the technology you should use when you wish to run your program in parallel on multiple cluster compute nodes simultaneously.
Arya, Hodor, and Talon each have four different versions of MPI installed: MVAPICH2-X, OpenMPI, Intel MPI, and Intel MIC MPI. Use of Intel MIC MPI will be covered in a future advanced parallel tutorial.
Compiling an MPI program is relatively easy. Writing an MPI-based parallel program, however, takes more work.
Before you compile for MPI, you must first load the appropriate module.
The MPI modules that are currently available are as follows:
- intel/mpi/32/16.0.4/2016.4.258
- intel/mpi/64/5.1.3/2016.4.258
- intel/mpi/mic/5.1.3/2016.4.258
- mpich/ge/gcc/64/3.2rc2
- mpich/ge/open64/64/3.2rc2
- mvapich2/gcc/64/2.2rc1
- mvapich2/open64/64/2.2rc1
- openmpi/gcc/64/1.10.1
- openmpi/open64/64/1.10.1
Once the appropriate module is loaded, compiling for C, C++, and FORTRAN is the same for each module:
Simple Compiling MPI for the C Language:
mpicc -o someprog.x someprog.c
Simple Compiling MPI for the C++ Language:
mpicxx -o someprog.x someprog.cxx
Simple Compiling MPI for the FORTRAN Language:
mpif77 -o someprog.x someprog.f (if Fortran 77, all three MPIs)
mpif90 -o someprog.x someprog.f90 (if Fortran 90, all three MPIs)
mpiifort -o someprog.x someprog.f (general FORTRAN, Intel only)
So you’re probably wondering, “Which version of MPI should I use?”
OpenMPI - we generally recommend that you do NOT use OpenMPI if you can avoid it. OpenMPI is not as fast as MVAPICH and Intel MPI on our InfiniBand network.
Why do we have OpenMPI, then, if it is not as fast? We maintain it mostly because some alternative programming environments, such as the R language and MATLAB, require OpenMPI if users wish to use those environments with MPI.
MVAPICH2-X – use this version of MPI if:
- You prefer to use the GNU Compilers
- You intend to investigate the use of GPUs or GPU Direct, but not Intel MIC technology.
- You prefer to use a math library other than MKL.
Intel MPI – use this version of MPI if:
- You prefer to use Intel compilers
- You intend to investigate the use of Intel MIC technology.
- You wish to incorporate the Intel Math Kernel Library (MKL) in your code.
- You wish to use a FORTRAN compiler other than the GNU gfortran compiler.
A tutorial on how to write an MPI program is outside the scope of this tutorial. Please refer to the following link for general MPI programming information (ignore compiler info): https://computing.llnl.gov/tutorials/mpi
We offer one piece of advice, however: use the C version of MPI in both your C and C++ code.
Please see the CRC tutorial Job Scheduling and Slurm Scripts on how to run MPI jobs on the Arya/Hodor/Talon clusters.