Parallel Computing Lecture 7
Topics
- Optimization toolbox
- Software pipelining
- Project 1 assignment
- Lecture 7 slides (PDF)
Project 1
The files for Project 1 are on the RZ cluster in the ~cc046be/teaching/para/pr1-2013 directory, and also posted here (right-click to save to disk):
Note that mien.fine and the mrng... files are not needed for the projection task, but only for plotting of the results.
Endianness Conversion
The files are again big-endian. For most Fortran compilers, one can use a command-line option as already explained in Lecture 3.
When using a compiler without such an option, one can swap bytes after reading and before writing the big-endian files with the following C function, as implemented in ~cc046be/teaching/para/pr1-2013/swapbytes.c:
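#include <string.h>
#include <stdlib.h>

/* Callable from Fortran as "call swapbytes(array, nelem, elsize)":
   the trailing underscore matches the Fortran name mangling, and all
   arguments are pointers because Fortran passes them by reference. */
void swapbytes_(char *array, int *nelem, int *elsize)
{
    register int sizet, sizem, i, j;
    char *bytea, *byteb;

    sizet = *elsize; sizem = sizet - 1;
    bytea = malloc(sizet); byteb = malloc(sizet);
    for (i = 0; i < *nelem; i++) {
        /* copy one entry out, reverse its bytes, and copy it back */
        memcpy((void *)bytea, (void *)(array+i*sizet), sizet);
        for (j = 0; j < sizet; j++) byteb[j] = bytea[sizem - j];
        memcpy((void *)(array+i*sizet), (void *)byteb, sizet);
    }
    free(bytea); free(byteb);
    return;
}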
One can either compile this oneself, or use the precompiled object file in ~cc046be/teaching/para/pr1-2013/swapbytes.o, e.g., like this:
ifort -o project project.f90 ~cc046be/teaching/para/pr1-2013/swapbytes.o
where project.f90 is the projection code.
The program should call the byte-swapping function just after reading an array, and just before writing the result array, like this:
open(unit=1,file="mxyz.coarse",recl=nsd*nn*8,access="direct")
read(unit=1,rec=1) ((x(isd,in), isd=1,nsd), in=1,nn)
close(unit=1)
call swapbytes(x, nsd*nn, 8)
The first argument is the array to be byte-swapped, the second is the number of entries (double precision or integer), and the third is the entry size in bytes (8 for double precision and 4 for integer). Don't include I/O or byte-swapping in your timings.
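To illustrate the argument convention described above (array, number of entries, entry size in bytes), here is a minimal C sketch of the same byte reversal done in place, together with a check of the host byte order. It is not the course's swapbytes.c; the names host_is_little_endian and swap_in_place are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 on a
   big-endian host, by inspecting the first byte of the value 1. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical in-place variant of the byte swap: reverses the bytes
   of each of the nelem entries, every entry being elsize bytes long
   (8 for double precision, 4 for integer). */
void swap_in_place(void *array, size_t nelem, size_t elsize)
{
    unsigned char *p = array;

    if (elsize < 2)
        return;                 /* nothing to reverse */
    for (size_t i = 0; i < nelem; i++, p += elsize) {
        for (size_t lo = 0, hi = elsize - 1; lo < hi; lo++, hi--) {
            unsigned char tmp = p[lo];
            p[lo] = p[hi];
            p[hi] = tmp;
        }
    }
}
```

On a little-endian host, an array of nsd*nn double precision values read from a big-endian file would be converted with swap_in_place(x, nsd*nn, 8). Applying the swap twice restores the original data.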
Record Length in Fortran
The Fortran code examples for reading and writing binary data in the lecture slides assume that the record length (recl=...) is given in bytes. Intel Composer XE 2013, as installed on the RZ cluster, interprets record lengths in units of 4 bytes instead. To get behavior consistent with the notes, use the -assume byterecl compiler option.
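The difference between the two conventions is a factor of four. The following C sketch works through the arithmetic for a record of nsd*nn double precision values, assuming 8 bytes per value as in the read example above. The function names are hypothetical.

```c
#include <stddef.h>

/* Record length in bytes for nsd*nn double precision values,
   assuming 8 bytes per value. */
size_t record_bytes(size_t nsd, size_t nn)
{
    return nsd * nn * 8;
}

/* The same record length expressed in units of 4 bytes. */
size_t record_4byte_units(size_t nsd, size_t nn)
{
    return record_bytes(nsd, nn) / 4;
}
```

For example, with nsd = 2 and nn = 1000 (illustrative values only), the record is 16000 bytes long, which corresponds to 4000 units of 4 bytes.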
Editing
At this stage, you may for the first time need to edit program files that reside remotely on the computing cluster. Old-timers use arcane text-only editors such as nano, vi or emacs, which are present on most Linux/Unix systems and don't require a graphical interface. If you use an X Windows client (built-in on Linux and Mac OS X; free implementations available on Windows) to connect to the cluster, a number of graphical editors are also available, such as kate or nedit. You can also use the Remote Desktop Connection client to connect to the RZ Windows HPC server and get visual access to your RZ files.
Visualization
The current version of Pager is available for download and on the RZ cluster; more information is here.




