Code Development
A*CRC users who work on code development can access a comprehensive set of software tools available on each of the machines. Here we briefly list the available tools for optimizing and tuning applications, including compilers, editors, debugging and computing tools, as well as MPI, OpenMP and shmem libraries. We also list common tools that are not currently installed on A*CRC systems but could be implemented on request.

File editors
All A*CRC systems are Linux or AIX based, hence all standard Linux/UNIX text editors can be used in code writing and file editing.

The typical text editors available on our systems, or on users’ own desktop systems, are:

Operating System   Text Editor          Description
Linux/UNIX         ed                   standard line editor
                   sed                  a stream editor for batch processing of files
                   vi                   a visual editor; full screen; uses ed/ex line-mode commands for global file editing
                   edit                 a simple line editor
                   Emacs                GNU project Emacs editor
                   XEmacs               a highly customizable text editor (Emacs: the next generation)
                   NEdit                an X-windows text editor
                   xedit                a simple text editor for X-windows
                   kwrite, kedit        editors bundled with the KDE (Linux) desktop environment
                   nano                 new generation file editor
MacOS              TextEdit, TextEdit+  basic and enhanced MacOS text editor versions
                   (vi, ed, sed, …)     MacOS versions of UNIX editors
Windows            Notepad              pure text screen editor, outputs to any extension
                   Wordpad              basic formatting editor with RTF output

A comprehensive list of text editors can be found on Wikipedia’s List of text editors page, and their features are compared on the Comparison of text editors page.


Compilers
GNU, Intel, IBM and PathScale compilers, with associated math libraries, are installed on the A*CRC machines. Commercial C/C++ and Fortran compilers with OpenMP and MPI support, as well as the open-source GNU compilers for C/C++ and Fortran, are installed on all of our systems.

If users are interested in other compilers, such as Absoft or PGI, on Linux, MacOS or Windows operating systems, they can contact A*CRC for possible assistance as well.
The commercial compilers installed are:

Compiler                             Computer System
Intel C/C++ and Fortran              all Intel x86, Itanium and AMD x86 systems
PathScale Compiler Suite             all AMD x86 systems
IBM XL C/C++ and XL Fortran (AIX)    all IBM POWER systems
Fujitsu C/C++ and Fortran            only on the Fujitsu system


Installed OS versions and compilers on each of A*CRC systems are:

Computer   OS             gcc     PathScale   Intel   IBM   Fujitsu
FUJI       RH Linux 5.4   4.1.2   -           11      -     3.3

Compilers of C/C++ and Fortran are invoked on each of A*CRC computers by the following calls:

SYSTEM   Language   Compiler call
FUJI     C/C++      FCC, fcc (Fujitsu); icc, icpc, mpiicc, mpiicpc (Intel); gcc, g++ (GNU)
         Fortran    frt (Fujitsu); ifort, mpiifort (Intel); g77, gfortran (GNU)


Computing tools
At this stage A*CRC has only a rather restricted suite of code development tools installed on the systems. GDB is installed on all systems and IDB on all Intel systems. TotalView is available on some systems as well.

A*CRC is working on implementing a much more comprehensive set of code development tools for profiling, tracing, threading, and memory checks. A*CRC would welcome users’ feedback and suggestions on the most desirable and useful tools to be installed as first preference.

The table below lists representative and commonly used code development tools in use at many HPC centres.


Commonly Used Code Development Tools
Tool Function                 Representative Examples of Tools
Compilers and Preprocessors   compilers currently installed on A*CRC systems
Debugging                     TotalView, gdb, DDT, STAT, IDB, Thread Checker
Memory                        Memcheck, TotalView MemoryScape
Profiling                     gprof, Open|SpeedShop, TAU, mpiP, OPT
Tracing                       Vampir, VampirTrace, Open|SpeedShop, TAU
Correctness                   Marmot, MPIcheck
Code Development              MPIMap, HPCtoolkit, PIN, PAPI, FASThread
Tool Infrastructures          DPCL, MRNet, Dyninst, MPE Jump Shot, OTF, Launchmon
Performance Analysis          TAU, Open|SpeedShop


MPI
The Message Passing Interface (MPI) Application Programming Interface (API) is the dominant programming approach on modern clusters and is supported on all our systems. More on MPI, including the most up-to-date MPI-2 specification, is available at the MPI Forum. For examples of commands invoking MPI, see the Parallel Compilation call examples section below.

OpenMP
OpenMP is the incumbent standard for shared-memory parallel programming on SMP and NUMA multiprocessor systems; on distributed-memory clusters it is used mostly for jobs running within a single node. More information on OpenMP can be found at the OpenMP website and at Compunity. For examples of commands invoking OpenMP, see the Parallel Compilation call examples section below.

Shmem
SHMEM is the ultrafast native ‘shared virtual memory’ communication library on selected multiprocessor and cluster systems, with substantial performance gains over MPI. A*CRC has several systems, both SMPs and Quadrics-interconnected clusters, that support accelerated shmem in hardware. For examples of commands invoking shmem, see the Parallel Compilation call examples section below.

Parallel Compilation call examples

MPI 
Message-passing programming coordinates multiple computing elements (processes) through primitives such as sending a message to one or more other computing elements, receiving a message from a computing element, and synchronizing with other computing elements, so that a process can exchange information (such as an array) with a second process. The synchronizing primitives allow two or more processes to ensure that each is "ready" for the next step in a parallel algorithm. Besides the target system, the actual invoking call also depends on the scheduler and resource manager used.

OpenMP
OpenMP has emerged as an important model and language extension for shared-memory parallel programming. OpenMP is a collection of compiler directives and library routines used to write portable parallel programs for shared-memory architectures. Writing efficient parallel programs for NUMA architectures, which have characteristics of both shared-memory and distributed-memory architectures, requires that the programmer control the placement of data in memory and the placement of the computations that operate on that data. Optimal performance is obtained when computations occur on processors that have fast access to the data those computations need.

EXAMPLE – valid on all OpenMP systems
export OMP_NUM_THREADS=n   {where n is the number of threads you wish to spawn}
./[your_executable_file-name].out

Shmem 
SHMEM is the ultrafast native ‘shared virtual memory’ communication library on several Cray and SGI multiprocessor machines, as well as on Quadrics and Dolphin cluster interconnects, with substantial performance gains over MPI. A shmem_put() call allows a node to write data directly into user space on another node, and a shmem_get() call allows it to read data from another node; both occur without the cooperation of the second node. A*STAR has several systems, both SMPs and Quadrics-interconnected clusters, that support accelerated shmem in hardware. A shmem programming manual is available here.

EXAMPLE
Here is the command line interface for the commonly used remote read shmem program, sping.

sping -n number[k|K|m|M] -eh nwords [maxWords [incWords]]

The options for the program are:
-n number[k|K|m|M]

Specify the number of times to ping. The number may have a k or
an m appended to it (or their upper case equivalents) to denote
multiples of 1024 and 1,048,576 respectively. By default, the
program pings 10,000 times.
-e Instructs every process to print its timing statistics.
-h Displays the list of options.
nwords [maxWords [incWords]]

nwords specifies to sping how many words there are in each
packet. If maxWords is given, it specifies a maximum number of
words to send in each packet and invokes the following behavior:
after each set of repetitions (as specified with the -n option),
the packet size is increased by incWords (the default is a
doubling in size) and another set of repetitions is performed,
until the packet size exceeds maxWords.