
CS213 Homework 2

SPLASH Performance Profile on SGI Machine

Danhua Guo

dguo@cs.ucr.edu

1 Environment

SGI server: 64 x Intel Itanium 2 processors with 64 GB RAM

2 Benchmark

SPLASH-2 is used as the benchmark for this experiment. More specifically, I chose five applications from the suite. The functionality of each application is described as follows:

The contiguous partition allocation program is one of the two implementations provided with the OCEAN package. This implementation (contained in the contiguous_partitions subdirectory) represents the grids to be operated on as 3-dimensional arrays. The first dimension specifies the processor that owns the partition, and the second and third dimensions specify the x and y offsets within a partition. This data structure allows partitions to be allocated contiguously and entirely in the local memory of the processors that "own" them, thus enhancing data locality.
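As a rough illustration of this layout (a hypothetical sketch, not the actual OCEAN source; the function name and partition sizes below are invented), each processor's partition can be allocated as one contiguous block and indexed as grid[proc][x][y]:

    #include <stdlib.h>

    /* Hypothetical sketch of the contiguous-partition layout:
     * grid[p][x][y] -- the first index selects the owning processor,
     * the second and third are offsets within that processor's partition.
     * Each partition is a single contiguous allocation, so it can reside
     * entirely in the owning processor's local memory. */
    double ***alloc_grid(int nprocs, int part_x, int part_y)
    {
        double ***grid = malloc(nprocs * sizeof(double **));
        for (int p = 0; p < nprocs; p++) {
            double *block = malloc(part_x * part_y * sizeof(double)); /* contiguous */
            grid[p] = malloc(part_x * sizeof(double *));
            for (int x = 0; x < part_x; x++)
                grid[p][x] = block + x * part_y;   /* row x of partition p */
        }
        return grid;
    }

Keeping each partition in one block is what gives the locality benefit: a processor sweeping over its own partition touches a single contiguous region of memory.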

The BARNES application implements the Barnes-Hut method to simulate the interaction of a system of bodies (N-body problem). The SPLASH-2 implementation allows for multiple particles to be stored in each leaf cell of the space partition.

The FMM application implements a parallel adaptive Fast Multipole Method to simulate the interaction of a system of bodies (N-body problem).

WATER-NSQUARED is an improvement over the original Water code in SPLASH, but is mostly the same. The best source of descriptive information therefore is the original SPLASH report. The main change is that the locking strategy around the updates to the water accelerations (in interf.C) is improved: a process updates a local copy of the relevant particle accelerations, and then accumulates into the shared copy once at the end.
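The accumulation pattern described here can be sketched as follows (a pthread-based illustration only; the real SPLASH-2 code uses its own synchronization macros in interf.C, and the names and sizes below are hypothetical):

    #include <pthread.h>

    #define NMOL 512                      /* hypothetical molecule count */

    extern double shared_acc[NMOL];       /* accelerations shared by all processes */
    extern pthread_mutex_t acc_lock;      /* protects shared_acc */

    /* Each process first accumulates its contributions into a private
     * buffer, then takes the lock once to fold them into the shared copy,
     * instead of locking on every individual update. */
    void accumulate(const double local_acc[NMOL])
    {
        pthread_mutex_lock(&acc_lock);
        for (int i = 0; i < NMOL; i++)
            shared_acc[i] += local_acc[i];
        pthread_mutex_unlock(&acc_lock);
    }

The point of the change is the reduced lock traffic: one lock acquisition per process per step rather than one per updated particle.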

WATER-SPATIAL solves the same molecular dynamics N-body problem as the original Water code in SPLASH (which is called WATER-NSQUARED in SPLASH-2), but uses a different algorithm. In particular, it imposes a 3-d spatial data structure on the cubical domain, resulting in a 3-d grid of boxes. Every box contains a linked list of the molecules currently in that box (in the current time-step). The advantage of the spatial grid is that a process that owns a box in the grid has to look at only its neighboring boxes for molecules that might be within the cutoff radius from a molecule in the box it owns. This makes the algorithm O(n) instead of O(n^2). For small problems (up to several hundred to a couple of thousand molecules) the overhead of the spatial data structure is not justified and WATER-NSQUARED might solve the problem faster, but for large systems this program is much better. That is why SPLASH-2 provides both, since both small and large systems are interesting.

All access to molecules is through the boxes in the spatial grid, and these boxes are the units of partitioning (unlike WATER-NSQUARED, in which molecules are the units of partitioning).
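To make the cutoff search concrete, here is a minimal, hypothetical sketch (not the SPLASH-2 source; the types, names, and grid size are invented) of why a process owning a box only needs to scan the 3x3x3 neighborhood of that box rather than all molecules:

    #define NB 8                          /* hypothetical boxes per dimension */

    typedef struct molecule {
        double pos[3];
        struct molecule *next;            /* next molecule in the same box */
    } molecule;

    typedef struct {
        molecule *head;                   /* linked list of molecules in this box */
    } box;

    /* Visit every molecule in the 27 boxes surrounding (bx, by, bz); only these
     * can lie within the cutoff radius of a molecule in that box, which is what
     * reduces the work from the O(n^2) all-pairs scan to O(n). */
    void scan_neighbors(box grid[NB][NB][NB], int bx, int by, int bz,
                        molecule *m, void (*interact)(molecule *, molecule *))
    {
        for (int dx = -1; dx <= 1; dx++)
            for (int dy = -1; dy <= 1; dy++)
                for (int dz = -1; dz <= 1; dz++) {
                    int x = bx + dx, y = by + dy, z = bz + dz;
                    if (x < 0 || y < 0 || z < 0 || x >= NB || y >= NB || z >= NB)
                        continue;         /* neighbor box outside the domain */
                    for (molecule *n = grid[x][y][z].head; n != NULL; n = n->next)
                        if (n != m)
                            interact(m, n);
                }
    }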

3 Performance Metrics

In this paper, performance is measured with execution time and instructions per cycle (IPC). For OCEAN, cache miss information is also provided. In order to compare performance under different setups, the number of processors is varied by passing different parameters to the benchmark.

4 Profiling Tool

The Performance API (PAPI) specifies a standard application programming interface (API) for accessing hardware performance counters available on most modern microprocessors. These counters exist as a small set of registers that count Events, occurrences of specific signals related to the processor's function. Monitoring these events facilitates correlation between the structure of source/object code and the efficiency of the mapping of that code to the underlying architecture. This correlation has a variety of uses in performance analysis including hand tuning, compiler optimization, debugging, benchmarking, monitoring, and performance modeling. In addition, it is hoped that this information will prove useful in the development of new compilation technology as well as in steering architectural development towards alleviating commonly occurring bottlenecks in high performance computing.
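A minimal sketch of how a region of code can be instrumented with PAPI's low-level API (the calls below are standard PAPI routines, but the instrumented region and the choice of events are placeholders, not the exact setup used for this report):

    #include <stdio.h>
    #include <papi.h>

    int main(void)
    {
        int eventset = PAPI_NULL;
        long long counts[2];

        /* Initialize the library and build an event set with two preset events. */
        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
            return 1;
        PAPI_create_eventset(&eventset);
        PAPI_add_event(eventset, PAPI_TOT_INS);   /* instructions retired */
        PAPI_add_event(eventset, PAPI_TOT_CYC);   /* total cycles */

        PAPI_start(eventset);
        /* ... region of interest, e.g. the parallel phase of an application ... */
        PAPI_stop(eventset, counts);

        /* IPC is simply retired instructions divided by elapsed cycles. */
        printf("instructions = %lld, cycles = %lld, IPC = %.2f\n",
               counts[0], counts[1], (double)counts[0] / (double)counts[1]);
        return 0;
    }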

5 Results

The performance results of all the applications are presented with different metrics, including program execution time, IPC, and cache miss rates. Of these, the execution time is reported by the benchmark itself, while the rest are generated with PAPI. With some additional PAPI code and its libraries, it is not very difficult to obtain further profiling information by studying the EventSet facility in PAPI.
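For example, the cache behavior discussed below can be gathered by placing the corresponding PAPI preset events into one EventSet. The event names below are real PAPI presets, but whether all four can be counted in a single run depends on the machine's counter hardware, so this is only an illustrative sketch:

    #include <papi.h>

    /* Preset events matching the miss rates discussed below:
     * L1/L2 caches, data and instruction misses. */
    static const int cache_events[] = {
        PAPI_L1_DCM,   /* L1 data cache misses */
        PAPI_L1_ICM,   /* L1 instruction cache misses */
        PAPI_L2_DCM,   /* L2 data cache misses */
        PAPI_L2_ICM    /* L2 instruction cache misses */
    };

    /* Count the four cache-miss events around one profiled region. */
    int profile_cache(void (*region)(void), long long counts[4])
    {
        int es = PAPI_NULL;

        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
            return -1;
        if (PAPI_create_eventset(&es) != PAPI_OK)
            return -1;
        for (int i = 0; i < 4; i++)
            if (PAPI_add_event(es, cache_events[i]) != PAPI_OK)
                return -1;               /* event unavailable or conflicting */

        PAPI_start(es);
        region();                        /* code section being profiled */
        PAPI_stop(es, counts);
        return 0;
    }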

Figure 2 gives an overview of the cache behavior. As shown in the diagram, cache misses happen most often when N=16; it is understandable that with a relatively smaller number of processors working at the same time, the chance of overall cache misses is also reduced. If we consider the data cache independently of the instruction cache, we find that the L1 data cache miss rate grows as N increases, whereas the L1 instruction cache miss rate stays at a very low level. As for the L2 caches, the instruction cache follows the general pattern of the overall cache misses, while the L2 data cache remains at a stable miss rate.

5.2 BARNES

[Figure 3: Time and IPC for BARNES as the number of processors varies over 1, 2, 4, 8, 16, and 32]

Figure 3 shows that the performance of BARNES follows that of OCEAN, namely a decreasing execution time together with an increasing IPC as the number of processors grows, for the same reasons discussed previously.

5.3 FMM

[Figure 4: Time and IPC for FMM as the number of processors varies over 1, 2, 4, 8, 16, and 32]

Figure 4 shows that FMM is the only application in this experiment that does not follow the general pattern. While the execution time drops as the number of processors increases up to 8, the program takes longer to finish when the count increases further; when N=32, its execution time is even worse than that of a single processor. This exception may be explained by the application's context-switching overhead, which becomes relatively large when more threads are involved.

5.4 WATER-NSQUARED and WATER-SPATIAL

[Figure 5: Time and IPC for WATER-NSQUARED as the number of processors varies over 1, 2, 4, 8, 16, and 32]

[Figure 6: Time and IPC for WATER-SPATIAL as the number of processors varies over 1, 2, 4, 8, 16, and 32]

Figures 5 and 6 show the performance of WATER-NSQUARED and WATER-SPATIAL, respectively. Although IPC follows the general rule in relation to the increasing number