PC Clusters for Computational Science - Theory and Practice
This session was held at PC2000, a meeting organized by the Division
of Computational Physics of the American Physical Society. The meeting
was held in conjunction with the March 2000 meeting of the APS in Minneapolis.
Networking Options for Beowulf Clusters (PowerPoint)
- Thomas L. Sterling, Caltech/JPL
- Beowulf clusters are parallel supercomputers built from commodity microprocessors.
Their performance depends on good network connectivity. Currently,
Fast Ethernet, Gigabit Ethernet and Myrinet are the main options for network
hardware. The bandwidth and latency achieved with these hardware options and
their associated software will be reviewed. Near-term options, such
as SIO, will also be discussed.
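As a rough illustration of why both bandwidth and latency matter, the sketch below uses the common first-order model in which sending a message costs a fixed latency plus the message size divided by the bandwidth. The bandwidths are the nominal link rates of Fast Ethernet and Gigabit Ethernet. The latency values are placeholders assumed for illustration, not measurements from the talk, and the function names are hypothetical.

```python
# First-order message cost model: time = latency + size / bandwidth.
# The latencies below are ASSUMED placeholders, not measured values.


def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Seconds to deliver one message under the latency + size/bandwidth model."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def effective_bandwidth(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Bytes per second actually achieved for a message of the given size."""
    return message_bytes / transfer_time(
        message_bytes, latency_s, bandwidth_bytes_per_s
    )


def half_performance_size(latency_s, bandwidth_bytes_per_s):
    """Message size at which effective bandwidth is half the link bandwidth."""
    return latency_s * bandwidth_bytes_per_s


# Nominal link rates in bytes per second; latencies are assumptions.
NETWORKS = {
    "Fast Ethernet": {"latency_s": 100e-6, "bandwidth": 100e6 / 8},
    "Gigabit Ethernet": {"latency_s": 100e-6, "bandwidth": 1000e6 / 8},
}

if __name__ == "__main__":
    for name, net in NETWORKS.items():
        for size in (64, 1024, 65536, 1048576):
            rate = effective_bandwidth(size, net["latency_s"], net["bandwidth"])
            print(f"{name:17s} {size:8d} bytes -> {rate / 1e6:8.2f} MB/s")
```

Under this model a faster link mainly helps large messages, while small messages remain dominated by latency regardless of the link rate.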
What's the Best Node for Your Cluster? (Available in various formats)
- Rick Stevens, Argonne National Laboratory
- A well-designed cluster requires a node containing the most appropriate
balance of resources for the problems it will be solving. For many scientific
problems, memory bandwidth or peripheral bandwidth can be a severe bottleneck,
and spending extra money on a faster processor will not increase performance
significantly. This talk will cover the options available for cluster nodes
including processors, memory speeds and standards, and peripheral buses.
There will also be a discussion of when SMP nodes should be used, and how
many processors can be accommodated per node.
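The point about memory bandwidth can be sketched with a simple bandwidth-bound estimate (often called a roofline model). This is an illustration added here, not a method taken from the talk. The node figures are hypothetical, and the kernel is a double-precision y = a*x + y update chosen only because its memory traffic is easy to count.

```python
# Bandwidth-bound estimate: a kernel cannot run faster than the processor's
# peak rate, nor faster than memory can feed it.


def attainable_flops(peak_flops, memory_bandwidth_bytes_per_s, flops_per_byte):
    """Upper bound on sustained flop/s for a kernel with the given intensity."""
    return min(peak_flops, memory_bandwidth_bytes_per_s * flops_per_byte)


# y[i] = a * x[i] + y[i] in double precision:
# 2 flops per element, 24 bytes moved (load x, load y, store y).
AXPY_FLOPS_PER_BYTE = 2.0 / 24.0

# ASSUMED node parameters, for illustration only.
MEMORY_BANDWIDTH = 800e6  # bytes per second
SLOW_CPU_PEAK = 500e6     # flop/s
FAST_CPU_PEAK = 1000e6    # flop/s

slow_cpu = attainable_flops(SLOW_CPU_PEAK, MEMORY_BANDWIDTH, AXPY_FLOPS_PER_BYTE)
fast_cpu = attainable_flops(FAST_CPU_PEAK, MEMORY_BANDWIDTH, AXPY_FLOPS_PER_BYTE)

if __name__ == "__main__":
    print(f"slower processor: {slow_cpu / 1e6:.1f} Mflop/s")
    print(f"faster processor: {fast_cpu / 1e6:.1f} Mflop/s")
```

With these assumed numbers both processors are limited by memory, so doubling the peak rate leaves the estimate unchanged.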
Lattice QCD on Linux and NT Clusters (HTML)
- Kostas Orginos, University of Arizona
- Calculating hadronic properties in Quantum
Chromodynamics is a non-perturbative task, and Lattice QCD is
the only available non-perturbative technique capable of
such computations. Although algorithms and machines
have improved significantly in recent years,
simulating QCD remains both interesting and challenging:
interesting because Lattice QCD can now make
computations accurate enough to guide
experiments toward the discovery of new physics, and challenging
because of the massive computing power required.
Until now, such computing power was available only
through very expensive supercomputers, some of them
built specially for lattice QCD. Recently, the increasing
power of the microprocessors used in personal computers, the
falling prices of commodity hardware, and advances in
network technology have made it possible to build relatively
cheap clusters of workstations capable of large-scale
Lattice QCD simulations. The MILC collaboration has
experience in both building and using such machines. Two
Linux clusters have been built, one by Steven Gottlieb in
Indiana and one by Carleton DeTar in Utah. We have also been
using the NT cluster built at NCSA. The portable MPI-based
MILC code runs efficiently, without any fine tuning, and
shows good scalability on these machines. Furthermore, the
cost of $10-$25 per delivered Mflop makes them very
attractive compared to the $190-$450 per delivered Mflop
of commercial supercomputers.
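The cost figures quoted in the abstract imply a range for the price advantage, which can be worked out directly. The sketch below uses only the dollar ranges given above; the function names and the example budget are hypothetical.

```python
# Cost per delivered Mflop, as (low, high) dollar ranges from the abstract.
CLUSTER_DOLLARS_PER_MFLOP = (10.0, 25.0)
SUPERCOMPUTER_DOLLARS_PER_MFLOP = (190.0, 450.0)


def advantage_range(cheap, expensive):
    """Smallest and largest cost ratio implied by two (low, high) ranges."""
    smallest = expensive[0] / cheap[1]  # cheapest supercomputer vs. priciest cluster
    largest = expensive[1] / cheap[0]   # priciest supercomputer vs. cheapest cluster
    return smallest, largest


def delivered_mflops(budget_dollars, dollars_per_mflop):
    """Delivered Mflops that a budget buys at a given cost per Mflop."""
    return budget_dollars / dollars_per_mflop


if __name__ == "__main__":
    low, high = advantage_range(
        CLUSTER_DOLLARS_PER_MFLOP, SUPERCOMPUTER_DOLLARS_PER_MFLOP
    )
    print(f"clusters are roughly {low:.1f}x to {high:.0f}x cheaper per delivered Mflop")

    budget = 100_000.0  # hypothetical budget in dollars
    for label, costs in (
        ("cluster", CLUSTER_DOLLARS_PER_MFLOP),
        ("supercomputer", SUPERCOMPUTER_DOLLARS_PER_MFLOP),
    ):
        best = delivered_mflops(budget, costs[0])
        worst = delivered_mflops(budget, costs[1])
        print(f"{label}: {worst:.0f} to {best:.0f} delivered Mflops")
```

Taking the extremes of both ranges, the quoted figures correspond to a cost advantage of between about 7.6 and 45 times.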
Scientific Applications on Workstation
Clusters vs. Supercomputers (PowerPoint)
- Dave Turner, Ames Laboratory, Iowa State University
- The idea of building a 'supercomputer' by connecting many workstations
or PCs using a fast network is clearly attractive for several reasons.
Such a cluster can cost an order of magnitude less
than a traditional multiprocessor machine while providing the same computational
power. Workstation clusters can also be grown over time by simply adding
more machines. However, it is often difficult to use the computational
power of any multiprocessor system efficiently because of the limited interprocessor communication
rate. This is even more difficult on cluster computers, where the bandwidth
is lower and the latency higher than on traditional MPP systems. This
talk will compare the performance of many applications having different
computational and communication characteristics on a wide variety of MPP
systems and cluster computers. These include a Cray T3E, an Intel Paragon,
SGI SMP systems, and clusters of PCs connected by Fast Ethernet, an Alpha
cluster connected by Gigabit Ethernet, and a cluster of dual-processor
IBM Power3 systems connected by Gigabit Ethernet. The applications used
for this analysis span a broad range in how demanding they are
on the communication system. They include classical and tight-binding molecular
dynamics codes, an ab initio plane wave program, and a finite difference
electromagnetic wave propagation code. This talk will conclude with a discussion
of the work that is being done to overcome some of the current limitations
of cluster computers.
Also available from Dave Turner is a short PowerPoint presentation on
Gigabit Ethernet.
The Avalon Beowulf Cluster: A Dependable
Tool for Scientific Simulation
- Michael Warren, Los Alamos National Laboratory
- Avalon is a 140 processor Alpha/Linux Beowulf cluster constructed entirely
from commodity personal computer technology and freely available software.
Computational physics simulations performed on Avalon resulted in the award
of a 1998 Gordon Bell price/performance prize for significant achievement
in parallel processing. Avalon ranked as the 113th fastest computer in
the world on the November 1998 TOP500 list, obtaining a result of 48.6
Gigaflops on the parallel Linpack benchmark.
- The price of hardware and final assembly labor for Avalon totalled
$313,000 in the fall of 1998. Avalon currently provides over 15,000
node-hours of production computing time per week, split among about 10
production users. Obtaining an equivalent amount of computing through Los
Alamos institutional sources would cost a minimum of $30,000 per week.
The machine also supports code development for another 60 users. Significant
simulations have been performed on Avalon in astrophysics, molecular
dynamics, and nonlinear dynamics, as well as other areas. The largest single
simulation performed on Avalon computed a total of over 10^16
floating point operations.
- We will describe some of the applications which have obtained good
performance on Avalon, and their characteristics. Our goal has been to
provide dependable cycles for computational physics, and not to perform
research into clustered computing systems. One of the main lessons learned
from the Avalon project is that the details of the hardware are not nearly
as important as the attitudes and expectations of the users and managers
of the hardware.
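The figures quoted in the Avalon abstract allow a few back-of-the-envelope ratios. The sketch below uses only numbers stated above. It treats "over 15,000" node-hours and "a minimum of $30,000" as exact values and ignores power, space and administration costs, so the results are rough estimates only. The variable names are hypothetical.

```python
# Figures taken from the abstract.
HARDWARE_COST_DOLLARS = 313_000           # hardware plus final assembly labor
INSTITUTIONAL_DOLLARS_PER_WEEK = 30_000   # cost of equivalent institutional computing
NODE_HOURS_PER_WEEK = 15_000              # production node-hours delivered
LINPACK_GFLOPS = 48.6                     # parallel Linpack result
PROCESSORS = 140

# Weeks of production use after which the purchase price equals the
# cost of buying the same computing time institutionally.
payback_weeks = HARDWARE_COST_DOLLARS / INSTITUTIONAL_DOLLARS_PER_WEEK

# Implied institutional price of one node-hour.
institutional_dollars_per_node_hour = (
    INSTITUTIONAL_DOLLARS_PER_WEEK / NODE_HOURS_PER_WEEK
)

# Linpack rate per processor, and hardware cost per Linpack Mflop.
linpack_mflops_per_processor = LINPACK_GFLOPS * 1000.0 / PROCESSORS
dollars_per_linpack_mflop = HARDWARE_COST_DOLLARS / (LINPACK_GFLOPS * 1000.0)

if __name__ == "__main__":
    print(f"payback period: {payback_weeks:.1f} weeks")
    print(f"institutional cost: ${institutional_dollars_per_node_hour:.2f} per node-hour")
    print(f"Linpack rate: {linpack_mflops_per_processor:.0f} Mflops per processor")
    print(f"hardware cost: ${dollars_per_linpack_mflop:.2f} per Linpack Mflop")
```

Note that a Linpack rate is not the same as delivered application performance, so the cost per Linpack Mflop computed here is not directly comparable to cost figures quoted per delivered Mflop.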
If you have comments or suggestions, email Steven Gottlieb
at sg@indiana.edu