June 23, 2003--Apple announces G5
June 26, 2003--VA Tech contacts Apple. Deal sealed with Apple in a few days. Apple
is stunned :-). Thought that Dr. Varadarajan must be a Mac fanatic. Turns out
he had never touched a Mac before, but is certainly proficient in using Mach and Linux/Unix.
VA Tech actually ordered the machines via the Apple Store mechanism!
September 5-11, 2003--G5's arrive
September 23, 2003--facility begins preliminary operations
October 1 through mid-November--performance optimizations
Experimental runs by users can begin today.
Full production use by the start of 2004.
2 million for upgrading the facilities in which the system is housed.
5.2 million for hardware (besides computers, includes cards, cables, storage, etc.)
Current facility scheduled to be followed by a 2nd system in a new building in 2006.
General design criteria:
--Major factor is Performance/Price
--64 bit design (32 bit systems need not apply)
--Benchmarks will depend on double-precision floating point--AltiVec not being used
in this case.
--Connectivity to Internet 1, Internet 2 (Abilene) and soon into NLR (National Lambda
--High bandwidth with ultralow latency communication
--Infiniband switched network, 20 Gb/sec/port full duplex, latency of less than 10
microseconds on top of MPI.
Some vendors proposed a variety of "turnkey" systems. This drove up the
cost of some bids into the range of 9-12 million dollars which was well beyond the
Dell with Itanium 2--lost on processor and system cost and on overall performance
IBM with Opteron--lost on performance and overall system cost
IBM with PowerPC 970--won on performance, but lost in delivery time (January 2004)
and overall system cost
Sun with SPARC--lost on performance and cost
Apple with PowerPC 970--won on performance and overall system cost.
Opteron apparently does not support the "fused multiply-add" (I may have
the spelling wrong) function which gives G5 an edge in floating point performance.
As such, G5 can outpace an Opteron by a factor of 2 in floating point.
Itanium 2 apparently gives a GEMM efficiency (see Results section below) as much
as 15 percent better than G5 right now. However it is very expensive, and also loses
whatever GEMM efficiency advantage it has due to other things like its slower clock
speed. That is definitely an ironic twist. :-)
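
The processor comparison above comes down to simple arithmetic: peak rate is clock speed times floating-point operations per cycle, and sustained GEMM rate is peak times efficiency. Here is a rough sketch of that arithmetic. The clock speeds and per-cycle figures are my own assumptions, not numbers from the talk: 2.0 GHz with two fused multiply-add units for the G5, 2.0 GHz with one add plus one multiply per cycle for Opteron, and 1.5 GHz with two fused multiply-add units for Itanium 2. The 84.1 percent G5 efficiency is the figure given in the results notes, and I am reading "15 percent better" as 15 percentage points.

```python
FMA_FLOPS = 2  # one fused multiply-add counts as two floating-point operations


def peak_gflops(clock_ghz, flops_per_cycle):
    """Theoretical peak double-precision rate of one processor, in GFLOPS."""
    return clock_ghz * flops_per_cycle


def sustained_gflops(clock_ghz, flops_per_cycle, efficiency):
    """Sustained rate: peak scaled by the measured GEMM efficiency (0..1)."""
    return peak_gflops(clock_ghz, flops_per_cycle) * efficiency


# Assumed hardware figures (not from the talk).
g5_peak = peak_gflops(2.0, 2 * FMA_FLOPS)       # two FMA units -> 4 flops/cycle
opteron_peak = peak_gflops(2.0, 2)              # one add + one multiply per cycle
itanium_peak = peak_gflops(1.5, 2 * FMA_FLOPS)  # two FMA units, slower clock

# Efficiency: 84.1 percent for G5 (from the notes); Itanium 2 assumed
# 15 percentage points higher.
g5_gemm = sustained_gflops(2.0, 2 * FMA_FLOPS, 0.841)
itanium_gemm = sustained_gflops(1.5, 2 * FMA_FLOPS, 0.841 + 0.15)

print(f"G5 peak / Opteron peak: {g5_peak / opteron_peak:.1f}x")
print(f"G5 sustained GEMM:      {g5_gemm:.2f} GFLOPS")
print(f"Itanium 2 sustained:    {itanium_gemm:.2f} GFLOPS")
```

Under these assumptions the factor of 2 over Opteron comes entirely from the fused multiply-add units, and the Itanium 2's better efficiency is more than offset by its lower clock.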
--Each G5 machine has a stock install of OS X 10.2.7
--Mellanox Infiniband drivers
--MPI implemented using MVAPICH from D.K. Panda's group at Ohio State University.
Code ported from Linux with additions of message caching and dynamic memory management.
--Cache-optimized memory manager for scientific apps written for OS X as a KEXT (written
--Scalable job starting system for MVAPICH (written in-house).
--Deja Vu as a system for fault tolerance ported to the G5. Intended to be separate
from ordinary application logic. (written in-house)
For C and C++, IBM xlc and GCC 3.3
For Fortran, IBM xlf and NAGWare
--Mellanox driver version 1 started in July and finished in mid-August. Subsequent
tweaks have improved things by around 10 percent.
--Benchmarked using LinPack
--G5 solved a system of equations at N = 500K
--dense matrix operations
--main phase is LU decomposition. Gaussian elimination with partial row pivoting
--back solution follows at a lower order, O(n^2)
--Used BLAS libraries
--Core routines--matrix multiply (GEMM) optimized by Kazushige Goto in Japan. 84.1
percent efficiency at this time. Apple's vecLib framework also used.
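
The benchmark steps listed above (LU decomposition by Gaussian elimination with partial row pivoting, then a back solution) can be sketched in a few lines. This is plain Python for illustration only, not the benchmark code run on the cluster, which relies on optimized BLAS routines. The function names are mine, and the operation count is the one conventionally quoted for LINPACK.

```python
def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial row pivoting.

    a is a list of n rows (each a list of n floats), b a list of n floats.
    Both are copied, so the caller's data is left untouched.
    """
    n = len(a)
    a = [row[:] for row in a]
    b = b[:]

    # Elimination phase: O(n^3) work, the part dominated by matrix multiply.
    for k in range(n):
        # Partial pivoting: pick the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]

    # Back solution: O(n^2) work on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_flops(n):
    """Conventional LINPACK operation count: 2/3 n^3 + 2 n^2."""
    return 2.0 * n**3 / 3.0 + 2.0 * n**2


x = lu_solve([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]],
             [8.0, -11.0, -3.0])
print("solution:", x)
print(f"operations at N = 500K: {linpack_flops(500_000):.3e}")
```

At N = 500K the n^3 term dwarfs the n^2 term, which is why GEMM efficiency matters so much for the final number.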
AND THE CURRENT RESULTS AS OF 10/28/2003 ARE .................
So on the current list, this puts them at number 3.
Immediate future plans:
Upgrade G5's to Panther in the next couple of weeks. All codes compile fine under
Along with some other optimization tricks, anticipating at least another 10 percent
improvement in performance.
Expecting to make their MPI enhancements and in-house software open source. For
the Infiniband drivers, Dr. Varadarajan could not speak for them, but is hopeful
that those drivers will be made available as open source as well. But that is Mellanox's
http://www.computing.vt.edu/ (Virginia Tech Project: Terascale
http://don.cc.vt.edu/ (Pictures: Terascale Cluster)
http://www.computerweekly.com/ (Apple chosen for supercomputing
http://macslash.org/ (TenCon Keynote - Dr. Srinidhi
http://www.wired.com/ (Mac Supercomputer Just Got Faster)