Gordon Bell Performance Evaluation
           
Event Type: Paper
Start Time: 3:30PM
End Time: 4:00PM
Rm #: 36-37
Chair: David Bailey (LBNL)
Title: Performance evaluation and tuning of GRAPE-6 --- towards 40 "real" Tflops
Speakers/Presenter: Junichiro Makino (Department of Astronomy, School of Science, University of Tokyo), Eiichiro Kokubo (National Astronomical Observatory of Japan), Toshiyuki Fukushige (Department of General System Studies, College of Arts and Sciences, University of Tokyo), Hiroshi Daisaka (Department of Astronomy, School of Science, University of Tokyo)

Event Type: Paper
Start Time: 4:00PM
End Time: 4:30PM
Rm #: 36-37
Chair: David Bailey (LBNL)
Title: A 14.6 billion degrees of freedom, 5 teraflops, 2.5 terabyte earthquake simulation on the Earth Simulator
Speakers/Presenter: Dimitri Komatitsch (California Institute of Technology), Seiji Tsuboi (Institute for Frontier Research on Earth Evolution, JAMSTEC), Chen Ji (California Institute of Technology), Jeroen Tromp (California Institute of Technology)

Event Type: Paper
Start Time: 4:30PM
End Time: 5:00PM
Rm #: 36-37
Chair: David Bailey (LBNL)
Title: The Space Simulator: Modeling the Universe from Supernovae to Cosmology
Speakers/Presenter: Michael S. Warren (LANL), Chris L. Fryer (LANL), M. Patrick Goda (LANL)
             

 

     
  Session: Gordon Bell Performance Evaluation
  Title: Performance evaluation and tuning of GRAPE-6 --- towards 40 "real" Tflops
  Chair: David Bailey (LBNL)
  Time: Wednesday, November 19, 3:30PM - 4:00PM
  Rm #: 36-37
  Speaker(s)/Author(s):  
  Junichiro Makino (Department of Astronomy, School of Science, University of Tokyo), Eiichiro Kokubo (National Astronomical Observatory of Japan), Toshiyuki Fukushige (Department of General System Studies, College of Arts and Sciences, University of Tokyo), Hiroshi Daisaka (Department of Astronomy, School of Science, University of Tokyo)
   
  Description:
  In this paper, we describe the performance characteristics of GRAPE-6, the sixth-generation special-purpose computer for gravitational many-body problems. GRAPE-6 consists of 2048 custom pipeline chips, each of which integrates six pipeline processors specialized for calculating the gravitational interaction between particles. The GRAPE hardware evaluates the interaction. The frontend processors perform all other operations, such as the time integration of the particle orbits, I/O, and on-the-fly analysis. The theoretical peak speed of GRAPE-6 is 63.4 Tflops. We present the results of benchmark runs and discuss the performance characteristics. We also present the measured performance for a few real scientific applications. The best performance achieved so far with real applications is 35.3 Tflops.
   

 

     
  Session: Gordon Bell Performance Evaluation
  Title: A 14.6 billion degrees of freedom, 5 teraflops, 2.5 terabyte earthquake simulation on the Earth Simulator
  Chair: David Bailey (LBNL)
  Time: Wednesday, November 19, 4:00PM - 4:30PM
  Rm #: 36-37
  Speaker(s)/Author(s):  
  Dimitri Komatitsch (California Institute of Technology), Seiji Tsuboi (Institute for Frontier Research on Earth Evolution, JAMSTEC), Chen Ji (California Institute of Technology), Jeroen Tromp (California Institute of Technology)
   
  Description:
  We use 1944 processors of the Earth Simulator to model seismic wave propagation resulting from large earthquakes. Simulations are conducted based upon the spectral-element method, a high-degree finite-element technique with an exactly diagonal mass matrix. We use a very large mesh with 5.5 billion grid points (14.6 billion degrees of freedom). We include the full complexity of the Earth, i.e., a three-dimensional wave-speed and density structure, a 3-D crustal model, ellipticity as well as topography and bathymetry. A total of 2.5 terabytes of memory is needed. Our implementation is purely based upon MPI, with loop vectorization on each processor. We obtain an excellent vectorization ratio of 99.3%, and we reach a performance of 5 teraflops (30% of the peak performance) on 38% of the machine. The very high resolution of the mesh allows us to perform fully three-dimensional calculations at seismic periods as low as 5 seconds.
   

 

     
  Session: Gordon Bell Performance Evaluation
  Title: The Space Simulator: Modeling the Universe from Supernovae to Cosmology
  Chair: David Bailey (LBNL)
  Time: Wednesday, November 19, 4:30PM - 5:00PM
  Rm #: 36-37
  Speaker(s)/Author(s):  
  Michael S. Warren (LANL), Chris L. Fryer (LANL), M. Patrick Goda (LANL)
   
  Description:
  The Space Simulator is a 294-processor Beowulf cluster with a theoretical peak performance just below 1.5 Teraflop/s. It is based on the Shuttle XPC SS51G mini chassis. Each node consists of a 2.53 GHz Pentium 4 processor, 1 Gbyte of 333 MHz DDR SDRAM, an 80 Gbyte Maxtor hard drive, and a 3Com 3C996B-T Gigabit Ethernet card. The network is made up of Foundry FastIron 1500 and 800 Gigabit Ethernet switches. Each individual node cost less than $1000, and the entire system cost under $500,000. The cluster achieved Linpack performance of 665.1 Gflop/s on 288 processors in October 2002, making it the 85th fastest computer in the world according to the 20th TOP500 list. Performance has since improved to 757.1 Linpack Gflop/s, ranking at #90 on the 21st TOP500 list. This is the first machine in the TOP100 to surpass Linpack price/performance of 1 dollar per Mflop/s.
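
  Worked example (not part of the abstract): the peak and price/performance figures quoted for the Space Simulator can be checked with a few lines of arithmetic. The sketch below assumes 2 double-precision floating-point operations per clock cycle per processor, a figure the description does not state, and uses the $500,000 upper bound as the system cost. The function and variable names are illustrative only.

```python
# Figures quoted in the Space Simulator description.
PROCESSORS = 294
CLOCK_GHZ = 2.53
LINPACK_GFLOPS = 757.1        # result reported for the 21st TOP500 list
SYSTEM_COST_USD = 500_000     # upper bound: "under $500,000"

# Assumption, not stated in the description:
# 2 double-precision flops per cycle per processor.
FLOPS_PER_CYCLE = 2


def peak_gflops(processors, clock_ghz, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in Gflop/s: processors x clock (GHz) x flops per cycle."""
    return processors * clock_ghz * flops_per_cycle


def dollars_per_mflops(cost_usd, gflops):
    """Price/performance in dollars per Mflop/s (1 Gflop/s = 1000 Mflop/s)."""
    return cost_usd / (gflops * 1000.0)


peak = peak_gflops(PROCESSORS, CLOCK_GHZ)
price_perf = dollars_per_mflops(SYSTEM_COST_USD, LINPACK_GFLOPS)

print(f"Peak: {peak:.2f} Gflop/s")
print(f"Price/performance: at most {price_perf:.2f} dollars per Mflop/s")
```

  Under that assumption the peak comes out just under 1500 Gflop/s, consistent with "just below 1.5 Teraflop/s", and the cost bound gives a price/performance below 1 dollar per Mflop/s.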