Performance Programming
           
Event Type   Start Time   End Time   Rm #    Chair

Paper        10:30AM      11:00AM    38-39   Robert Lucas (USC/ISI)
  Title: SCALLOP: A Highly Scalable Parallel Poisson Solver in Three Dimensions
  Speakers/Presenter: Gregory T. Balls (University of California, San Diego/SDSC), Scott B. Baden (University of California, San Diego), Phillip Colella (Lawrence Berkeley National Laboratory)

Paper        11:00AM      11:30AM    38-39   Robert Lucas (USC/ISI)
  Title: Parallel Iterative Solvers of GeoFEM with Selective Blocking Preconditioning for Nonlinear Contact Problems on the Earth Simulator
  Speakers/Presenter: Kengo Nakajima (RIST)

Paper        11:30AM      12:00PM    38-39   Robert Lucas (USC/ISI)
  Title: Multi-Constraint Mesh Partitioning for Contact/Impact Computations
  Speakers/Presenter: George Karypis (Department of Computer Science & Engineering, University of Minnesota)

  Session: Performance Programming
  Title: SCALLOP: A Highly Scalable Parallel Poisson Solver in Three Dimensions
  Chair: Robert Lucas (USC/ISI)
  Time: Wednesday, November 19, 10:30AM - 11:00AM
  Rm #: 38-39
  Speaker(s)/Author(s):  
  Gregory T. Balls (University of California, San Diego/SDSC), Scott B. Baden (University of California, San Diego), Phillip Colella (Lawrence Berkeley National Laboratory)
   
  Description:
  SCALLOP is a highly scalable solver and library for elliptic partial differential equations on regular block-structured domains. SCALLOP avoids high communication overheads algorithmically by taking advantage of the locality properties inherent to solutions of elliptic PDEs. Communication costs are small, on the order of a few percent of the total running time on up to 1024 processors of NPACI's and NERSC's IBM Power-3 SP systems. SCALLOP trades off numerical overheads against communication. These numerical overheads are independent of the number of processors for a wide range of problem sizes. SCALLOP is implicitly designed for infinite domain (free space) boundary conditions, but the algorithm can be reformulated to accommodate other boundary conditions. The SCALLOP library is built on top of the KeLP programming system and runs on a variety of platforms.
   

 

     
  Session: Performance Programming
  Title: Parallel Iterative Solvers of GeoFEM with Selective Blocking Preconditioning for Nonlinear Contact Problems on the Earth Simulator
  Chair: Robert Lucas (USC/ISI)
  Time: Wednesday, November 19, 11:00AM - 11:30AM
  Rm #: 38-39
  Speaker(s)/Author(s):  
  Kengo Nakajima (RIST)
   
  Description:
  An efficient parallel iterative method with selective blocking preconditioning has been developed for symmetric multiprocessor (SMP) cluster architectures with vector processors, such as the Earth Simulator. This method is based on a three-level hybrid parallel programming model, which includes message passing for inter-SMP node communication, loop directives by OpenMP for intra-SMP node parallelization, and vectorization for each processing element (PE). This method provides robust and smooth convergence and excellent vector and parallel performance in 3D geophysical simulations with contact conditions performed on the Earth Simulator. The selective blocking preconditioning is much more efficient than ILU(1) and ILU(2). Performance for the complicated Southwest Japan model with more than 23 M DOF on 10 SMP nodes (80 PEs) of the Earth Simulator was 161.7 GFLOPS (25.3% of the peak performance) for the hybrid programming model and 190.4 GFLOPS (29.8% of the peak performance) for flat MPI.
   

 

     
  Session: Performance Programming
  Title: Multi-Constraint Mesh Partitioning for Contact/Impact Computations
  Chair: Robert Lucas (USC/ISI)
  Time: Wednesday, November 19, 11:30AM - 12:00PM
  Rm #: 38-39
  Speaker(s)/Author(s):  
  George Karypis (Department of Computer Science & Engineering, University of Minnesota)
   
  Description:
  We present a novel approach for decomposing contact/impact computations. Effective decomposition of these computations poses a number of challenges, as it needs both to balance the computations and to minimize the amount of communication that is performed during the finite element and contact search phases. Our approach achieves the first goal by partitioning the underlying mesh such that it simultaneously balances both the work that is performed during the finite element phase and that performed during the contact search phase, while producing subdomains whose boundaries consist of piecewise axes-parallel lines or planes. The second goal is achieved by using a decision tree to decompose the space into rectangular or box-shaped regions that contain contact points from a single partition. Our experimental evaluation on a sequence of 100 meshes shows that this new approach can reduce the overall communication overhead over existing algorithms.