HPC Infrastructure I
           
Event Type    Start Time    End Time    Rm #     Chair
Masterworks   1:30PM        2:15PM      16-18    Ray Paden (IBM)
  Title: Ten Years on the Grid - Production Design Using Large Scale Grid Computing at Pratt & Whitney
  Speakers/Presenter: Peter C. Bradley (Pratt & Whitney)
Masterworks   2:15PM        3:00PM      16-18    Ray Paden (IBM)
  Title: MPI-IO for Portable High Performance Parallel I/O
  Speakers/Presenter: Richard Treumann (Unix Development Laboratory, IBM Server Group)

     
  Session: HPC Infrastructure I
  Title: Ten Years on the Grid - Production Design Using Large Scale Grid Computing at Pratt & Whitney
  Chair: Ray Paden (IBM)
  Time: Tuesday, November 18, 1:30PM - 2:15PM
  Rm #: 16-18
  Speaker(s)/Author(s):  
  Peter C. Bradley (Pratt & Whitney)
   
  Description:
  In the late 1980s, Computational Fluid Dynamics showed great potential as a tool for the design of Pratt & Whitney's jet engine and aerospace products. However, the enormous compute requirements of integrating CFD into the design process rapidly overwhelmed Pratt's supercomputer. A small group of jet engine designers hatched a plan to harness the unused power of desktop workstations to run supercomputer-class CFD simulations. Software was developed to manage available resources and to enable groups of desktop computers to work together reliably to run simulations. In 1992, Pratt & Whitney pulled the plug on its last supercomputer, entrusting the future of its design process to a paradigm that Pratt called Prowess and that we now know as Grid computing.

Every day, dozens of parallel Pratt & Whitney design systems share thousands of desktop and dedicated processors. The fundamental challenge - absolute delivery of results from parallel programs "running in the dark" on a dynamically changing pool of devices - has not changed. The technology required to scale to tens, then hundreds, then thousands of processors, however, has changed substantially. The evolution of Prowess continues to teach us lessons about the design and management of cost-effective parallel systems on a broadly dispersed and inherently unreliable grid. We find that successful production computing often requires a level of fault tolerance and robustness beyond what mainstream HPC tools provide. We also find that the keys to success are diverse and sometimes counterintuitive, ranging from corporate culture to intense technology.

Additional information about this work is available at http://www.pw.utc.com/.
  Link: --
   

 

     
  Session: HPC Infrastructure I
  Title: MPI-IO for Portable High Performance Parallel I/O
  Chair: Ray Paden (IBM)
  Time: Tuesday, November 18, 2:15PM - 3:00PM
  Rm #: 16-18
  Speaker(s)/Author(s):  
  Richard Treumann (Unix Development Laboratory, IBM Server Group)
   
  Description:
  Any massively parallel MPI application is likely to read and write some very large data sets. The application exists to solve a single problem, but to solve that problem with many MPI tasks, the problem must be broken into pieces and each task assigned a small piece. It is sometimes possible to parallelize the I/O by using one file per task, but such a file collection is useful only to another MPI job organized the same way. It is most natural to use an input or output file that represents the entire problem rather than one file per task. Using one file, however, requires moving data between this single huge file and the distributed tasks, each of which uses only a small portion of that data. The techniques available with standard file read/write either create performance bottlenecks or depend on complex, hand-crafted data-marshalling code in the application. Special parallel filesystems with nonstandard interfaces may be an option, but they make the code nonportable.
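
A minimal sketch of the one-file-per-task approach mentioned above, assuming a contiguous 1-D block decomposition; the file-name pattern, block size, and layout are illustrative choices, not taken from the talk. Note how the resulting file set only makes sense to a job with the same task count and decomposition.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int nlocal = 1024;                      /* elements owned by this task (illustrative) */
    double *block = malloc(nlocal * sizeof(double));
    for (int i = 0; i < nlocal; i++)
        block[i] = rank * nlocal + i;             /* fill with recognizable values */

    /* One private file per task: the I/O itself is parallel, but the
     * on-disk layout is now tied to this job's task count and decomposition. */
    char name[64];
    snprintf(name, sizeof(name), "out.%04d.dat", rank);
    FILE *fp = fopen(name, "wb");
    if (fp != NULL) {
        fwrite(block, sizeof(double), nlocal, fp);
        fclose(fp);
    }

    free(block);
    MPI_Finalize();
    return 0;
}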

MPI-IO's greatest strength is its ability to express the collective view of a file I/O operation in which each task reads or writes parts of a single file. When the job-wide view of an I/O operation is given to MPI through a collective call, the MPI implementation can perform global analysis and carry out the operation by marshalling data and, perhaps, by exploiting nonstandard filesystem interfaces. Any application using MPI has already required careful design to provide suitable data structures, decompose the work, and organize the data. With MPI-IO, the file I/O can be specified in the same terms already used to describe the data organization and message passing. By extending MPI message-passing concepts to file I/O, MPI-IO leverages the design work already done in writing the application and moves the details of optimizing for specific hardware or filesystems into the MPI implementation.
  Link: --
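
A minimal sketch of the collective MPI-IO pattern described in the abstract, assuming the same contiguous 1-D block decomposition as above; the file name, problem size, and use of a subarray filetype are illustrative choices, not taken from the talk. Each task describes its piece of the shared file with a derived datatype, installs it as a file view, and then joins a single collective write, leaving data marshalling and filesystem-specific optimization to the MPI implementation.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int nlocal = 1024;                      /* elements owned by this task (illustrative) */
    double *block = malloc(nlocal * sizeof(double));
    for (int i = 0; i < nlocal; i++)
        block[i] = rank * nlocal + i;

    /* Describe this task's piece of the global array as a subarray of the
     * single shared file: global size, local size, and starting offset. */
    int gsize = nlocal * nprocs;
    int lsize = nlocal;
    int start = rank * nlocal;
    MPI_Datatype filetype;
    MPI_Type_create_subarray(1, &gsize, &lsize, &start,
                             MPI_ORDER_C, MPI_DOUBLE, &filetype);
    MPI_Type_commit(&filetype);

    /* One file for the whole problem; each task installs its own view of it. */
    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "problem.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);

    /* Collective write: the MPI library sees the job-wide request at once
     * and can marshal data or exploit filesystem-specific interfaces
     * on the application's behalf. */
    MPI_File_write_all(fh, block, nlocal, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Type_free(&filetype);
    free(block);
    MPI_Finalize();
    return 0;
}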