Tools and Services for Grids
           
Event Type Start Time End Time Rm # Chair  

 

Paper 3:30PM 4:00PM 40-41 Roland Wismüller (LRR-TUM, Technische Universität München)
 
Title:

Nondeterministic Queries in a Relational Grid Information Service
  Speakers/Presenter:
Peter Dinda (Northwestern University, Computer Science), Dong Lu (Northwestern University, Computer Science)

 

Paper 4:00PM 4:30PM 40-41 Roland Wismüller (LRR-TUM, Technische Universität München)
 
Title:

Optimizing Reduction Computations In a Distributed Environment
  Speakers/Presenter:
Tahsin Kurc (Ohio State University), Feng Lee (Ohio State University), Gagan Agrawal (Ohio State University), Umit Catalyurek (Ohio State University), Renato Ferreira (Ohio State University), Joel Saltz (Ohio State University)

 

Paper 4:30PM 5:00PM 40-41 Roland Wismüller (LRR-TUM, Technische Universität München)
 
Title:

Job Superscheduler Architecture and Performance in Computational Grid Environments
  Speakers/Presenter:
Hongzhang Shan (Lawrence Berkeley National Laboratory), Leonid Oliker (Lawrence Berkeley National Laboratory), Rupak Biswas (NASA Ames Research Center)
             

 

     
  Session: Tools and Services for Grids
  Title: Nondeterministic Queries in a Relational Grid Information Service
  Chair: Roland Wismüller (LRR-TUM, Technische Universität München)
  Time: Tuesday, November 18, 3:30PM - 4:00PM
  Rm #: 40-41
  Speaker(s)/Author(s):  
  Peter Dinda (Northwestern University, Computer Science), Dong Lu (Northwestern University, Computer Science)
   
  Description:
  A Grid Information Service (GIS) stores information about the resources of a distributed computing environment and answers questions about them. We are developing RGIS, a GIS based on the relational data model. RGIS users can write SQL queries that search for complex compositions of resources that meet collective requirements. Executing these queries can be very expensive, however. In response, we introduce the nondeterministic query, an extension to the SELECT statement that allows the user (and RGIS) to trade off the query's running time against the number of results. The results are a random sample of the deterministic results, which we argue is sufficient and appropriate. We describe RGIS, the nondeterministic query extension, and its implementation. Our evaluation shows that a meaningful tradeoff between query time and results returned is achievable, and that the tradeoff can be used to keep query time largely independent of query complexity.
   

 

     
  Session: Tools and Services for Grids
  Title: Optimizing Reduction Computations In a Distributed Environment
  Chair: Roland Wismüller (LRR-TUM, Technische Universität München)
  Time: Tuesday, November 18, 4:00PM - 4:30PM
  Rm #: 40-41
  Speaker(s)/Author(s):  
  Tahsin Kurc (Ohio State University), Feng Lee (Ohio State University), Gagan Agrawal (Ohio State University), Umit Catalyurek (Ohio State University), Renato Ferreira (Ohio State University), Joel Saltz (Ohio State University)
   
  Description:
  We investigate runtime strategies for data-intensive applications that involve generalized reductions on large, distributed datasets. Our set of strategies includes replicated filter state, partitioned filter state, and hybrid options between these two extremes. We evaluate these strategies using emulators of three real applications, different query and output sizes, and a number of configurations. We consider execution in a homogeneous cluster and in a distributed environment where only a subset of nodes host the data. Our results show that replicating the filter state scales well and outperforms the other schemes if sufficient memory is available and enough computation is involved to offset the cost of the global merge step. In other cases, the hybrid strategy is usually the best. Moreover, in almost all cases, the performance of the hybrid strategy is quite close to that of the best strategy. Thus, we believe that hybrid is an attractive approach when the relative performance of the different schemes cannot be predicted.

This paper has been nominated for the SC2003 Best Paper award.
   

 

     
  Session: Tools and Services for Grids
  Title: Job Superscheduler Architecture and Performance in Computational Grid Environments
  Chair: Roland Wismüller (LRR-TUM, Technische Universität München)
  Time: Tuesday, November 18, 4:30PM - 5:00PM
  Rm #: 40-41
  Speaker(s)/Author(s):  
  Hongzhang Shan (Lawrence Berkeley National Laboratory), Leonid Oliker (Lawrence Berkeley National Laboratory), Rupak Biswas (NASA Ames Research Center)
   
  Description:
  Computational grids hold great promise for utilizing geographically separated heterogeneous resources to solve large-scale complex scientific problems. However, a number of major technical hurdles, including distributed resource management and effective job scheduling, stand in the way of realizing these gains. In this paper, we propose a novel grid superscheduler architecture and three distributed job migration algorithms. We also model the critical interaction between the superscheduler and autonomous local schedulers. We conduct extensive performance comparisons with ideal, central, and local schemes in a simulation environment, using real workloads from leading computational centers. Additionally, we use synthetic workloads to perform a detailed sensitivity analysis of our superscheduler. Several key metrics demonstrate that substantial performance gains can be achieved via smart superscheduling in distributed computational grids.