Algorithms and Programming
           
Event Type Start Time End Time Rm # Chair  

 

Paper 3:30PM 4:00PM 38-39 David Bader (University of New Mexico)
 
Title:

A Million-Fold Speed Improvement in Genomic Repeats Detection
  Speakers/Presenter:
John W. Romein (Vrije Universiteit, Amsterdam), Jaap Heringa (Vrije Universiteit, Amsterdam), Henri E. Bal (Vrije Universiteit, Amsterdam)

 

Paper 4:00PM 4:30PM 38-39 David Bader (University of New Mexico)
 
Title:

GridSAT: A Chaff-based Distributed SAT Solver for the Grid
  Speakers/Presenter:
Wahid Chrabakh (UC Santa Barbara), Rich Wolski (UC Santa Barbara)

 

Paper 4:30PM 5:00PM 38-39 David Bader (University of New Mexico)
 
Title:

HPC.NET - are CLI-based Virtual Machines Suitable for High Performance Computing?
  Speakers/Presenter:
Werner Vogels (Cornell University)
             

 

     
  Session: Algorithms and Programming
  Title: A Million-Fold Speed Improvement in Genomic Repeats Detection
  Chair: David Bader (University of New Mexico)
  Time: Wednesday, November 19, 3:30PM - 4:00PM
  Rm #: 38-39
  Speaker(s)/Author(s):  
  John W. Romein (Vrije Universiteit, Amsterdam), Jaap Heringa (Vrije Universiteit, Amsterdam), Henri E. Bal (Vrije Universiteit, Amsterdam)
   
  Description:
  This paper presents a novel, parallel algorithm for generating top alignments. Top alignments are used for finding internal repeats in biological sequences like proteins and genes. Our algorithm replaces an older, sequential algorithm (Repro), which was prohibitively slow for sequence lengths higher than 2000. The new algorithm is an order of magnitude faster (O(n^3) rather than O(n^4)).

The paper presents a three-level parallel implementation of the algorithm: using SIMD multimedia extensions found on present-day processors (a novel technique that can be used to parallelize any application that performs many sequence alignments), using shared-memory parallelism, and using distributed-memory parallelism. It allows processing the longest known proteins (nearly 35000 amino acids). We show exceptionally high speed improvements: between 548 and 889 on a cluster of 64 dual-processor machines, compared to the new sequential algorithm. Especially for long sequences, extreme speed improvements over the old algorithm are obtained.
  Link: Download PDF
   

 

     
  Session: Algorithms and Programming
  Title: GridSAT: A Chaff-based Distributed SAT Solver for the Grid
  Chair: David Bader (University of New Mexico)
  Time: Wednesday, November 19, 4:00PM - 4:30PM
  Rm #: 38-39
  Speaker(s)/Author(s):  
  Wahid Chrabakh (UC Santa Barbara), Rich Wolski (UC Santa Barbara)
   
  Description:
  We present GridSAT, a parallel and complete satisfiability solver designed to solve non-trivial SAT problem instances using a large number of widely distributed and heterogeneous resources. The GridSAT parallel algorithm uses intelligent backtracking, distributed and carefully scheduled sharing of learned clauses, and clause reduction. Our implementation focuses on dynamic resource acquisition and release to optimize application execution. We show how the large number of computational resources that are available from a Grid can be managed effectively for the application by an automatic scheduler and effective implementation. GridSAT execution speed is compared against the best sequential solver as rated by the SAT2002 competition using a wide variety of problem instances. The results show that GridSAT delivers speed-up for all but one of the test problem instances that are of significant size. In addition, we describe how GridSAT has solved previously unsolved satisfiability problems and the domain science contribution these results make.
  Link: Download PDF
   

 

     
  Session: Algorithms and Programming
  Title: HPC.NET - are CLI-based Virtual Machines Suitable for High Performance Computing?
  Chair: David Bader (University of New Mexico)
  Time: Wednesday, November 19, 4:30PM - 5:00PM
  Rm #: 38-39
  Speaker(s)/Author(s):  
  Werner Vogels (Cornell University)
   
  Description:
  The Common Language Infrastructure is a new, standardized virtual machine that is likely to become popular on several platforms. In this paper we review whether this technology has any future in the high-performance computing community, for example by targeting the same application space as the Java-Grande Forum. We review the technology by benchmarking three implementations of the CLI and compare those with the results on Java virtual machines.
  Link: Download PDF