A Moldable Online Scheduling Algorithm and Its Application to Parallel Short Sequence Mapping

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 2010)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6253)

Abstract

A crucial step in DNA sequence analysis is mapping short sequences generated by next-generation instruments to a reference genome. In this paper, we focus on efficient online scheduling of multi-user parallel short sequence mapping queries on a multiprocessor system. With the availability of parallel execution models, the problem at hand becomes a moldable task scheduling problem, in which the number of processors used to execute a task is determined by the scheduler. We propose an online scheduling algorithm to minimize the stretch of the tasks in the system. This metric is fairer to small tasks than the flow time metric and suits the nature of the problem well. Experimental evaluation on two workload scenarios indicates that the algorithm yields significantly smaller stretch than a recent algorithm and is fairer to small tasks.
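To illustrate why stretch can favor small tasks where flow time does not, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes the common definition of stretch as flow time divided by the time the task would take if it ran alone; the paper's exact definition for moldable tasks may differ. The `Task` class, its field names, and the two toy schedules are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    release: float     # time the task enters the system
    work: float        # assumed running time if the task ran alone
    completion: float  # time the task finishes in a given schedule


def flow_time(task: Task) -> float:
    """Time the task spends in the system."""
    return task.completion - task.release


def stretch(task: Task) -> float:
    """Flow time normalized by the task's own running time (assumed definition)."""
    return flow_time(task) / task.work


def max_flow(tasks: list[Task]) -> float:
    return max(flow_time(t) for t in tasks)


def max_stretch(tasks: list[Task]) -> float:
    return max(stretch(t) for t in tasks)


# Two tasks released at time 0 on a single shared resource (toy setting).
# Schedule A runs the large task first; schedule B runs the small task first.
large_first = [
    Task("large", release=0.0, work=10.0, completion=10.0),
    Task("small", release=0.0, work=1.0, completion=11.0),
]
small_first = [
    Task("small", release=0.0, work=1.0, completion=1.0),
    Task("large", release=0.0, work=10.0, completion=11.0),
]

for label, schedule in (("large first", large_first), ("small first", small_first)):
    print(label, "max flow:", max_flow(schedule), "max stretch:", max_stretch(schedule))
```

In this toy case the maximum flow time is the same for both schedules, so it cannot tell them apart, whereas the maximum stretch is far larger when the small task waits behind the large one.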

This work was supported in part by the U.S. DOE SciDAC Institute Grant DE-FC02-06ER2775; by the U.S. National Science Foundation under Grants CNS-0643969, OCI-0904809, OCI-0904802 and CNS-0403342; and by an allocation of computing time from the Ohio Supercomputer Center.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Saule, E., Bozdağ, D., Catalyurek, U.V. (2010). A Moldable Online Scheduling Algorithm and Its Application to Parallel Short Sequence Mapping. In: Frachtenberg, E., Schwiegelshohn, U. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 2010. Lecture Notes in Computer Science, vol 6253. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16505-4_6

  • DOI: https://doi.org/10.1007/978-3-642-16505-4_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16504-7

  • Online ISBN: 978-3-642-16505-4

  • eBook Packages: Computer Science, Computer Science (R0)
