DOI: 10.1145/2335755.2335842
research-article

Trinity RNA-Seq assembler performance optimization

Published: 16 July 2012

ABSTRACT

RNA sequencing (RNA-Seq) is a technique for studying RNA expression in biological material, and it is quickly gaining popularity in the field of transcriptomics. Trinity is a software tool developed for efficient de novo reconstruction of transcriptomes from RNA-Seq data. In this paper we first conduct a performance study of Trinity and compare it to previously published data from 2011. The 2011 version is much slower than many other de novo assemblers, and biologists have thus been forced to choose between quality and speed. We examine the runtime behavior of Trinity as a whole as well as its individual components, and then optimize the most performance-critical parts. We find that standard best practices for HPC applications can also be applied to Trinity, especially on systems with large amounts of memory. Combining these best practices with our specific performance optimizations decreases the runtime of Trinity by a factor of 3.9. This brings the runtime of Trinity in line with other de novo assemblers while maintaining superior quality. The purpose of this paper is to describe a series of improvements to Trinity, quantify the execution improvements achieved, and document the new version of the software.

        Published in

          XSEDE '12: Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond
          July 2012
          423 pages
          ISBN: 9781450316026
          DOI: 10.1145/2335755

          Copyright © 2012 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 129 of 190 submissions, 68%