DOI: 10.1145/3383783.3383806
research-article

QCKer: An x86-AVX/AVX2 Implementation of Q-gram Counting Filter for DNA Sequence Alignment

Published: 17 April 2020

ABSTRACT

This paper presents an implementation of the q-gram counting filter using x86 AVX/AVX2 SIMD instructions. The work yields three findings. First, to eliminate inconsistency between the theoretical and experimental results, synthetic reads are generated using only the DNA character "T", since synthetic reads generated otherwise create a random condition in which the number of seed instances varies and thus cannot be predicted. Second, several parameters (prefetching, multithreading, and the AVX instruction sets) are switched on and off to determine their effect on speed. The results show a 2% speedup with prefetching, a 2.7% speedup with the AVX instruction sets, a 100.41% speedup with multithreading, and a 112.25% speedup when all parameters are used. Multithreading therefore has the biggest effect among these parameters. Third, the x86-AVX implementation is compared with RazerS 3, an existing read mapper that uses a q-gram counting filter. Considering the filter stage only, the x86-AVX implementation is 12x faster than RazerS 3 for a small seed size of 4. However, RazerS 3 outperforms the x86-AVX implementation for a longer seed (i.e., a seed size of 12), which is attributed to RazerS 3 being optimized for q-grams of length 12 or higher. From these findings, real datasets are recommended over synthetic datasets, and a multithreaded implementation is recommended. Future work could compare the multithreaded implementation with an FPGA implementation.
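
For readers unfamiliar with the technique, the following is a minimal scalar sketch of a q-gram counting filter in Python. It is not the paper's AVX/AVX2 code. It assumes the standard q-gram lemma (a read of length m matching a window with at most e errors shares at least (m - q + 1) - q*e q-grams with it), and the function names `qgrams`, `qgram_threshold`, and `passes_filter` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def qgrams(seq: str, q: int) -> Counter:
    """Count every overlapping substring (seed) of length q in seq."""
    return Counter(seq[i:i + q] for i in range(len(seq) - q + 1))


def qgram_threshold(read_len: int, q: int, max_errors: int) -> int:
    """Lower bound on shared q-grams from the q-gram lemma.

    A value <= 0 means the filter cannot reject anything for these settings.
    """
    return (read_len - q + 1) - q * max_errors


def passes_filter(read: str, window: str, q: int, max_errors: int) -> bool:
    """Return True if the window shares enough q-grams with the read
    to be worth passing on to a full alignment (verification) step."""
    threshold = qgram_threshold(len(read), q, max_errors)
    read_counts = qgrams(read, q)
    window_counts = qgrams(window, q)
    # Each q-gram contributes the smaller of its two occurrence counts.
    shared = sum(min(n, window_counts[g]) for g, n in read_counts.items())
    return shared >= threshold


if __name__ == "__main__":
    read = "ACGTACGTAC"
    print(passes_filter(read, "ACGTTCGTAC", q=4, max_errors=1))  # one substitution
    print(passes_filter(read, "TTTTTTTTTT", q=4, max_errors=1))  # unrelated window
    print(qgrams("T" * 10, 4))  # all-"T" read: a single seed
```

An all-"T" read of length m contains exactly m - q + 1 instances of one seed, which illustrates why such reads make the number of seed instances predictable, as the abstract notes.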

    • Published in

      ICBRA '19: Proceedings of the 6th International Conference on Bioinformatics Research and Applications
      December 2019
      169 pages
      ISBN: 9781450372183
      DOI: 10.1145/3383783

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited