DOI: 10.1145/2933349.2933353

SSD in-storage computing for list intersection

Published: 26 June 2016

Abstract

Recently, there has been renewed interest in in-storage computing in the context of solid state drives (SSDs), in the form of "Smart SSDs." Smart SSDs allow application-specific code to execute inside the SSD, which lets applications take advantage of the high internal bandwidth that these devices provide. This work studies the offloading of list intersection into Smart SSDs, because intersection is prominent in both search engines and analytics queries. Intersection is also interesting because its algorithms are more complex than plain scans: they are affected by multiple parameters, as we show, and they provide lessons that can be applied to other operations as well.
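As a rough illustration of why intersection is more involved than a plain scan, the following sketch shows two textbook ways to intersect sorted lists of document IDs. It is not the implementation evaluated in the paper; the function names `intersect_merge` and `intersect_search` are hypothetical, and the inputs are assumed to be sorted and duplicate-free.

```python
from bisect import bisect_left


def intersect_merge(a, b):
    """Linear merge of two sorted, duplicate-free lists: O(len(a) + len(b))."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def intersect_search(short, long):
    """Probe each element of the shorter list into the longer one by
    binary search: O(len(short) * log(len(long)))."""
    out = []
    lo = 0  # everything before lo is smaller than the current probe
    for x in short:
        lo = bisect_left(long, x, lo)
        if lo == len(long):
            break  # no remaining element of `long` can match
        if long[lo] == x:
            out.append(x)
            lo += 1
    return out


if __name__ == "__main__":
    print(intersect_merge([1, 3, 5, 7, 9, 11], [3, 4, 5, 6, 11, 20]))
    print(intersect_search([5, 11], [1, 3, 5, 7, 9, 11]))
```

Which variant does less work depends on parameters such as the ratio of the two list lengths: the merge touches every element of both lists, while the search-based variant touches the long list only around each probe.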
We ask whether Smart SSDs can accelerate list intersection and reduce the energy consumed. Intuitively, the answer is yes; however, the performance tradeoffs on real devices are complex. We implement list intersection on a real Samsung Smart SSD research prototype. We also provide an analytical model to identify the key factors in overall performance and the conditions under which list intersection benefits from Smart SSDs. Finally, we conduct experiments on the Samsung Smart SSD. Based on both the analytical and the experimental results, we offer a number of suggestions to SSD vendors on how to build powerful Smart SSDs and to application developers on how to make full use of the functionality that Smart SSDs provide.
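The tradeoff described here can be made concrete with a deliberately simplified cost sketch. This is not the paper's analytical model: it assumes that I/O and computation do not overlap, and every name and parameter (`host_time`, `smart_ssd_time`, the bandwidth and processing-rate arguments) is a hypothetical placeholder rather than a measured value.

```python
def host_time(list_bytes, host_bw, host_rate):
    """Time to intersect on the host: ship the lists over the host
    interface, then process them on the host CPU.

    list_bytes: total size of the input lists (bytes)
    host_bw:    host interface bandwidth (bytes/s)       -- assumed
    host_rate:  host CPU processing rate (bytes/s)       -- assumed
    """
    return list_bytes / host_bw + list_bytes / host_rate


def smart_ssd_time(list_bytes, result_bytes, internal_bw, host_bw, device_rate):
    """Time to intersect inside the SSD: read the lists at the internal
    bandwidth, process them on the (typically slower) device CPU, and
    ship only the result to the host.

    internal_bw: SSD-internal bandwidth (bytes/s)        -- assumed
    device_rate: device CPU processing rate (bytes/s)    -- assumed
    """
    return (list_bytes / internal_bw
            + list_bytes / device_rate
            + result_bytes / host_bw)


if __name__ == "__main__":
    # Placeholder numbers chosen only to exercise the functions.
    print(host_time(1000, 500, 1000))
    print(smart_ssd_time(1000, 250, 1000, 500, 250))
```

Under these assumptions, offloading pays off only when the savings from the higher internal bandwidth and the smaller result transfer outweigh the slower in-device processing, which is one way to see why the answer is not automatically yes.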


Published In

DaMoN '16: Proceedings of the 12th International Workshop on Data Management on New Hardware
June 2016
89 pages
ISBN: 9781450343190
DOI: 10.1145/2933349

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. in-storage computing
  2. list intersection
  3. smart SSD

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'16: International Conference on Management of Data
June 26 - July 1, 2016
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 94 of 127 submissions, 74%
