Building a large-scale object-based active storage platform for data analytics in the internet of things

The Journal of Supercomputing

Abstract

Due to steady improvements in memory and processor technology, object storage devices (OSDs) now have more memory and greater processing power, which allows them to execute user-defined programs. Shifting part of an application’s processing to the disk drives reduces the amount of data transferred across the network and exploits the parallelism of large-scale distributed storage systems, reducing the execution time of many basic data analytics tasks. In this paper, we propose a large-scale object-based active storage platform, named Gem, for data analytics in the Internet of Things (IoT). All IoT data residing on the disk drives are stored as objects with attributes, methods and policies. For applications such as data analytics, application-specific operations are executed by the drive processors, so that only the results are returned to clients, rather than the clients reading the data files themselves. Gem is therefore able to greatly reduce the overhead of data analytics applications in the IoT. Experimental results from our performance evaluation demonstrate the effectiveness and scalability of Gem.
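
The idea of returning results rather than data can be illustrated with a minimal sketch. It is not Gem's actual interface: the class names `StorageObject` and `ActiveOSD`, the `execute` call, and the method whitelist used as a stand-in for a policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StorageObject:
    """Hypothetical stored object: data plus attributes, methods and a policy."""
    data: List[float]
    attributes: Dict[str, str] = field(default_factory=dict)
    methods: Dict[str, Callable[[List[float]], float]] = field(default_factory=dict)
    # Toy "policy": names of the methods that may run on the device.
    allowed_methods: frozenset = frozenset()


class ActiveOSD:
    """Toy stand-in for an object storage device that runs user-defined methods."""

    def __init__(self) -> None:
        self._objects: Dict[str, StorageObject] = {}

    def put(self, object_id: str, obj: StorageObject) -> None:
        self._objects[object_id] = obj

    def read(self, object_id: str) -> List[float]:
        # Conventional path: the whole object is shipped to the client.
        return list(self._objects[object_id].data)

    def execute(self, object_id: str, method: str) -> float:
        # Active path: the method runs next to the data, only the result leaves.
        obj = self._objects[object_id]
        if method not in obj.allowed_methods:
            raise PermissionError(f"policy forbids {method!r} on {object_id!r}")
        return obj.methods[method](obj.data)


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


osd = ActiveOSD()
osd.put(
    "sensor-42",  # hypothetical object identifier
    StorageObject(
        data=[20.0, 21.5, 19.5, 23.0],
        attributes={"unit": "celsius"},
        methods={"mean": mean},
        allowed_methods=frozenset({"mean"}),
    ),
)

transferred = osd.read("sensor-42")              # four values cross the network
client_side = mean(transferred)                  # client computes the mean itself
device_side = osd.execute("sensor-42", "mean")   # one value crosses the network
```

Both paths produce the same answer; the difference is how much data leaves the device, which is the saving the abstract attributes to active storage.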



Acknowledgments

This work is supported by A*STAR Thematic Strategic Research Programme (TSRP) Grant No. 1121720013.

Author information

Corresponding author

Correspondence to Quanqing Xu.

About this article

Cite this article

Xu, Q., Aung, K.M.M., Zhu, Y. et al. Building a large-scale object-based active storage platform for data analytics in the internet of things. J Supercomput 72, 2796–2814 (2016). https://doi.org/10.1007/s11227-016-1621-2

  • DOI: https://doi.org/10.1007/s11227-016-1621-2
