Scalability of Streaming Anomaly Detection in an Unbounded Key Space Using Migrating Threads

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12728)

Abstract

Applications in which streams of data are passed through large data structures are becoming increasingly important. For instance, network intrusion detection, and cyber security as a whole, rely on real-time analysis of network traffic. Unfortunately, when implemented on conventional architectures, such applications become highly inefficient, especially when attempts are made to scale up performance through parallelism. An earlier paper discussed an implementation of the Firehose streaming benchmark that assumed only a bounded number of keys and datums. This paper discusses a significantly more complex (and more realistic) variant that analyzes continuously streaming samples from an unbounded range of keys. We utilize a novel migrating thread architecture in which threads may migrate as needed through a single system-wide shared memory space, thereby avoiding conventional inefficiencies. As with the earlier paper, results are promising, with both far better scaling and increased performance over previously reported implementations, on a platform with considerably fewer intrinsic hardware computational resources.
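
To make the "unbounded key space" problem concrete, the following is a minimal sketch of per-key streaming anomaly detection with bounded state. It is not the paper's implementation and does not model migrating threads. The class name, the least-recently-used eviction policy, and the `capacity`, `window`, and `max_flagged` parameters are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class StreamingAnomalyDetector:
    """Per-key counting with bounded state (hypothetical, illustrative only)."""

    def __init__(self, capacity=4, window=3, max_flagged=0):
        self.capacity = capacity        # assumed: max keys held in memory at once
        self.window = window            # assumed: samples per key before judging
        self.max_flagged = max_flagged  # assumed: <= this many set bits is anomalous
        self.state = OrderedDict()      # key -> (seen, flagged), oldest first

    def observe(self, key, bit):
        """Consume one (key, bit) sample; return a verdict once the window fills."""
        seen, flagged = self.state.pop(key, (0, 0))
        seen += 1
        flagged += 1 if bit else 0
        if seen == self.window:
            # Window complete: report a verdict and forget the key.
            return "anomalous" if flagged <= self.max_flagged else "normal"
        self.state[key] = (seen, flagged)   # reinsert as most recently used
        if len(self.state) > self.capacity:
            self.state.popitem(last=False)  # evict the least recently used key
        return None
```

Because keys never stop arriving, the table must discard old entries. An evicted key that reappears restarts its count from zero, which is the accuracy cost of keeping memory bounded in this sketch.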

Notes

  1. https://firehose.sandia.gov.

  2. Lucata, formerly EMU Solutions Inc.

  3. https://crnch.gatech.edu/rogues-Lucata.

Author information

Corresponding author

Correspondence to Brian A. Page.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Page, B.A., Kogge, P.M. (2021). Scalability of Streaming Anomaly Detection in an Unbounded Key Space Using Migrating Threads. In: Chamberlain, B.L., Varbanescu, AL., Ltaief, H., Luszczek, P. (eds) High Performance Computing. ISC High Performance 2021. Lecture Notes in Computer Science, vol 12728. Springer, Cham. https://doi.org/10.1007/978-3-030-78713-4_9

  • DOI: https://doi.org/10.1007/978-3-030-78713-4_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-78712-7

  • Online ISBN: 978-3-030-78713-4

  • eBook Packages: Computer Science, Computer Science (R0)
