Visual Interactive Exploration and Labeling of Large Volumes of Industrial Time Series Data

  • Conference paper
Enterprise Information Systems (ICEIS 2022)

Abstract

In recent years, supervised machine learning models have become increasingly important for the advancing digitalization of the manufacturing industry. Reports from research and practice show their potential in application scenarios such as predictive quality or predictive maintenance, which promise flexibility and savings. However, such data-based learning methods require large training sets of accurately labeled sensor data that represent the manufacturing process in the digital world and allow models to learn the corresponding behavioral patterns. The creation of these data sets cannot be fully automated and requires the knowledge of process experts to interpret the sensor curves. Consequently, creating such a data set is time-consuming and expensive for companies. Existing solutions do not meet the needs of the manufacturing industry, as they cannot visualize large data sets, do not support all common forms of sensor data, and offer little support for efficient labeling of large data volumes. In this paper, we build on our previously presented visual interactive labeling tool Gideon-TS, which is designed for handling large data sets of industrial sensor data in multiple modalities (univariate, multivariate, segments or whole time series, with and without timestamps). Gideon-TS also features an approach for semi-automatic labeling that reduces the time needed to label large volumes of data. Based on the requirements of a new use case, we extend the capabilities of our tool by improving the aggregation functionality for visualizing large data queries and by adding support for small time units. We also improve our labeling support system with an active learning component to further accelerate the labeling process.
We evaluate the extended version of Gideon-TS on two exemplary industrial use cases by conducting performance tests and a user study, showing that our tool is suitable for labeling large volumes of industrial sensor data and significantly reduces labeling time compared to traditional labeling methods.

Author information

Corresponding author

Correspondence to Tristan Langer.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Langer, T., Welbers, V., Hahn, Y., Wönkhaus, M., Meyes, R., Meisen, T. (2023). Visual Interactive Exploration and Labeling of Large Volumes of Industrial Time Series Data. In: Filipe, J., Śmiałek, M., Brodsky, A., Hammoudi, S. (eds) Enterprise Information Systems. ICEIS 2022. Lecture Notes in Business Information Processing, vol 487. Springer, Cham. https://doi.org/10.1007/978-3-031-39386-0_5

  • DOI: https://doi.org/10.1007/978-3-031-39386-0_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-39385-3

  • Online ISBN: 978-3-031-39386-0

  • eBook Packages: Computer Science, Computer Science (R0)
