Optimized Word-Size Time Series Representation Method Using a Genetic Algorithm with a Flexible Encoding Scheme

  • Conference paper
AI*IA 2016 Advances in Artificial Intelligence (AI*IA 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10037)


Abstract

Performing time series mining tasks directly on raw data is inefficient; these data therefore require representation methods that transform them into low-dimensional spaces where they can be managed more efficiently. Owing to its simplicity, the piecewise aggregate approximation is a popular time series representation method. However, this method uses a uniform word-size for all the segments in the time series, which reduces the quality of the representation. Although some alternatives use representations with different word-sizes in a way that reflects the varying information content of different segments, such methods apply a complicated representation scheme, as they use a different representation for each time series in the dataset. In this paper we present two modifications of the original piecewise aggregate approximation. The novelty of these modifications is that they use different word-sizes, which allows for a flexible representation that reflects the level of activity in each segment, yet they address this problem at the dataset level, which simplifies establishing a lower bounding distance. The word-sizes are determined through an optimization process. The experiments we conducted on a variety of time series datasets validate the two new modifications.
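
As a rough illustration of the ideas in the abstract, the following Python sketch computes a piecewise aggregate approximation with either uniform or non-uniform segments, together with a distance on the reduced representation that lower-bounds the Euclidean distance. It is not the paper's algorithm: the function names (`paa`, `uniform_boundaries`, `lower_bound_dist`), the reading of "word-size" as segment length, and the hand-picked boundaries are all assumptions made for illustration, and the genetic-algorithm optimization mentioned in the title is not shown.

```python
import numpy as np

def uniform_boundaries(n, w):
    """End indices (exclusive) of w equal-length segments; assumes n % w == 0."""
    step = n // w
    return [step * (i + 1) for i in range(w)]

def paa(series, boundaries):
    """Represent a series by the mean of each segment.

    `boundaries` holds the exclusive end index of every segment, so the
    segments may have different lengths (hypothetical encoding).
    """
    series = np.asarray(series, dtype=float)
    starts = [0] + list(boundaries[:-1])
    return np.array([series[s:e].mean() for s, e in zip(starts, boundaries)])

def lower_bound_dist(rep_a, rep_b, boundaries):
    """Distance on the reduced space, weighting each segment by its length.

    Because the squared difference of two segment means, times the segment
    length, never exceeds the sum of squared pointwise differences in that
    segment, this value lower-bounds the Euclidean distance on the raw series.
    """
    lengths = np.diff([0] + list(boundaries))
    return float(np.sqrt(np.sum(lengths * (rep_a - rep_b) ** 2)))

# Two toy series of length 8.
x = np.array([1.0, 1.0, 3.0, 3.0, 5.0, 5.0, 7.0, 7.0])
y = np.array([0.0, 2.0, 2.0, 4.0, 4.0, 6.0, 6.0, 8.0])

# Classic PAA: four segments of length 2.
uniform = uniform_boundaries(len(x), 4)
# One shared set of non-uniform segments (lengths 1, 3, 4), used for
# every series in the dataset rather than chosen per series.
shared = [1, 4, 8]

for bounds in (uniform, shared):
    lb = lower_bound_dist(paa(x, bounds), paa(y, bounds), bounds)
    print(bounds, lb, np.linalg.norm(x - y))
```

Using the same boundaries for every series is what keeps the length-weighted distance directly comparable across the dataset; with per-series boundaries the two representations would not line up segment by segment.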



Author information

Corresponding author

Correspondence to Muhammad Marwan Muhammad Fuad.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Muhammad Fuad, M.M. (2016). Optimized Word-Size Time Series Representation Method Using a Genetic Algorithm with a Flexible Encoding Scheme. In: Adorni, G., Cagnoni, S., Gori, M., Maratea, M. (eds) AI*IA 2016 Advances in Artificial Intelligence. AI*IA 2016. Lecture Notes in Computer Science, vol 10037. Springer, Cham. https://doi.org/10.1007/978-3-319-49130-1_3

  • DOI: https://doi.org/10.1007/978-3-319-49130-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49129-5

  • Online ISBN: 978-3-319-49130-1

  • eBook Packages: Computer Science, Computer Science (R0)
