Variable-Chromosome-Length Genetic Algorithm for Time Series Discretization

  • Conference paper
Database and Expert Systems Applications (DEXA 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9828)

Abstract

The symbolic aggregate approximation method (SAX) is a widely known dimensionality reduction technique for time series data. SAX assumes that normalized time series have a highly Gaussian distribution. Based on this assumption, SAX uses statistical lookup tables to determine the locations of its breakpoints. In a previous work, we showed that this assumption oversimplifies the problem, which may result in high classification errors. We proposed an alternative approach, based on genetic algorithms, to determine the locations of the breakpoints, and showed that it boosts the performance of the original SAX. However, the method we presented shares a drawback with the original SAX: it determines only the locations of the breakpoints, not the corresponding alphabet size, which in the original SAX had to be input by the user. In our previous method, we therefore had to run the optimization process once for each value in the range of alphabet sizes. Moreover, performing the optimization process in two steps can cause overfitting. The novelty of the present work is twofold. First, we extend a version of genetic algorithms that uses chromosomes of different lengths. Second, we apply this variable-chromosome-length genetic algorithm to the problem at hand to determine the number of breakpoints and their locations simultaneously, so that the optimization process is run only once. This speeds up the training stage and avoids overfitting. The experiments we conducted on a variety of datasets give promising results.
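
To make the role of the breakpoints concrete, the following is a minimal Python sketch of SAX-style discretization. It is an illustration under stated assumptions, not the implementation from the paper: the function names (`gaussian_breakpoints`, `znormalize`, `paa`, `symbolize`) are hypothetical, the series length is assumed to be divisible by the number of segments, and the example "chromosome" is simply a hand-picked list of breakpoints rather than the output of any genetic algorithm. The point it shows is that when a candidate solution is a breakpoint list of arbitrary length, the alphabet size follows from that length (number of breakpoints plus one), so both can in principle be searched together.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def gaussian_breakpoints(alphabet_size):
    """Equiprobable breakpoints under N(0, 1), the assumption behind SAX lookup tables."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each equal-width segment.

    Assumption: len(series) is divisible by n_segments.
    """
    width = len(series) // n_segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(n_segments)]


def symbolize(series, n_segments, breakpoints):
    """Map each PAA value to a letter according to the interval it falls in.

    The alphabet size is implied by the breakpoint list: len(breakpoints) + 1.
    """
    bps = sorted(breakpoints)
    reduced = paa(znormalize(series), n_segments)
    return "".join(chr(ord("a") + bisect_right(bps, v)) for v in reduced)


series = [1, 2, 3, 4, 5, 6, 7, 8]

# Classic SAX: the user fixes the alphabet size, breakpoints come from N(0, 1).
print(symbolize(series, 4, gaussian_breakpoints(4)))  # abcd

# A hypothetical variable-length chromosome: two breakpoints imply three symbols.
chromosome = [-1.0, 1.0]
print(symbolize(series, 4, chromosome))  # abbc
```

In this sketch the same `symbolize` function serves both cases, because it never takes the alphabet size as a separate parameter.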

Author information

Corresponding author

Correspondence to Muhammad Marwan Muhammad Fuad.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Muhammad Fuad, M.M. (2016). Variable-Chromosome-Length Genetic Algorithm for Time Series Discretization. In: Hartmann, S., Ma, H. (eds) Database and Expert Systems Applications. DEXA 2016. Lecture Notes in Computer Science, vol 9828. Springer, Cham. https://doi.org/10.1007/978-3-319-44406-2_35

  • DOI: https://doi.org/10.1007/978-3-319-44406-2_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44405-5

  • Online ISBN: 978-3-319-44406-2

  • eBook Packages: Computer Science, Computer Science (R0)