Abstract
The underlying theory of symbolic time series analysis (STSA) has led to the development of signal representation tools in the paradigm of dynamic data-driven application systems (DDDAS), where time series of sensor signals are partitioned to obtain symbol strings that, in turn, lead to the construction of probabilistic finite state automata (PFSA). Although various methods for constructing PFSA from symbol strings have been reported in the literature, comparable effort has not been devoted to identifying an appropriate alphabet size for partitioning the time series, so that the symbol strings can be generated optimally or suboptimally in a specified sense. This paper addresses this critical issue and proposes an information-theoretic procedure for partitioning time series to extract low-dimensional features. The key idea is suboptimal identification of the boundary locations of the partitioning segments by maximizing the mutual information between the state probability vector of the PFSA and the members of the pattern classes. Robustness of the symbolization process is also addressed. The proposed alphabet size selection and time series partitioning algorithm has been validated on two examples. The first example addresses parameter identification in a simulated Duffing system with sinusoidal input excitation. The second example is built upon an ensemble of time series of chemiluminescence data to predict lean blowout (LBO) phenomena in a laboratory-scale swirl-stabilized combustor apparatus.
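To make the idea in the abstract concrete, the following is a minimal sketch in Python, not the paper's algorithm. It assumes depth-1 states (so PFSA states coincide with symbols), equally likely pattern classes, and a discrete mutual information between class label and symbol as a simplified stand-in for the paper's criterion on the state probability vector. All function names (`symbolize`, `state_probability_vector`, `mutual_information`, `partition_score`, `best_single_boundary`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def symbolize(x, boundaries):
    """Map each sample to a symbol index in 0..len(boundaries), given sorted boundaries."""
    return np.searchsorted(np.asarray(boundaries), np.asarray(x), side="right")

def state_probability_vector(symbols, alphabet_size):
    """Relative frequency of each symbol (assumption: depth-1 states equal symbols)."""
    counts = np.bincount(symbols, minlength=alphabet_size)
    return counts / counts.sum()

def mutual_information(joint):
    """Mutual information in bits computed from a joint probability table."""
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal over rows (classes)
    py = joint.sum(axis=0, keepdims=True)   # marginal over columns (symbols)
    mask = joint > 0                         # skip zero cells to avoid log(0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))

def partition_score(series_by_class, boundaries):
    """Score a partition by I(class; symbol), assuming equally likely classes."""
    k = len(boundaries) + 1
    rows = [state_probability_vector(symbolize(x, boundaries), k)
            for x in series_by_class]
    joint = np.vstack(rows) / len(rows)      # P(class, symbol)
    return mutual_information(joint)

def best_single_boundary(series_by_class, candidates):
    """Pick the candidate boundary with the highest score (exhaustive, binary alphabet)."""
    scores = [partition_score(series_by_class, [c]) for c in candidates]
    return candidates[int(np.argmax(scores))]
```

For example, with one all-negative series and one all-positive series as the two classes, `partition_score(..., [0.0])` separates the classes perfectly, whereas a boundary outside the data range carries no class information. Extending the search to more boundaries, one at a time, would give a greedy and therefore suboptimal partition, in the spirit of the suboptimal identification mentioned above.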





Acknowledgments
The work reported in this paper has been supported in part by the US Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-15-1-0400.
Cite this article
Sarkar, S., Chattopadhyay, P. & Ray, A. Symbolization of dynamic data-driven systems for signal representation. SIViP 10, 1535–1542 (2016). https://doi.org/10.1007/s11760-016-0967-5
DOI: https://doi.org/10.1007/s11760-016-0967-5