Abstract
Temporal trails generated by agents traveling to various locations at different time epochs are becoming more prevalent in large social networks. We propose an algorithm that clusters such agent trails into intuitive groups. The algorithm models each trail as a probabilistic finite state automaton (PFSA). The required degree of similarity between trails can be specified through the depth of the PFSA. Hierarchical agglomerative clustering then groups the trails based on their representative PFSA and the locations that they visit. The algorithm was applied to simulated trails and to real-world network trails obtained from the GPS locations of merchant marine ships. In both cases it detected and extracted the underlying patterns in the trails and formed clusters of similar trails.
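The pipeline summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the construction details, so everything here is an assumption made for illustration: the PFSA is approximated by a depth-D Markov model whose states are the last D visited locations, the dissimilarity between two trails is the mean absolute difference of their transition probabilities, and the function names `pfsa`, `distance`, and `cluster` as well as the `threshold` parameter are hypothetical. It is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def pfsa(trail, depth):
    """Estimate transition probabilities of a depth-`depth` Markov-style PFSA.

    Assumption: a state is the tuple of the last `depth` locations. The result
    maps (state, next_location) to the probability of that transition.
    """
    counts = Counter()
    state_totals = Counter()
    for i in range(len(trail) - depth):
        state = tuple(trail[i:i + depth])
        counts[(state, trail[i + depth])] += 1
        state_totals[state] += 1
    return {key: n / state_totals[key[0]] for key, n in counts.items()}


def distance(p, q):
    """Assumed dissimilarity: mean absolute difference over all transitions."""
    keys = set(p) | set(q)
    if not keys:
        return 0.0
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys) / len(keys)


def cluster(trails, depth, threshold):
    """Average-linkage agglomerative clustering; returns lists of trail indices."""
    models = [pfsa(t, depth) for t in trails]
    clusters = [[i] for i in range(len(trails))]

    def linkage(a, b):
        total = sum(distance(models[i], models[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    # Repeatedly merge the closest pair until no pair is within the threshold.
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        if linkage(a, b) > threshold:
            break
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)


# Each character stands for one visited location (hypothetical data).
trails = ["ABABABAB", "ABABABABAB", "ACACACAC", "ACACAC"]
print(cluster(trails, depth=1, threshold=0.5))  # [[0, 1], [2, 3]]
```

In this sketch a larger `depth` makes each state remember a longer history, so two trails must agree on longer sub-sequences to be judged similar. Because transitions are keyed by location, trails that visit different locations are automatically pushed apart.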
Acknowledgments
This work was supported in part by the Office of Naval Research (N00014-06-1-0104) for adversarial assessment and (N00014-08-11186) for rapid ethnographic assessment, the Army Research Office and ERDC-TEC (W911NF0710317). Additional support was provided by CASOS, the Center for Computational Analysis of Social and Organizational Systems at Carnegie Mellon University. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Office of Naval Research, the Army Research Institute, the US Army Engineer Research and Development Centers (ERDC), Topographic Engineering Center or the US government.
About this article
Cite this article
Gullapalli, A., Carley, K.M. Extracting ordinal temporal trail clusters in networks using symbolic time-series analysis. Soc. Netw. Anal. Min. 3, 1179–1194 (2013). https://doi.org/10.1007/s13278-012-0091-7
DOI: https://doi.org/10.1007/s13278-012-0091-7