Abstract
Sequential pattern and rule mining have been the focus of much research; however, predicting missing sets of elements within a sequence remains a challenge. Recent work in survey design suggests that inferring these missing elements with a higher degree of certainty could greatly reduce the time burden on survey participants. To address this problem, and the more general problem of missing sensor data, we introduce a new form of constrained sequential rules that use attribute presence to capture rule confidence in sequences with missing data better than previous constraint-based techniques. Specifically, we examine the following problem: given a partially labeled sequence of sets, how well can the missing attributes be inferred? Our study shows that, compared to traditional techniques, this technique significantly improves prediction robustness even when large amounts of data are missing.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Williams, C.A., Nelson, P.C., Mohammadian, A. (2009). Attribute Constrained Rules for Partially Labeled Sequence Completion. In: Perner, P. (eds) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2009. Lecture Notes in Computer Science, vol 5633. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03067-3_27
DOI: https://doi.org/10.1007/978-3-642-03067-3_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03066-6
Online ISBN: 978-3-642-03067-3
eBook Packages: Computer Science, Computer Science (R0)