Abstract
In many real-world applications, systematic analysis of rare events, such as credit card fraud and adverse drug reactions, is very important. Their low occurrence rate in large databases often makes it difficult to identify risk factors by straightforward application of association and sequential pattern discovery. In this paper we introduce a heuristic to guide the search for interesting patterns associated with rare events in large temporal event sequences. Our approach combines association and sequential pattern discovery with a measure of risk borrowed from epidemiology to assess the interestingness of the discovered patterns. In our experiments, we successfully identify a known drug and several new drug combinations with a high risk of adverse reactions. The approach is also applicable to other domains where rare events are of primary interest.
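To make the idea of scoring a pattern by an epidemiological risk measure concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name the measure; relative risk is assumed here as one common choice. The function names, the record layout, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Each record: (ordered event sequence for one entity, whether the rare target event occurred).
# This layout is an assumption made for the sketch.
Record = Tuple[Sequence[str], bool]


def contains_subsequence(events: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `events` in order (gaps allowed)."""
    it = iter(events)
    # `item in it` advances the iterator until it finds `item`, so order is enforced.
    return all(item in it for item in pattern)


def relative_risk(records: List[Record], pattern: Sequence[str]) -> float:
    """Relative risk of the target event for entities whose sequence contains `pattern`.

    RR = [a / (a + b)] / [c / (c + d)], where
      a = exposed with target,   b = exposed without target,
      c = unexposed with target, d = unexposed without target.
    """
    a = b = c = d = 0
    for events, has_target in records:
        exposed = contains_subsequence(events, pattern)
        if exposed and has_target:
            a += 1
        elif exposed:
            b += 1
        elif has_target:
            c += 1
        else:
            d += 1
    if a + b == 0 or c == 0:
        raise ValueError("relative risk is undefined for this pattern and data")
    return (a / (a + b)) / (c / (c + d))


# Toy data, invented for illustration only (not from the paper).
records: List[Record] = [
    (["drugA", "drugB"], True),
    (["drugA", "drugB"], True),
    (["drugA", "drugB"], False),
    (["drugB", "drugA"], False),
    (["drugC"], True),
    (["drugC"], False),
    (["drugA"], False),
    (["drugB"], False),
]

rr = relative_risk(records, ["drugA", "drugB"])
print(f"relative risk of drugA followed by drugB: {rr:.2f}")
```

A relative risk well above 1 suggests that the target event is more common among sequences containing the pattern than among the rest, which is the sense in which such a measure can rank candidate patterns even when the target event itself is rare.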
The authors acknowledge the valuable comments from their colleagues, including C. Carter, R. Baxter, R. Sparks, and C. Kelman, as well as the anonymous reviewers. The authors also acknowledge the Commonwealth Department of Health and Ageing, and the Queensland Department of Health for providing data for this research.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, J., He, H., Williams, G., Jin, H. (2004). Temporal Sequence Associations for Rare Events. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_30
DOI: https://doi.org/10.1007/978-3-540-24775-3_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22064-0
Online ISBN: 978-3-540-24775-3
eBook Packages: Springer Book Archive