Abstract
We have been developing signature-based methods in the telecommunications industry for the past 5 years. In this paper, we describe our work as it evolved due to improvements in technology and our aggressive attitude toward scale. We discuss the types of features that our signatures contain, nuances of how these are updated through time, our treatment of outliers, and the trade-off between time-driven and event-driven processing. We provide a number of examples, all drawn from the application of signatures to toll fraud detection.
Cortes, C., Pregibon, D. Signature-Based Methods for Data Streams. Data Mining and Knowledge Discovery 5, 167–182 (2001). https://doi.org/10.1023/A:1011464915332