Clustering ellipses for anomaly detection
Section snippets
Introduction: clustering and ellipsoids
Hyperellipsoids (more simply, ellipsoids) occur in many areas of applied mathematics. For example, level sets of Gaussian probability densities are ellipsoids [1]. Ellipsoids also appear often in clustering [2], [3], [4] and classifier design [1], [5], [6], [7]. Please be careful to distinguish the present work, wherein the input data objects are ellipsoids, from clustering algorithms such as that of Davé and Patel [4], where the output of clustering input sets of object vectors in p-space
Similarity measures for pairs of ellipsoids
Let x, m ∈ ℝᵖ, and let the p×p matrix A be positive definite. The quadratic form Q_A(x) = (x − m)ᵀA(x − m) is also positive definite, and for fixed m, the level set of Q_A, for scalar t² > 0, is E(A, m; t) = {x ∈ ℝᵖ : (x − m)ᵀA(x − m) = t²}.
Geometrically, E(A, m; t) is the (surface of the) hyper-ellipsoid in p-space induced by A, all of whose points lie at the constant A-distance t from its center m. Sometimes t is called the “effective radius” of E(A, m; t). When A = Iₚ, E(Iₚ, m; t) is the surface of a hyper-sphere of radius t centered at m.
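The quadratic-form construction above can be sketched numerically. This is a minimal illustration of the A-distance and the surface membership test, not code from the paper:

```python
import numpy as np

def mahalanobis_sq(x, m, A):
    """Squared A-distance (x - m)^T A (x - m) of point x from center m."""
    d = x - m
    return float(d @ A @ d)

# An ellipse in 2-space induced by a positive definite A, center m, radius t.
A = np.array([[2.0, 0.0],
              [0.0, 0.5]])
m = np.array([1.0, -1.0])
t = 1.5

# A point lies on the surface of the ellipsoid iff its A-distance equals t.
x = m + np.array([t / np.sqrt(2.0), 0.0])   # a point along the first axis
print(np.isclose(mahalanobis_sq(x, m, A), t**2))   # True: x is on the surface
```

With A = I this reduces to ordinary Euclidean distance from m, recovering the hyper-sphere case.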
Compound similarity
We use subscripts 1 and 2 for a pair of ellipsoids. Our first measure of similarity is a compound measure: the product of three exponential factors, which satisfies requirements (2a), (2b), (3), and (4) for strong similarity. The measure is built by considering the location, orientation, and shape of an ellipsoid. The geometric rationale and limiting behavior of each factor are discussed next.
Location: Positional similarity for (E1, E2) is a function of their mean separation, i.e., the
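A positional factor of this kind can be sketched as a decaying exponential of the mean separation. The squared-distance form and the scale constant below are our assumptions; the excerpt gives only the qualitative behavior (1 for identical centers, tending to 0 as the centers separate):

```python
import numpy as np

def location_similarity(m1, m2, scale=1.0):
    """Positional similarity of two ellipsoids as a function of their
    mean separation ||m1 - m2||.  Exponential form and scale constant
    are illustrative assumptions, not the paper's exact definition."""
    return float(np.exp(-np.linalg.norm(m1 - m2) ** 2 / scale))
```

For example, `location_similarity(m, m)` is exactly 1 for any center m, and the factor decays smoothly toward 0 as the centers move apart.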
Transformation energy similarity
Consider each ellipsoid as having its own space, spanned by its eigenvector basis with origin at its center. We can construct a function that maps a point from one ellipsoid's space to another via the common space between them. A point in the space of ellipsoid Eᵢ can be mapped to the common coordinate space by scaling the point by Λᵢ^(−1/2), reversing the rotation by Vᵢ, then shifting the point away from the origin by translation by mᵢ. Within this common space the point can then be mapped into the
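A sketch of the map into the common space, assuming the standard eigendecomposition A = V Λ Vᵀ and the convention that the ellipsoid's local space is the unit sphere (the exact scaling convention is our assumption, since the excerpt's symbols are stripped):

```python
import numpy as np

def to_common_space(p, A, m):
    """Map a point p from the unit-sphere space of ellipsoid E(A, m)
    into the common coordinate space: scale by the semi-axis lengths
    (Lambda^{-1/2}), rotate by the eigenvector basis V, translate by m."""
    lam, V = np.linalg.eigh(A)          # A = V diag(lam) V^T, lam ascending
    return V @ (p / np.sqrt(lam)) + m
```

Under this convention the unit sphere maps exactly onto E(A, m; 1): for any unit vector p, the image x satisfies (x − m)ᵀA(x − m) = 1.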
Focal similarity
Our third measure of similarity begins by recalling that every plane ellipse can be constructed by tracing the curve whose distances from a pair of foci f1 and f2 sum to a positive constant c(t), which depends on the effective radius t.
This construction is shown for a two-dimensional ellipse in Fig. 4, with effective radius t, so that p(t) + q(t) = c(t) for the ellipse E(A, m; t). The foci always lie along the major axis of the ellipse, which is the linear span of the eigenvector of A corresponding to its smallest eigenvalue.
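The foci can be computed directly from the eigendecomposition of A. The semi-axis lengths of E(A, m; t) are t/√λᵢ, so the major axis follows the smallest eigenvalue; this sketch assumes that standard relationship:

```python
import numpy as np

def ellipse_foci(A, m, t):
    """Foci of the plane ellipse E(A, m; t), and the focal-sum constant
    c(t) = 2a (a = semi-major axis length)."""
    lam, V = np.linalg.eigh(A)          # eigenvalues in ascending order
    a = t / np.sqrt(lam[0])             # semi-major axis (smallest eigenvalue)
    b = t / np.sqrt(lam[1])             # semi-minor axis
    f = np.sqrt(a**2 - b**2)            # center-to-focus distance
    v = V[:, 0]                         # major-axis direction
    return m + f * v, m - f * v, 2.0 * a
```

Every point on the ellipse then has distances to the two foci summing to c(t) = 2a, matching the p(t) + q(t) = c(t) construction above.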
Tendency assessment with VAT and iVAT
Our aim is to use similarity and dissimilarity measures and their iVAT images to find clusters in sets of ellipsoids. Before considering this specific problem, we introduce some concepts from clustering theory that are needed to proceed with our objectives. Clustering is the problem of partitioning a set of unlabeled objects O={o1, …, on} into groups of similar objects [1], [7], [16], [17], [18], [19], [20]. The field comprises three canonical problems (CPs). (CP1) is assessment: prior to finding
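As background, the VAT reordering itself is a Prim-like ordering of the objects by dissimilarity (Bezdek and Hathaway); iVAT additionally applies a recursive path-based transform to the reordered matrix before display. A compact sketch of the basic VAT step:

```python
import numpy as np

def vat(D):
    """VAT reordering of a square dissimilarity matrix D.
    Returns the reordered matrix and the permutation."""
    n = D.shape[0]
    start = int(np.unravel_index(np.argmax(D), D.shape)[0])  # row of max dissimilarity
    P, J = [start], set(range(n)) - {start}
    while J:
        # Prim-like step: nearest unvisited object to the visited set.
        j = min(J, key=lambda c: min(D[r, c] for r in P))
        P.append(j)
        J.remove(j)
    P = np.array(P)
    return D[np.ix_(P, P)], P
```

On data with cluster structure, the reordered image shows dark diagonal sub-blocks, one per cluster, which is what the RDIs in the following sections are read for.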
Tendency assessment for sets of ellipsoids
Let E denote a set of n ellipsoids in p-space, E = {E1, E2, …, En}. For (Eᵢ, Eⱼ) ∈ E × E, compute s∗,ᵢⱼ = s(Eᵢ, Eⱼ) with any of our three measures of similarity, and array these n² values as the n×n similarity relation matrix S∗ = [s∗,ᵢⱼ]. The transformation D∗ = [d∗,ᵢⱼ] = [1 − s∗,ᵢⱼ] yields a dissimilarity relation on E × E. (We need not do this for the focal measure, which is by definition a dissimilarity measure already.) Applying the iVAT algorithm to D∗ yields an RDI that can be used to assess
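Building the relation matrices is mechanical once a pairwise measure is chosen. A minimal sketch, with the similarity function passed in as a parameter (any of the three measures, or a toy stand-in):

```python
import numpy as np

def relation_matrices(objects, s):
    """n x n similarity relation S* from a pairwise measure s, and its
    dissimilarity transform D* = [1 - s_ij].  (The focal measure is
    already a dissimilarity, so it would skip the 1 - s step.)"""
    n = len(objects)
    S = np.array([[s(objects[i], objects[j]) for j in range(n)]
                  for i in range(n)])
    return S, 1.0 - S
```

For a valid similarity measure the diagonal of S∗ is all ones (every object is maximally similar to itself), so the diagonal of D∗ is all zeros, as iVAT expects.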
Finding clusters in sets of ellipsoids
Looking for clusters in E raises two questions. First, before clustering, how many clusters should we look for? Second, after clustering, how much credence should we put in the “optimal” partition of the data? The iVAT images of Section 7 offer visual suggestions for the value(s) of c in each of our three test sets prior to clustering. There are many other ways to estimate c prior to clustering. The second pre-clustering approach tested here is based on the eigenvalues of D. Ferenc proved
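One common eigenvalue-based heuristic of this kind picks c at the largest gap between consecutive ordered eigenvalue magnitudes of D. The excerpt cuts off before stating the exact rule the paper uses, so the sketch below is a generic eigengap estimator, not necessarily the paper's:

```python
import numpy as np

def eigengap_estimate(D, cmax=None):
    """Suggest a cluster count from the ordered eigenvalue magnitudes
    of a symmetric dissimilarity matrix D: return the index of the
    largest gap between consecutive magnitudes.  A generic heuristic."""
    w = np.sort(np.abs(np.linalg.eigvalsh(D)))[::-1]   # descending magnitudes
    if cmax is None:
        cmax = len(w) - 1
    gaps = w[:cmax] - w[1:cmax + 1]
    return int(np.argmax(gaps)) + 1
```

On an idealized two-cluster dissimilarity matrix (zeros within blocks, ones between), the two dominant eigenvalue magnitudes separate sharply from the rest, and the estimator returns c = 2.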
Conclusions and discussion
First, we defined and analyzed three measures of similarity for pairs of hyperellipsoids in p-space. Then we introduced a way to visually assess cluster substructure in sets of ellipsoids, using the recursive iVAT algorithm to reorder the dissimilarity data D → D′∗. The reordered image I(D′∗) shows clustering tendencies in the objects underlying D as dark sub-blocks along the main diagonal. We also introduced a second pre-clustering assessment method based on the ordered eigenvalues (OEVs) of D. Our
References (40)
- et al., Ellipsoidal decision regions for motif-based patterned fabric defect detection, Pattern Recognition (2010)
- et al., Scalable visual assessment of cluster tendency for large data sets, Pattern Recognition (2006)
- et al., Applied Multivariate Statistical Analysis (1992)
- et al., An adaptive algorithm for modifying hyperellipsoidal decision surfaces, J. Artif. Neural Networks (1994)
- An algorithm for merging hyperellipsoidal clusters, TR LA-UR-94-3306 (1994)
- R.N. Davé, K.J. Patel, Fuzzy ellipsoidal-shell clustering algorithm and detection of elliptical shapes, in: D.P. ...
- et al., Identification of fuzzy prediction models through hyperellipsoidal clustering, IEEE Trans. Syst. Man Cybernet. (1994)
- et al., Fuzzy function learning with covariance ellipsoids, in: Proceedings of the IEEE International Conference on Neural Networks (1993)
- et al., Pattern Classification and Scene Analysis (1973)
- S. Rajasegarar, C. Leckie, M. Palaniswami, CESVM: centered hyperellipsoidal support vector machine based anomaly ...
- Elliptical anomalies in wireless sensor networks, ACM TOSN
- Pattern Recognition
- Fuzzy Models and Algorithms for Pattern Recognition and Image Processing
- Algorithms for Clustering Data
- Clustering Algorithms
Masud Moshtaghi received his B.Sc. degree in 2006 in computer science, and his M.S. in software engineering in 2008 from the University of Tehran. He has been with the University of Melbourne from March 2009. His research interests include pattern recognition, artificial intelligence for network security, data mining, and wireless sensor networks.
Timothy C. Havens received his M.S. degree in electrical engineering from Michigan Tech University in 2000. After that, he was employed at MIT Lincoln Laboratory where he specialized in the simulation and modelling of directed energy and global positioning systems. In 2006, he began work on his Ph.D. degree in electrical and computer engineering at the University of Missouri. His interests include clustering in relational data and ontologies, fuzzy logic, and bioinformatics but, by night, he is a jazz bassist.
James C. Bezdek received his Ph.D. in Applied Mathematics from Cornell University in 1973. Jim is a past president of NAFIPS (North American Fuzzy Information Processing Society), IFSA (International Fuzzy Systems Association), and the IEEE CIS (Computational Intelligence Society); founding editor of the Int'l. J. Approximate Reasoning and the IEEE Transactions on Fuzzy Systems; Life Fellow of the IEEE and IFSA; and a recipient of the IEEE 3rd Millennium, IEEE CIS Fuzzy Systems Pioneer, and IEEE Technical Field Award Rosenblatt medals. Jim's interests: woodworking, optimization, motorcycles, pattern recognition, cigars, clustering in very large data, fishing, co-clustering, blues music, wireless sensor networks, poker, and visual clustering. Jim retired in 2007, and will be coming to a university near you soon.
Laurence Park received his B.E. (Hons.) and B.Sc. degrees from the University of Melbourne, Australia, in 2000, and his Ph.D. degree from the University of Melbourne in 2004. He joined the Computer Science Department at the University of Melbourne as a Research Fellow in 2004 and was promoted to Senior Research Fellow in 2008. Laurence joined the School of Computing and Mathematics at the University of Western Sydney as a Lecturer in Computational Mathematics and Statistics in 2009, where he is currently investigating methods for large-scale data mining and machine learning. During this time, Laurence has been made an Honorary Senior Fellow of the University of Melbourne.
Christopher Leckie is an Associate Professor and Deputy-Head of the Department of Computer Science and Software Engineering at the University of Melbourne in Australia. A/Prof. Chris Leckie has over two decades of research experience in artificial intelligence (AI), especially for problems in telecommunication networking, such as data mining and intrusion detection. A/Prof. Leckie's research into scalable methods for data mining has made significant theoretical and practical contributions in efficiently analyzing large volumes of data in resource-constrained environments, such as wireless sensor networks.
Sutharshan Rajasegarar received his B.Sc. Engineering degree in Electronic and Telecommunication Engineering (with first class honours) in 2002, from the University of Moratuwa, Sri Lanka, and his Ph.D. in 2009 from the University of Melbourne, Australia. He is currently a Research Fellow with the Department of Electrical and Electronic Engineering, The University of Melbourne, Australia. His research interests include wireless sensor networks, anomaly/outlier detection, machine learning, pattern recognition, signal processing, and wireless communication.
James M. Keller received his Ph.D. in Mathematics in 1978. He holds the University of Missouri Curators’ Professorship in the Electrical and Computer Engineering and Computer Science Departments on the Columbia campus. He is also the R. L. Tatum Professor in the College of Engineering. His research interests center on computational intelligence: fuzzy set theory and fuzzy logic, neural networks, and evolutionary computation with a focus on problems in computer vision, pattern recognition, and information fusion including bioinformatics, spatial reasoning in robotics, geospatial intelligence, sensor and information analysis in technology for eldercare, and landmine detection. His industrial and government funding sources include the Electronics and Space Corporation, Union Electric, Geo-Centers, National Science Foundation, the Administration on Aging, The National Institutes of Health, NASA/JSC, the Air Force Office of Scientific Research, the Army Research Office, the Office of Naval Research, the National Geospatial Intelligence Agency, the Leonard Wood Institute, and the Army Night Vision and Electronic Sensors Directorate. Professor Keller has coauthored over 350 technical publications. Jim is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) for whom he has presented live and video tutorials on fuzzy logic in computer vision, is an International Fuzzy Systems Association (IFSA) Fellow, an IEEE Computational Intelligence Society Distinguished Lecturer, a national lecturer for the Association for Computing Machinery (ACM) from 1993 to 2007, and a past President of the North American Fuzzy Information Processing Society (NAFIPS). He received the 2007 Fuzzy Systems Pioneer Award from the IEEE Computational Intelligence Society. 
He finished a full six year term as Editor-in-Chief of the IEEE Transactions on Fuzzy Systems, is an Associate Editor of the International Journal of Approximate Reasoning, and is on the editorial board of Pattern Analysis and Applications, Fuzzy Sets and Systems, International Journal of Fuzzy Systems, and the Journal of Intelligent and Fuzzy Systems. Jim was the Vice President for Publications of the IEEE Computational Intelligence Society from 2005 to 2008, and is currently an elected Adcom member. He was the conference chair of the 1991 NAFIPS Workshop, program co-chair of the 1996 NAFIPS meeting, program co-chair of the 1997 IEEE International Conference on Neural Networks, and the program chair of the 1998 IEEE International Conference on Fuzzy Systems. He was the general chair for the 2003 IEEE International Conference on Fuzzy Systems.
Marimuthu Palaniswami received his M.E. from the Indian Institute of Science, India, M.Eng.Sc. from the University of Melbourne, and Ph.D. from the University of Newcastle, Australia, before rejoining the University of Melbourne. He has published over 340 refereed research papers. He currently leads one of the largest funded ARC Research Networks, on Intelligent Sensors, Sensor Networks and Information Processing (ISSNIP), a programme structured to run as a network centre of excellence with complementary funding for fundamental research, test beds, international linkages, and industry linkages. His leadership roles include external reviewer for an international research centre, selection panel member for senior appointments/promotions, grants panel member for the NSF, advisory board member for a European FP6 grant centre, steering committee member for NCRIS GBROOS and SEMAT, and board member for IT and SCADA companies. His research interests include SVMs, sensors and sensor networks, machine learning, neural networks, pattern recognition, signal processing, and control.