DOI: 10.1145/3373017.3373032
research-article

Estimation of Locally Relevant Subspace in High-dimensional Data

Published: 04 February 2020

Abstract

High-dimensional data is becoming increasingly available with the advent of big data and the IoT. Additional dimensions make data analysis cumbersome because they increase the sparsity of data points, a problem known as the “curse of dimensionality”. Global dimensionality reduction techniques are used to address this problem; however, they are ineffective at revealing outliers hidden in the high-dimensional space. This is because the behaviour of outliers is hidden in the subspace to which they belong; hence, a locally relevant subspace is needed to reveal the hidden outliers. In this paper, we present a technique that identifies a locally relevant subspace and its associated low-dimensional subspaces by deriving a final correlation score. To verify the effectiveness of the technique in determining the generalised locally relevant subspace, we evaluate the results against a benchmark data set. Our comparative analysis shows that the technique derives a locally relevant subspace consisting of the relevant dimensions present in the benchmark data set.
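
The sparsity effect that the abstract calls the curse of dimensionality can be illustrated with a small experiment. The sketch below is not the technique presented in the paper; it is a generic demonstration, under the assumption of uniformly random data in the unit hypercube, of how the relative gap between the nearest and farthest neighbour of a query point shrinks as dimensions are added. The function name `relative_contrast` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def relative_contrast(n_points: int, n_dims: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points uniform random points in [0, 1]^n_dims.

    Illustrative assumption: data is uniform and independent per dimension.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # The contrast drops sharply as the number of dimensions grows.
    for dims in (2, 10, 100, 500):
        print(dims, round(relative_contrast(200, dims), 3))
```

When the contrast approaches zero, all points look roughly equidistant in the full space, which is one reason distance-based outlier scores lose discriminative power there and why restricting attention to a relevant subspace can help.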

Published In

ACSW '20: Proceedings of the Australasian Computer Science Week Multiconference
February 2020
367 pages
ISBN:9781450376976
DOI:10.1145/3373017
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. High-dimensionality problem
  2. Locally Relevant subspace
  3. Outlier Detection
  4. Subspace methods
  5. The curse of dimensionality problem

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ACSW '20: Australasian Computer Science Week 2020
February 4 - 6, 2020
Melbourne, VIC, Australia

Acceptance Rates

Overall Acceptance Rate 61 of 141 submissions, 43%

Article Metrics

  • Downloads (last 12 months): 8
  • Downloads (last 6 weeks): 2

Reflects downloads up to 20 Feb 2025

Cited By

  • (2024) Exploring High-Dimensional Outlier Detection: A Comprehensive Study on Methods and Applications Using PCA and k-NN Algorithm. 2024 21st International Multi-Conference on Systems, Signals & Devices (SSD), 699-704. DOI: 10.1109/SSD61670.2024.10548429. Online publication date: 22-Apr-2024
  • (2023) Quantized autoencoder (QAE) intrusion detection system for anomaly detection in resource-constrained IoT devices using RT-IoT2022 dataset. Cybersecurity 6:1. DOI: 10.1186/s42400-023-00178-5. Online publication date: 5-Sep-2023
  • (2023) QAE-IDS: DDoS anomaly detection in IoT devices using Post-Quantization Training. Smart Science 11:4, 774-789. DOI: 10.1080/23080477.2023.2260023. Online publication date: 23-Sep-2023
  • (2022) Subspace based Anomaly Detection Framework for Point Clouds. 2022 IEEE 18th International Conference on e-Science (e-Science), 316-325. DOI: 10.1109/eScience55777.2022.00045. Online publication date: Oct-2022
  • (2020) A comprehensive survey of anomaly detection techniques for high dimensional big data. Journal of Big Data 7:1. DOI: 10.1186/s40537-020-00320-x. Online publication date: 2-Jul-2020
