DOI:10.1145/3318299.3318384
research-article

Air Big Data Outlier Detection Based on Infinite Gauss Bayesian and CNN

Published: 22 February 2019

ABSTRACT

Air quality is a persistent concern for the public, environmental protection departments, and the government. In massive air quality datasets, abnormal data can interfere with subsequent experiments and analysis, so detecting it is necessary to improve data accuracy. However, traditional air outlier detection methods require at least one year of data to make inferences about air quality. This paper first analyzes the characteristics of air quality big data and then proposes a framework based on Bayesian non-parametric clustering, namely a Dirichlet Process (DP) clustering framework, to detect outliers in air quality data. Guided by the results of the data analysis, the framework extends the Gaussian mixture model to an infinite Gaussian mixture model and uses a neural network to cluster the data processed by the infinite Gaussian mixture model, which effectively improves clustering accuracy and avoids the need to collect a large amount of training data.
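The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind Dirichlet Process clustering for outlier detection, not the paper's method. It swaps in DP-means, a hard-assignment simplification of the DP Gaussian mixture, in which the number of clusters is not fixed in advance: a point too far from every existing centroid opens a new cluster. Points that end up in very small clusters are then flagged as outlier candidates. The readings, the threshold `lam`, the cutoff `min_size`, and the function names are all hypothetical, and the neural-network stage mentioned in the abstract is not sketched here.

```python
def dp_means(points, lam, max_iter=100):
    """DP-means clustering (a swapped-in simplification, not the paper's model).

    A point whose squared distance to every centroid exceeds `lam`
    opens a new cluster, so the cluster count grows with the data.
    """
    # Start with one cluster located at the global mean.
    centroids = [tuple(sum(c) / len(points) for c in zip(*points))]
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            k = min(range(len(d2)), key=d2.__getitem__)
            if d2[k] > lam:  # too far from everything: open a new cluster
                centroids.append(tuple(p))
                k = len(centroids) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Recompute centroids, dropping clusters that lost all members.
        members = {}
        for i, k in enumerate(labels):
            members.setdefault(k, []).append(points[i])
        order = sorted(members)
        remap = {old: new for new, old in enumerate(order)}
        centroids = [
            tuple(sum(c) / len(members[k]) for c in zip(*members[k]))
            for k in order
        ]
        labels = [remap[k] for k in labels]
        if not changed:
            break
    return labels, centroids


def small_cluster_outliers(labels, min_size=2):
    """Return indices of points whose cluster has fewer than `min_size` members."""
    sizes = {}
    for k in labels:
        sizes[k] = sizes.get(k, 0) + 1
    return [i for i, k in enumerate(labels) if sizes[k] < min_size]


# Made-up two-pollutant readings: two dense groups and one isolated point.
readings = [
    (30, 55), (35, 60), (40, 65), (33, 58), (37, 62),
    (80, 120), (85, 125), (78, 118), (82, 122), (84, 121),
    (400, 30),
]
labels, centroids = dp_means(readings, lam=50 ** 2)
outliers = small_cluster_outliers(labels, min_size=2)
print(len(centroids), outliers)  # prints: 3 [10]
```

The threshold `lam` plays a role loosely analogous to the DP concentration parameter: a smaller value opens new clusters more readily. A full infinite Gaussian mixture model would instead infer soft assignments and per-cluster covariances.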

    • Published in

      ACM Other conferences
      ICMLC '19: Proceedings of the 2019 11th International Conference on Machine Learning and Computing
      February 2019
      563 pages
      ISBN:9781450366007
      DOI:10.1145/3318299

      Copyright © 2019 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited
