UNN: A Neural Network for Uncertain Data Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6118)

Abstract

This paper proposes a new neural network method for classifying uncertain data (UNN). Uncertainty is widespread in real-world data. Numerous factors lead to data uncertainty, including data acquisition device error, approximate measurement, sampling fault, transmission latency, data integration error, and so on. The performance and quality of data mining results depend largely on whether data uncertainty is properly modeled and processed. In this paper, we focus on one commonly encountered type of data uncertainty: the exact data value is unavailable and we only know the probability distribution of the data. An intuitive method of handling this type of uncertainty is to represent the uncertain range by its expected value and then process it as certain data. This method, although simple and straightforward, may cause valuable information loss. In this paper, we extend the conventional neural network classifier so that it can take not only certain data but also uncertain probability distributions as input. We start by designing an uncertain perceptron for linear classification, and analyze how neurons use the new activation function to process data distributions as inputs. We then illustrate how the perceptron generates classification principles from the knowledge learned from uncertain training data. We also construct a multilayer neural network as a general classifier, and propose an optimization technique to accelerate the training process. Experiments show that UNN performs well even for highly uncertain data and that it significantly outperforms the naïve neural network algorithm. Furthermore, the optimization approach we propose can greatly improve training efficiency.
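The contrast the abstract draws between the expectation-only approach and feeding a distribution into a linear unit can be illustrated with a small sketch. This is not the paper's activation function, which is not shown in this abstract. It assumes each attribute is an independent Gaussian, so the weighted sum w·x + b is itself Gaussian and the probability of landing on the positive side of the decision boundary has a closed form. The function names `prob_positive` and `expectation_only` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_positive(weights, bias, means, stds):
    """P(w.x + b > 0), assuming independent x_i ~ N(means[i], stds[i]**2).

    Assumption for illustration only: Gaussian, independent attributes.
    Under that assumption the weighted sum is Gaussian with
    mean = w.mu + b and variance = sum((w_i * sigma_i)**2).
    """
    mean = sum(w * m for w, m in zip(weights, means)) + bias
    var = sum((w * s) ** 2 for w, s in zip(weights, stds))
    if var == 0.0:
        # Certain data: the unit behaves like an ordinary step perceptron.
        return 1.0 if mean > 0 else 0.0
    return normal_cdf(mean / math.sqrt(var))


def expectation_only(weights, bias, means):
    """Baseline: replace each uncertain attribute by its expected value."""
    return 1 if sum(w * m for w, m in zip(weights, means)) + bias > 0 else 0


if __name__ == "__main__":
    w, b = [1.0, -1.0], 0.0
    mu = [0.2, 0.0]
    # The baseline yields the same hard label whatever the spread is.
    print(expectation_only(w, b, mu))
    # The distribution-aware unit reports how reliable that label is.
    print(prob_positive(w, b, mu, [1.0, 1.0]))    # wide spread, close to 0.5
    print(prob_positive(w, b, mu, [0.01, 0.01]))  # narrow spread, close to 1
```

With the same means, the baseline gives an identical label for both spreads, while the distribution-aware unit separates a near coin-flip from a near-certain case. That difference is the information the abstract says the expectation-only method may lose.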

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ge, J., Xia, Y., Nadungodage, C. (2010). UNN: A Neural Network for Uncertain Data Classification. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2010. Lecture Notes in Computer Science, vol 6118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13657-3_48

  • DOI: https://doi.org/10.1007/978-3-642-13657-3_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13656-6

  • Online ISBN: 978-3-642-13657-3

  • eBook Packages: Computer Science, Computer Science (R0)
