Study on Data Preprocessing for Daylight Climate Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7473))

Abstract

It is well known that real-world data often suffer from quality problems such as incompleteness and noise. Data preprocessing can improve data quality and provide more reliable data for subsequent steps. This paper presents a data preprocessing approach for daylight climate data that is designed around the characteristics of such data. The approach is then applied to real-world data, and the experimental results show that it enhances data quality effectively. In addition, the paper emphasizes integrating domain knowledge into data preprocessing to make it more effective and more targeted.
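As a rough illustration of the two problems named in the abstract (incompleteness and noisy data), the sketch below shows one generic way a domain-knowledge range check can be combined with gap filling. It is not the method of the paper, whose details are not available in this preview. The function name `clean_series`, the bounds `lower` and `upper`, and the choice of linear interpolation are all assumptions made for the example.

```python
from typing import List, Optional


def clean_series(
    values: List[Optional[float]], lower: float, upper: float
) -> List[Optional[float]]:
    """Hypothetical helper: drop implausible readings, then fill the gaps.

    `lower` and `upper` stand in for domain knowledge (for example, a
    physically plausible range for a daylight measurement). They are
    illustrative parameters, not values taken from the paper.
    """
    # Step 1 (noise): treat any reading outside the plausible range as missing.
    checked = [
        v if v is not None and lower <= v <= upper else None for v in values
    ]

    # Indices of the readings that survived the range check.
    known = [i for i, v in enumerate(checked) if v is not None]
    if not known:
        return checked  # nothing to interpolate from

    # Step 2 (incompleteness): fill each gap by linear interpolation between
    # the nearest valid neighbours; at the edges, copy the nearest valid value.
    filled = list(checked)
    for i, v in enumerate(checked):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = checked[right]
        elif right is None:
            filled[i] = checked[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = checked[left] + t * (checked[right] - checked[left])
    return filled


if __name__ == "__main__":
    # One missing reading and one negative (implausible) reading.
    print(clean_series([10.0, None, 30.0, -5.0, 50.0], lower=0.0, upper=100.0))
```

Running the range check before interpolation keeps implausible readings from contaminating the filled values, which is one simple way domain knowledge can make preprocessing more targeted.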




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Guo, P., Chen, SS., He, Y. (2012). Study on Data Preprocessing for Daylight Climate Data. In: Liu, B., Ma, M., Chang, J. (eds) Information Computing and Applications. ICICA 2012. Lecture Notes in Computer Science, vol 7473. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34062-8_64


  • DOI: https://doi.org/10.1007/978-3-642-34062-8_64

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34061-1

  • Online ISBN: 978-3-642-34062-8

  • eBook Packages: Computer Science (R0)
