Truth Discovery in Material Science Databases

  • Conference paper
Databases Theory and Applications (ADC 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9093)

Abstract

Instead of performing expensive experiments, it is common in industry to predict important material properties from existing experimental results. Databases of experimental observations are widely used in materials science and engineering. However, these databases are expected to be noisy, both because they rely on human measurements and because they amalgamate various independent sources (research papers). As a result, different sources can contain conflicting information. In this paper, we introduce a novel truth discovery approach that reduces the amount of noise and filters out the incorrect conflicting information hidden in scientific databases. Our method ranks the multiple data sources by considering the relationships between them, i.e., the amount of conflicting information and the amount of agreement, and also eliminates the conflicting information. The scalable Gaussian process interpolation technique (SGP) is then applied to the cleaned dataset to predict material properties. A comprehensive performance study was carried out on a real-life scientific database. With our new approach, we are able to substantially improve the accuracy of SGP predictions and provide a more reliable database.
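
The abstract does not spell out the ranking algorithm, so the following is only an illustrative sketch of the general idea it describes: score each source by how often it agrees or conflicts with other sources on the same measurement, then drop values that conflict with the best-ranked source. Every name here (`rank_sources`, `resolve_conflicts`, `rel_tol`), the agreement-ratio score, and the toy data are assumptions made for illustration, not the authors' method. The SGP prediction step is not shown.

```python
from collections import defaultdict
from itertools import combinations


def _close(v1, v2, rel_tol):
    """Two values 'agree' if they differ by at most rel_tol (relative)."""
    scale = max(abs(v1), abs(v2), 1e-12)
    return abs(v1 - v2) <= rel_tol * scale


def _group_by_key(observations):
    """Group (source, key, value) triples by measurement key."""
    by_key = defaultdict(list)
    for source, key, value in observations:
        by_key[key].append((source, value))
    return by_key


def rank_sources(observations, rel_tol=0.05):
    """Score each source by its share of agreeing pairwise comparisons.

    Hypothetical scoring rule: agreements / (agreements + conflicts).
    A source never compared with another gets a neutral score of 0.5.
    """
    agree, conflict, sources = defaultdict(int), defaultdict(int), set()
    for entries in _group_by_key(observations).values():
        sources.update(s for s, _ in entries)
        for (s1, v1), (s2, v2) in combinations(entries, 2):
            if s1 == s2:
                continue  # a source is not compared with itself
            counter = agree if _close(v1, v2, rel_tol) else conflict
            counter[s1] += 1
            counter[s2] += 1
    scores = {}
    for s in sources:
        total = agree[s] + conflict[s]
        scores[s] = 0.5 if total == 0 else agree[s] / total
    return scores


def resolve_conflicts(observations, scores, rel_tol=0.05):
    """Keep, per key, only values consistent with the top-ranked source."""
    cleaned = []
    for key, entries in _group_by_key(observations).items():
        _, best_value = max(entries, key=lambda e: scores[e[0]])
        for source, value in entries:
            if _close(value, best_value, rel_tol):
                cleaned.append((source, key, value))
    return cleaned


# Toy data: three hypothetical papers reporting the same two measurements.
observations = [
    ("paper_A", "x1", 1.00), ("paper_B", "x1", 1.02), ("paper_C", "x1", 1.50),
    ("paper_A", "x2", 2.00), ("paper_C", "x2", 2.60), ("paper_B", "x2", 2.01),
]
scores = rank_sources(observations)
cleaned = resolve_conflicts(observations, scores)
```

On this toy data, `paper_C` disagrees with both other sources on every measurement, so it receives the lowest score and its values are filtered out of `cleaned`. An interpolation model could then be fitted to `cleaned` instead of the raw observations.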

Author information

Corresponding author

Correspondence to Eve Bélisle.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Bélisle, E., Huang, Z., Gheribi, A. (2015). Truth Discovery in Material Science Databases. In: Sharaf, M., Cheema, M., Qi, J. (eds) Databases Theory and Applications. ADC 2015. Lecture Notes in Computer Science, vol 9093. Springer, Cham. https://doi.org/10.1007/978-3-319-19548-3_22

  • DOI: https://doi.org/10.1007/978-3-319-19548-3_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19547-6

  • Online ISBN: 978-3-319-19548-3

  • eBook Packages: Computer Science, Computer Science (R0)
