Using Knowledge Graph to Handle Label Imperfection

Liu, Yi; Li, Huakang; Chen, Yizheng

doi:10.1007/978-3-319-13186-3_32

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8643)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Classification performance depends heavily on data quality, yet in the real world label noise inevitably exists because of data entry errors, transmission errors, and the subjectivity of taggers. Various methods have been proposed to deal with label imperfection, including robust algorithms that avoid overfitting, filtering mechanisms that remove noise, and correction mechanisms that revise noisy labels. In this paper, we propose a knowledge-graph-based approach to detect and correct label errors in training data. Experiments on a medical Q&A data set show that the approach can effectively improve both classification performance and data quality. The results also show that the approach works at relatively high noise levels and can be applied to other data mining tasks that demand deep understanding.
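The difference between the filtering and correction mechanisms mentioned in the abstract can be illustrated with a small sketch. This is not the knowledge-graph method proposed in the paper, whose details are not given here; it swaps in a plain nearest-neighbour vote as a stand-in noise detector. The function names (`knn_vote`, `filter_noise`, `correct_noise`), the choice of `k`, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def knn_vote(points, labels, i, k=3):
    """Majority label among the k nearest neighbours of point i (excluding i)."""
    # Squared Euclidean distance from point i to every other point.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
        for j in range(len(points))
        if j != i
    )
    votes = Counter(labels[j] for _, j in dists[:k])
    return votes.most_common(1)[0][0]


def filter_noise(points, labels, k=3):
    """Filtering: drop instances whose label disagrees with the neighbour vote."""
    keep = [
        i for i in range(len(points))
        if knn_vote(points, labels, i, k) == labels[i]
    ]
    return [points[i] for i in keep], [labels[i] for i in keep]


def correct_noise(points, labels, k=3):
    """Correction: keep every instance but relabel it with the neighbour vote."""
    return [knn_vote(points, labels, i, k) for i in range(len(points))]


# Hypothetical toy data: two clusters, with (1, 1) mislabeled as "b".
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["a", "a", "a", "b", "b", "b", "b", "b"]

filtered_points, filtered_labels = filter_noise(points, labels)
corrected_labels = correct_noise(points, labels)
```

Filtering shrinks the training set by discarding the suspicious instance, while correction keeps all instances and changes only the suspicious label. Which trade-off is preferable depends on how much data is available and how reliable the detector is.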

Acknowledgements

This work was supported by the NSFC (No. 61272099, 61261160502 and 61202025), Shanghai Excellent Academic Leaders Plan (No. 11XD1402900), the Program for Changjiang Scholars and Innovative Research Team in University of China (IRT1158, PCSIRT), the Scientific Innovation Act of STCSM (No. 13511504200), Singapore NRF (CREATE E2S2), and the EU FP7 CLIMBER project (No. PIRSES-GA-2012-318939).

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Yi Liu & Yizheng Chen
Department of Computer Science and Technology, School of Computer Science and Technology, School of Software, Nanjing University of Posts and Telecommunications, Nanjing, China
Huakang Li

Corresponding author

Correspondence to Yi Liu.

Editor information

Editors and Affiliations

National Chiao Tung University, Hsinchu, Taiwan
Wen-Chih Peng
Google Research, Mountain View, California, USA
Haixun Wang
University of Melbourne, Melbourne, Victoria, Australia
James Bailey
National Cheng Kung University, Tainan, Taiwan
Vincent S. Tseng
Japan Advanced Institute of Science and Technology, Nomi City, Japan
Tu Bao Ho
Nanjing University, Nanjing, China
Zhi-Hua Zhou
National Chengchi University, Taipei, Taiwan
Arbee L.P. Chen

Cite this paper

Liu, Y., Li, H., Chen, Y. (2014). Using Knowledge Graph to Handle Label Imperfection. In: Peng, WC., et al. Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2014. Lecture Notes in Computer Science, vol 8643. Springer, Cham. https://doi.org/10.1007/978-3-319-13186-3_32

DOI: https://doi.org/10.1007/978-3-319-13186-3_32
Published: 26 November 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13185-6
Online ISBN: 978-3-319-13186-3
eBook Packages: Computer Science, Computer Science (R0)
