Data Augmentation for Fairness in Personal Knowledge Base Population

Vannur, Lingraj S.; Ganesan, Balaji; Nagalapatti, Lokesh; Patel, Hima; Tippeswamy, M. N.

doi:10.1007/978-3-030-75015-2_15

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12705)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Cold start knowledge base population (KBP) is the problem of populating a knowledge base from unstructured documents. While neural networks have led to improvements in the different tasks that are part of KBP, the overall F1 of the end-to-end system remains quite low. This problem is more acute in personal knowledge bases, which present additional challenges with regard to data protection, fairness and privacy. In this work, we use data augmentation to populate a more complete personal knowledge base from the TACRED dataset. We then use explainability techniques and representative set sampling to show that the augmented knowledge base is more fair and diverse as well.
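
The abstract does not say which augmentation technique is used. As a rough illustration only, the sketch below shows one generic form of data augmentation for fairness: counterfactual substitution of gendered terms in relation-extraction sentences. The swap table, the function names `counterfactual` and `augment`, and the example sentences are all hypothetical and are not taken from the paper; the relation labels merely follow the TACRED naming style.

```python
import re

# Hypothetical swap pairs for illustration; not taken from the paper.
# Ambiguous pronouns such as "her" (him/his) are deliberately left out,
# because resolving them needs part-of-speech information.
PAIRS = [
    ("he", "she"),
    ("father", "mother"),
    ("son", "daughter"),
    ("brother", "sister"),
    ("husband", "wife"),
]
SWAPS = {}
for a, b in PAIRS:
    SWAPS[a] = b
    SWAPS[b] = a

_WORD = re.compile(r"\b[A-Za-z]+\b")


def counterfactual(sentence: str) -> str:
    """Return the sentence with each listed gendered term swapped."""
    def swap(match):
        word = match.group(0)
        target = SWAPS.get(word.lower())
        if target is None:
            return word
        # Keep an initial capital so sentence-initial words stay capitalised.
        return target.capitalize() if word[0].isupper() else target
    return _WORD.sub(swap, sentence)


def augment(examples):
    """Keep every (sentence, relation) pair and add a swapped copy
    whenever the swap actually changes the sentence."""
    out = []
    for sentence, relation in examples:
        out.append((sentence, relation))
        flipped = counterfactual(sentence)
        if flipped != sentence:
            out.append((flipped, relation))
    return out


if __name__ == "__main__":
    data = [
        ("He is the father of Alice.", "per:children"),
        ("Acme hired Bob.", "per:employee_of"),
    ]
    for sentence, relation in augment(data):
        print(relation, "|", sentence)
```

The relation label is copied unchanged to the swapped sentence, which is only safe when the label itself is gender-neutral, as in these examples.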

Acknowledgement

This work was done as part of the Global Remote Mentoring initiative of IBM University Relations to promote undergraduate student research. We thank Kalapriya Kannan, Dinesh Garg, Poornima Iyengar, Kranti Athalye, and Nitte Meenakshi Institute of Technology for their support.

Author information

Authors and Affiliations

Nitte Meenakshi Institute of Technology, Bengaluru, India
Lingraj S. Vannur & M. N. Tippeswamy
IBM Research, Bengaluru, India
Balaji Ganesan, Lokesh Nagalapatti & Hima Patel

Corresponding author

Correspondence to Balaji Ganesan.

Editor information

Editors and Affiliations

Microsoft, Hyderabad, India
Manish Gupta
Indian Institute of Technology Bombay, Mumbai, India
Ganesh Ramakrishnan

About this paper

Cite this paper

Vannur, L.S., Ganesan, B., Nagalapatti, L., Patel, H., Tippeswamy, M.N. (2021). Data Augmentation for Fairness in Personal Knowledge Base Population. In: Gupta, M., Ramakrishnan, G. (eds) Trends and Applications in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12705. Springer, Cham. https://doi.org/10.1007/978-3-030-75015-2_15

DOI: https://doi.org/10.1007/978-3-030-75015-2_15
Published: 03 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75014-5
Online ISBN: 978-3-030-75015-2
eBook Packages: Computer Science, Computer Science (R0)
