Weight Aware Feature Enriched Biomedical Lexical Answer Type Prediction

Peng, Keqin; Rong, Wenge; Li, Chen; Hu, Jiahao; Xiong, Zhang

doi:10.1007/978-3-030-63836-8_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12534)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Lexical Answer Type (LAT) prediction is an essential part of question classification. It assigns lexical answer types to a question in order to narrow the search space and improve the classifier’s performance. LAT prediction is challenging in the biomedical domain because it is a multi-label classification problem: a question can have more than one label. In this paper, we employ the Label Powerset method to transform the multi-label classification problem into a multi-class classification problem. We then introduce a random-forest-based mechanism that partitions the features into used (important) and unused (unimportant) sets with corresponding weights. Furthermore, assuming that the unimportant features are not useless, we employ principal component analysis to extract information from the unused feature set. Combining these two types of features, our experimental study on the BioMedLAT dataset demonstrates the method’s potential.
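
As an illustration of the Label Powerset step only, the following minimal sketch maps each distinct combination of labels to a single class id, so that an ordinary multi-class classifier can be trained on the result. It is not the authors' implementation (their code is linked in the Notes); the function names and the toy label sets are hypothetical and are not taken from the BioMedLAT dataset.

```python
from typing import Dict, FrozenSet, Iterable, List


def label_powerset_fit(label_sets: Iterable[Iterable[str]]) -> Dict[FrozenSet[str], int]:
    """Assign one class id to each distinct label combination (first-seen order)."""
    mapping: Dict[FrozenSet[str], int] = {}
    for labels in label_sets:
        key = frozenset(labels)  # order of labels within a set does not matter
        if key not in mapping:
            mapping[key] = len(mapping)
    return mapping


def label_powerset_transform(
    label_sets: Iterable[Iterable[str]], mapping: Dict[FrozenSet[str], int]
) -> List[int]:
    """Turn multi-label targets into single multi-class targets."""
    return [mapping[frozenset(labels)] for labels in label_sets]


def label_powerset_inverse(
    class_ids: Iterable[int], mapping: Dict[FrozenSet[str], int]
) -> List[List[str]]:
    """Recover the (sorted) label set behind each predicted class id."""
    inverse = {cid: sorted(key) for key, cid in mapping.items()}
    return [inverse[cid] for cid in class_ids]


# Hypothetical toy targets: each question carries one or more answer types.
y_multi = [["gene", "protein"], ["disease"], ["protein", "gene"], ["disease", "gene"]]

mapping = label_powerset_fit(y_multi)
y_single = label_powerset_transform(y_multi, mapping)

print(y_single)  # [0, 1, 0, 2]
print(label_powerset_inverse(y_single, mapping))
```

One consequence of this transformation is that a classifier trained on the resulting class ids can only predict label combinations that occurred in the training data.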

Notes

1. Our source code is available at https://github.com/Romainpkq/LATPrediction.
2. https://github.com/clir/clearnlp.
3. https://github.com/wasimbhalli/Multi-label-Biomedical-QC-Corpus.

Acknowledgement

This work was partially supported by State Key Laboratory of Software Development Environment of China (No. SKLSDE-2019ZX-16).

Author information

Authors and Affiliations

State Key Laboratory of Software Development Environment, Beihang University, Beijing, 100191, China
Keqin Peng, Wenge Rong, Chen Li, Jiahao Hu & Zhang Xiong
Sino-French Engineer School, Beihang University, Beijing, 100191, China
Keqin Peng, Chen Li & Jiahao Hu
School of Computer Science and Engineering, Beihang University, Beijing, China
Wenge Rong & Zhang Xiong

Corresponding author

Correspondence to Wenge Rong.

Editor information

Editors and Affiliations

Department of AI, Ping An Life, Shenzhen, China
Haiqin Yang
Faculty of Information Technology, King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand
Kitsuchart Pasupa
City University of Hong Kong, Kowloon, Hong Kong
Andrew Chi-Sing Leung
Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong, Hong Kong
James T. Kwok
School of Information Technology, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
Jonathan H. Chan
The Chinese University of Hong Kong, New Territories, Hong Kong
Irwin King

Cite this paper

Peng, K., Rong, W., Li, C., Hu, J., Xiong, Z. (2020). Weight Aware Feature Enriched Biomedical Lexical Answer Type Prediction. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol 12534. Springer, Cham. https://doi.org/10.1007/978-3-030-63836-8_6

DOI: https://doi.org/10.1007/978-3-030-63836-8_6
Published: 19 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63835-1
Online ISBN: 978-3-030-63836-8
eBook Packages: Computer Science, Computer Science (R0)
