Weight Aware Feature Enriched Biomedical Lexical Answer Type Prediction

  • Conference paper
Neural Information Processing (ICONIP 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12534)

Abstract

Lexical Answer Type (LAT) prediction is an essential part of question classification. It aims to assign a lexical answer type to each question in order to narrow down the search space and improve the classifier’s performance. LAT prediction is challenging in the biomedical domain because it is essentially a multi-label classification problem: a question may have more than one label. In this paper, we employ the Label Powerset method to transform the multi-label classification problem into a multi-class classification problem. We then introduce a random forest based mechanism that partitions the features into used (important) and unused (unimportant) sets with corresponding weights. Furthermore, assuming that the unimportant features are not useless, we employ principal component analysis to extract information from the unused feature set. By combining these two types of features, our experimental study on the BioMedLAT dataset demonstrates the method’s potential.
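
The Label Powerset transformation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `label_powerset_encode` and `label_powerset_decode` and the example labels are hypothetical, chosen only to show how each distinct label combination becomes a single class in a multi-class problem.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def label_powerset_encode(
    label_sets: Iterable[Iterable[str]],
) -> Tuple[List[int], Dict[FrozenSet[str], int]]:
    """Map each distinct label set to one class id (Label Powerset)."""
    class_of: Dict[FrozenSet[str], int] = {}
    encoded: List[int] = []
    for labels in label_sets:
        key = frozenset(labels)  # order of labels inside a set is irrelevant
        # A label combination seen for the first time gets the next free id.
        if key not in class_of:
            class_of[key] = len(class_of)
        encoded.append(class_of[key])
    return encoded, class_of


def label_powerset_decode(
    class_ids: Iterable[int], class_of: Dict[FrozenSet[str], int]
) -> List[Set[str]]:
    """Turn predicted class ids back into the label sets they stand for."""
    id_to_labels = {cid: labels for labels, cid in class_of.items()}
    return [set(id_to_labels[c]) for c in class_ids]


# Hypothetical questions with made-up answer types (not taken from BioMedLAT).
question_labels = [
    {"gene", "protein"},
    {"disease"},
    {"protein", "gene"},
    {"disease", "drug"},
]
y, mapping = label_powerset_encode(question_labels)
print(y)  # [0, 1, 0, 2]
```

Any multi-class classifier can then be trained on `y`; its predictions are mapped back to label sets with `label_powerset_decode`. A known limitation of this transformation is that only label combinations seen during training can ever be predicted.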


Notes

  1. Our source code is available at https://github.com/Romainpkq/LATPrediction.

  2. https://github.com/clir/clearnlp.

  3. https://github.com/wasimbhalli/Multi-label-Biomedical-QC-Corpus.


Acknowledgement

This work was partially supported by the State Key Laboratory of Software Development Environment of China (No. SKLSDE-2019ZX-16).

Corresponding author

Correspondence to Wenge Rong.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Peng, K., Rong, W., Li, C., Hu, J., Xiong, Z. (2020). Weight Aware Feature Enriched Biomedical Lexical Answer Type Prediction. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol 12534. Springer, Cham. https://doi.org/10.1007/978-3-030-63836-8_6

  • DOI: https://doi.org/10.1007/978-3-030-63836-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-63835-1

  • Online ISBN: 978-3-030-63836-8

  • eBook Packages: Computer Science, Computer Science (R0)
