MTLAT: A Multi-Task Learning Framework Based on Adversarial Training for Chinese Cybersecurity NER

  • Conference paper
Network and Parallel Computing (NPC 2020)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12639)

Abstract

As the volume of cybersecurity text continues to grow, Chinese cybersecurity named entity recognition (NER) is becoming increasingly important. Chinese cybersecurity texts, however, contain not only a large number of specialized security-domain entities but also many English person and organization entities, as well as a large number of mixed Chinese-English entities. Because Chinese cybersecurity NER is a domain-specific task and current models rarely focus on the cybersecurity domain, they cannot extract these entities well. To tackle these issues, we propose a Multi-Task Learning framework based on Adversarial Training (MTLAT) to improve the performance of Chinese cybersecurity NER. Extensive experimental results show that our model, which uses no external resources other than static word embeddings, outperforms state-of-the-art systems on the Chinese cybersecurity dataset. Moreover, our model outperforms the BiLSTM-CRF method on the Weibo, Resume, and MSRA Chinese general-domain NER datasets by 4.1%, 1.04%, and 1.79% F1, respectively, which demonstrates the generality of our model across domains.

Notes

  1. https://github.com/xuanzebi/MTLAT

  2. https://github.com/xiebo123/NER

Acknowledgments

This research is supported by the National Key Research and Development Program of China (No. 2019QY1303, No. 2019QY1301, No. 2018YFB0803602), the Strategic Priority Research Program of the Chinese Academy of Sciences (No. XDC02040100), and the National Natural Science Foundation of China (No. 61702508, No. 61802404). This work is also supported by the Program of the Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences, and the Program of the Beijing Key Laboratory of Network Security and Protection Technology.

Author information

Corresponding author

Correspondence to Ning Li.

Copyright information

© 2021 IFIP International Federation for Information Processing

About this paper

Cite this paper

Han, Y. et al. (2021). MTLAT: A Multi-Task Learning Framework Based on Adversarial Training for Chinese Cybersecurity NER. In: He, X., Shao, E., Tan, G. (eds) Network and Parallel Computing. NPC 2020. Lecture Notes in Computer Science, vol 12639. Springer, Cham. https://doi.org/10.1007/978-3-030-79478-1_4

  • DOI: https://doi.org/10.1007/978-3-030-79478-1_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-79477-4

  • Online ISBN: 978-3-030-79478-1

  • eBook Packages: Computer Science, Computer Science (R0)
