
Paying Deep Attention to Both Neighbors and Multiple Tasks

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 12463))

Abstract

In this paper, we propose a novel deep attention neural network for multi-task learning. The network is composed of multiple dual-attention layers, each attending over both neighbors and tasks. The neighbor attention layer represents each data point by attending over its neighboring data points: its output for a data point is the weighted average of the neighbors’ inputs, with weighting scores computed from the similarity between the data point and each neighbor. The task attention layer takes the output of the neighbor attention layer as input, transforms it into multiple task-specific representations of the input data point, and uses an attention mechanism to calculate the outputs for the different tasks. The output for a given task is a weighted average over all the task-specific representations, with weighting scores based on the similarity between the target task and the other tasks. The outputs of the neighbor attention layer and the task attention layer are concatenated to form the output of one dual-attention layer. To train the network’s parameters, we minimize the classification losses while encouraging correlation among the different tasks. Experiments on multi-task learning benchmarks show the advantage of the proposed method.
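The two attention steps described in the abstract can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the per-task projection matrices (`task_weights`), the dot-product similarity, and all shapes are hypothetical choices made only to show the weighted-averaging structure of the neighbor and task attention layers.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def neighbor_attention(X):
    """Re-represent each data point as a similarity-weighted
    average of all data points (its 'neighbors')."""
    scores = softmax(X @ X.T)   # pairwise similarity -> weighting scores
    return scores @ X           # weighted average of neighbors' inputs

def task_attention(H, task_weights):
    """Map the neighbor-attention output H to task-specific
    representations, then combine them per target task by
    similarity-weighted averaging."""
    # task_weights: (T, d, d), one (hypothetical) projection per task
    reps = np.stack([H @ W for W in task_weights])      # (T, n, d)
    outputs = []
    for t in range(len(task_weights)):
        # similarity of target task t to every task, per data point
        sim = np.einsum('nd,tnd->nt', reps[t], reps)    # (n, T)
        alpha = softmax(sim, axis=1)
        outputs.append(np.einsum('nt,tnd->nd', alpha, reps))
    return outputs

# Example: 5 data points, 4 features, 3 tasks
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
W = rng.standard_normal((3, 4, 4))
H = neighbor_attention(X)
outs = task_attention(H, W)
# Concatenate the two layers' outputs, as in one dual-attention layer
dual = [np.concatenate([H, o], axis=1) for o in outs]
```

Stacking several such dual-attention layers, with a classification loss plus a task-correlation term, would mirror the training setup the abstract describes.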




Corresponding author

Correspondence to Haoran Mo.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Liang, G., Mo, H., Qiao, Y., Wang, C., Wang, JY. (2020). Paying Deep Attention to Both Neighbors and Multiple Tasks. In: Huang, DS., Bevilacqua, V., Hussain, A. (eds) Intelligent Computing Theories and Application. ICIC 2020. Lecture Notes in Computer Science(), vol 12463. Springer, Cham. https://doi.org/10.1007/978-3-030-60799-9_12


  • DOI: https://doi.org/10.1007/978-3-030-60799-9_12

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60798-2

  • Online ISBN: 978-3-030-60799-9

  • eBook Packages: Computer Science (R0)
