Abstract
In this paper, we propose a novel deep attention neural network for multi-task learning. The network is composed of multiple dual-attention layers, each combining attention over neighbors with attention over tasks. The neighbor attention layer represents each data point by attending over its neighboring data points: the layer's output for a data point is the weighted average of its neighbors' inputs, with the weights computed from the similarity between the data point and each neighbor. The task attention layer takes the output of the neighbor attention layer as input, transforms it into multiple task-specific representations of the input data point, and uses an attention mechanism to compute the outputs for the different tasks. The output for a data point on a given task is a weighted average over all the task-specific representations, with weights based on the similarity between the target task and the other tasks. The outputs of the neighbor attention layer and the task attention layer are concatenated to form the output of one dual-attention layer. To train the parameters of the network, we minimize the classification losses while encouraging correlation among the different tasks. Experiments on multi-task learning benchmarks show the advantage of the proposed method.
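The two layers described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the dot-product similarity, the softmax normalization, and all function and variable names (`neighbor_attention`, `task_attention`, `W_tasks`) are assumptions chosen to make the weighted-average structure of the two layers concrete.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def neighbor_attention(X):
    """Re-represent each point as a similarity-weighted average of the points.

    Similarity is taken as a dot product here (an assumption)."""
    weights = softmax(X @ X.T)   # (n, n) attention weights over neighbors
    return weights @ X           # (n, d) weighted averages of neighbors' inputs

def task_attention(H, W_tasks):
    """Map shared features to task-specific representations, then attend over tasks."""
    T = len(W_tasks)
    reps = np.stack([H @ W for W in W_tasks])  # (T, n, d) task-specific representations
    outputs = []
    for t in range(T):
        # similarity between the target task t and every task, per data point
        sims = np.stack([(reps[t] * reps[s]).sum(axis=1) for s in range(T)], axis=1)
        alpha = softmax(sims, axis=1)               # (n, T) attention over tasks
        out = np.einsum('nt,tnd->nd', alpha, reps)  # weighted average over tasks
        outputs.append(out)
    return outputs

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))                       # 8 data points, 4 features
H = neighbor_attention(X)
W_tasks = [rng.normal(size=(4, 4)) for _ in range(3)]
outs = task_attention(H, W_tasks)
# one dual-attention layer: concatenate the two layers' outputs per task
dual = [np.concatenate([H, o], axis=1) for o in outs]
```

Stacking several such layers, with the concatenated output feeding the next layer's neighbor attention, would give the multi-layer network the abstract describes; the training objective (classification losses plus a task-correlation term) is omitted here.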
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Liang, G., Mo, H., Qiao, Y., Wang, C., Wang, J.Y. (2020). Paying Deep Attention to Both Neighbors and Multiple Tasks. In: Huang, D.S., Bevilacqua, V., Hussain, A. (eds.) Intelligent Computing Theories and Application. ICIC 2020. Lecture Notes in Computer Science, vol. 12463. Springer, Cham. https://doi.org/10.1007/978-3-030-60799-9_12
Print ISBN: 978-3-030-60798-2
Online ISBN: 978-3-030-60799-9