Finding All Shortest Meaningful Meta-Paths Between Two Vertices of a Secured Large Heterogeneous Information Network Using Distributed Algorithm

Chapter in: Robotics and AI for Cybersecurity and Critical Infrastructure in Smart Cities

Part of the book series: Studies in Computational Intelligence (SCI, volume 1030)

Abstract

Discovering relationships between vertices in a secured information network is an important task in information network analysis. In a heterogeneous information network (HIN), a meta-path is a sequence of vertex types and edge types connecting two vertices. A path instance of a meta-path is a path in the HIN that satisfies the meta-path. The length of a meta-path is the number of relations (edges) it contains. A meaningful meta-path is a meta-path with at least one path instance, and a shortest meaningful meta-path is a meaningful meta-path of minimum length. Recent work on meta-path discovery focuses mainly on in-memory algorithms that run on a single computer. In this chapter, we propose distributed algorithms, implemented on Apache Spark, to discover all shortest meaningful meta-paths between two vertices of a large HIN. We employ a scalable implementation of the Distributed Breadth-First Search (D-BFS) algorithm as a baseline approach. Because finding all possible shortest paths in a large HIN can be time consuming, we propose a novel algorithm called shortest meaningful meta-path based search (S-MPS). S-MPS first searches for all shortest meta-path candidates between vertices in the graph of the network schema of the HIN. We conduct experiments on the DBLP data set to demonstrate the efficiency of the proposed S-MPS algorithm relative to D-BFS.
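The definitions in the abstract can be illustrated with a small single-machine sketch. It is not the chapter's Spark implementation of S-MPS or D-BFS. It only mirrors the idea stated above: enumerate meta-path candidates on the network schema in order of increasing length, and keep those with at least one path instance between the two vertices. The toy data, all function names, and the `max_len` cut-off are hypothetical. For brevity a meta-path is written as a sequence of vertex types only (assuming one edge type per pair of vertex types), and a path instance is allowed to revisit vertices.

```python
from collections import defaultdict

# Hypothetical toy HIN: each vertex has a type, edges are undirected.
vertex_type = {
    "a1": "Author", "a2": "Author",
    "p1": "Paper", "p2": "Paper",
    "v1": "Venue",
}
edges = [("a1", "p1"), ("a2", "p2"), ("p1", "v1"), ("p2", "v1")]


def build_adjacency(edges):
    """Vertex-level adjacency of the HIN."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def build_schema(vertex_type, edges):
    """Network schema: which vertex types are linked by at least one edge."""
    schema = defaultdict(set)
    for u, v in edges:
        schema[vertex_type[u]].add(vertex_type[v])
        schema[vertex_type[v]].add(vertex_type[u])
    return schema


def schema_paths(schema, src_type, dst_type, length):
    """All type sequences with `length` edges from src_type to dst_type."""
    paths = [[src_type]]
    for _ in range(length):
        paths = [p + [t] for p in paths for t in sorted(schema[p[-1]])]
    return [tuple(p) for p in paths if p[-1] == dst_type]


def has_instance(adj, vertex_type, meta_path, src, dst):
    """True if some path instance of meta_path links src to dst."""
    frontier = {src}
    for t in meta_path[1:]:
        # Expand one hop, keeping only neighbours of the required type.
        frontier = {n for v in frontier for n in adj[v] if vertex_type[n] == t}
        if not frontier:
            return False
    return dst in frontier


def shortest_meaningful_meta_paths(vertex_type, edges, src, dst, max_len=6):
    """Meaningful meta-paths of minimum length between src and dst."""
    adj = build_adjacency(edges)
    schema = build_schema(vertex_type, edges)
    src_t, dst_t = vertex_type[src], vertex_type[dst]
    for length in range(1, max_len + 1):
        found = [mp for mp in schema_paths(schema, src_t, dst_t, length)
                 if has_instance(adj, vertex_type, mp, src, dst)]
        if found:  # the first length with a meaningful meta-path is the shortest
            return found
    return []


print(shortest_meaningful_meta_paths(vertex_type, edges, "a1", "a2"))
# [('Author', 'Paper', 'Venue', 'Paper', 'Author')]
```

In this toy network the candidate Author, Paper, Author is rejected because the two authors share no paper, so the only shortest meaningful meta-path has length 4. Searching the small schema graph first keeps the number of candidates checked against the much larger instance graph low, which is presumably the motivation for the schema-first step described in the abstract.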

Acknowledgements

This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCMC) under the grant number DS2020-26-01.

Author information

Corresponding author

Correspondence to Phuc Do.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Do, P. (2022). Finding All Shortest Meaningful Meta-Paths Between Two Vertices of a Secured Large Heterogeneous Information Network Using Distributed Algorithm. In: Nedjah, N., Abd El-Latif, A.A., Gupta, B.B., Mourelle, L.M. (eds) Robotics and AI for Cybersecurity and Critical Infrastructure in Smart Cities. Studies in Computational Intelligence, vol 1030. Springer, Cham. https://doi.org/10.1007/978-3-030-96737-6_10
