Abstract
With the popularity of mobile devices, a large number of mobile applications (a.k.a. "apps") have been developed and published. Detecting similar apps in a large pool of apps is a fundamental and important task with benefits for many purposes. Several existing works combine different kinds of app metadata to measure the similarity between apps. However, few of them pay attention to the roles this service is meant to serve. Moreover, existing methods do not distinguish the characteristics of the content in the metadata. It is therefore hard to obtain accurate semantic representations of apps and to capture their fine-grained correlations. In this paper, we propose a novel framework based on knowledge graph (KG) techniques and a hybrid embedding strategy to fill the above gaps. For the construction of the KG, we design a lightweight ontology tailored to the needs of cybersecurity analysts. Thanks to the defined schema, more links can be shared among apps. To detect similar apps, we divide the relations in the KG into structured and unstructured ones according to their related content. The TextRank algorithm is then employed to extract important tokens from unstructured texts and transform them into structured triples. In this way, the representations of apps in our framework can be learned iteratively by combining KG embedding methods and network embedding (NE) models, which improves the performance of similar app detection. Preliminary results indicate the effectiveness of our method compared with existing models in terms of reciprocal ranking and minimum ranking.
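The step that turns unstructured text into structured triples can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified TextRank (PageRank over an undirected token co-occurrence graph), and the stopword list, the window size, the relation name `hasKeyword`, and the app identifier are all hypothetical choices made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "the", "for", "of", "to", "with", "your", "in", "on", "is", "it"}


def textrank_keywords(text, top_k=3, window=2, damping=0.85, iterations=50):
    """Rank tokens by PageRank over an undirected co-occurrence graph."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

    # Link every token to the tokens that follow it within the window.
    neighbors = defaultdict(set)
    for i, tok in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != tok:
                neighbors[tok].add(other)
                neighbors[other].add(tok)

    # Synchronous PageRank updates; every node here has at least one neighbor.
    scores = {tok: 1.0 for tok in neighbors}
    for _ in range(iterations):
        scores = {
            tok: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[tok])
            for tok in neighbors
        }

    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_k]


def description_to_triples(app_id, description, relation="hasKeyword", top_k=3):
    """Turn the top-ranked tokens of a description into (head, relation, tail) triples."""
    return [(app_id, relation, kw) for kw in textrank_keywords(description, top_k=top_k)]


triples = description_to_triples(
    "com.example.chat",  # hypothetical app identifier
    "Secure chat app. Chat with friends, share photos, and chat in groups.",
    top_k=2,
)
for triple in triples:
    print(triple)
```

Once keywords are expressed as triples, two apps whose descriptions share an important token become connected through a common tail entity, which is the kind of extra linkage the abstract refers to.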
Notes
- 8. In this paper, KG embedding relies on translation-based methods, and NE models mainly consider the effects of the out-neighbors of nodes in the network.
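The two notions mentioned in note 8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: it uses a TransE-style distance as one well-known instance of a translation-based method, and the toy vectors, edge list, and function names are hypothetical.

```python
import math


def transe_score(head, relation, tail):
    """TransE-style distance ||h + r - t||_2; a lower value means a more plausible triple."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def out_neighbors(edges, node):
    """Nodes reachable from `node` by following one directed edge (head, tail)."""
    return sorted(tail for head, tail in edges if head == node)


# Toy 2-dimensional embeddings (assumed values, chosen to be exact in floating point).
app = [1.0, 0.0]
has_keyword = [0.5, 0.5]
good_tail = [1.5, 0.5]   # equals app + has_keyword, so the distance is 0
bad_tail = [-1.0, 2.0]

good = transe_score(app, has_keyword, good_tail)
bad = transe_score(app, has_keyword, bad_tail)
print(good < bad)

# Toy directed network over apps: only outgoing edges of "a" are considered.
edges = [("a", "b"), ("a", "c"), ("b", "a")]
print(out_neighbors(edges, "a"))
```

In a translation-based model, a relation acts as a vector offset from head to tail, so plausible triples score close to zero; restricting a network embedding model to out-neighbors means that only a node's outgoing edges define its context.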
Acknowledgements
This work was partially supported by the Natural Science Foundation of China grants (U1736204, 61906037) and the National 242 Information Security Plan grant (6909001165).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, W., Zhang, B., Xu, L., Wang, M., Luo, A., Niu, Y. (2020). Combining Knowledge Graph Embedding and Network Embedding for Detecting Similar Mobile Applications. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol 12430. Springer, Cham. https://doi.org/10.1007/978-3-030-60450-9_21
DOI: https://doi.org/10.1007/978-3-030-60450-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60449-3
Online ISBN: 978-3-030-60450-9
eBook Packages: Computer Science, Computer Science (R0)