Future Generation Computer Systems

Volume 76, November 2017, Pages 458-467

A semi-supervised social relationships inferred model based on mobile phone data

https://doi.org/10.1016/j.future.2016.11.027

Highlights

  • We extracted the mobile phone communication features from the network.

  • We used principal component analysis to achieve the dimensionality reduction.

  • We used the co-training style semi-supervised algorithm to train two classifiers.

  • We used the classifiers to obtain the relationship labels.

Abstract

Exploring human relationships is an important topic in the study of mobile communication networks. However, relationship prediction accuracy is inadequate when the number of known relationship labels (e.g., “friend” and “colleague”) is small, especially when the relation classes in the mobile communication network are imbalanced. To deal with these issues, we present a semi-supervised model for inferring social relationships. The model infers relationships from a large amount of unlabeled data and a small amount of labeled data. It is a co-training style semi-supervised model that combines a support vector machine with naive Bayes, and the final relationship labels are decided by the two classifiers. The proposed model is evaluated on a real mobile communication network dataset, and the experimental results show that it is effective for relationship mining, especially when the relationship network is in a stable state.

Introduction

The increase in mobile subscriptions has inevitably brought about a sharp increase in the amount of communication data, such as sensor data records, search records, and social records. Using this information to mine users’ behavior patterns [1], [2] and social relationships [3], [4], [5] has become a hot topic in pervasive computing. Social relationships are an important attribute of individuals in a social network. During the past decade, some researchers have used proximity sensors to mine location-based social networks [6], [7], and others have used online social network services to mine communities [8].

Among these social networks, knowing the relationships among users in the mobile phone communication network can bring great benefits. It enables personalized service recommendations based on relationships, a better understanding of changes in the dynamics of the social structure, automatic grouping of phone contacts, and so on [9]. Nowadays, every mobile phone can group contacts, but this functionality is hardly used. A survey by Grob et al. [10] showed that only 16% of mobile phone users create any contact groups. In addition, users of social network sites find it laborious to construct social groups (e.g. ‘circles’ on Google+, and ‘lists’ on Facebook and Twitter) [11]. We think the reason is that categorizing relationships with contacts is time-consuming and requires considerable effort. How to classify a user’s relationships is therefore a worthwhile subject of study [12]. Because relationship labels in the communication network are seldom known, the challenge is how to use a small set of labeled relationships and a large amount of mobile phone communication information to infer the large set of unlabeled relationships in the mobile phone communication network.

To address this challenge, we propose a semi-supervised model that needs only a small set of labeled relationships and a large amount of mobile phone communication information to infer social relationships with high accuracy. First, we extracted mobile phone communication features from the mobile phone communication network. Second, we used principal component analysis (PCA) to reduce the dimensionality of the features. Third, we used a co-training style [13] semi-supervised algorithm to train two classifiers. Finally, we used these classifiers and the structure of the relationship network (for example, social balance [14]) to obtain the final relationship labels. The contributions of this paper can be summarized as follows:

  • A co-training style semi-supervised model for inferring social relationships is proposed.

  • We evaluated our model on a real dataset: MIT Reality Mining [15].

  • The average accuracy is higher than that of the supervised model when the labeled dataset is small, demonstrating that our method is more stable.

  • Our model achieved a greater improvement than other semi-supervised models, especially when the relationship network is in a stable state.
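
The co-training step described above, in which two classifiers exchange confident pseudo-labels, can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: a nearest-centroid classifier stands in for the support vector machine so that the example needs only the standard library, the `threshold` and `rounds` parameters and the probability-averaging rule in `predict` are hypothetical, the toy feature vectors are invented, and the PCA and social-balance steps are omitted.

```python
import math
from collections import defaultdict

def _group(X, y):
    """Collect feature vectors by class label."""
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[label].append(x)
    return groups

def _normalize(log_scores):
    """Convert per-class log scores into probabilities that sum to 1."""
    top = max(log_scores.values())
    exp = {k: math.exp(v - top) for k, v in log_scores.items()}
    total = sum(exp.values())
    return {k: v / total for k, v in exp.items()}

class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per feature and class."""
    def fit(self, X, y):
        self.stats = {}
        for label, rows in _group(X, y).items():
            cols = list(zip(*rows))
            means = [sum(c) / len(c) for c in cols]
            # Small constant keeps variances positive on tiny samples.
            variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-3
                         for c, m in zip(cols, means)]
            self.stats[label] = (len(rows) / len(X), means, variances)
        return self

    def predict_proba(self, x):
        scores = {}
        for label, (prior, means, variances) in self.stats.items():
            scores[label] = math.log(prior) + sum(
                -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
                for v, m, s in zip(x, means, variances))
        return _normalize(scores)

class NearestCentroid:
    """Stand-in for the SVM (assumption): a closer centroid scores higher."""
    def fit(self, X, y):
        self.centroids = {label: [sum(c) / len(c) for c in zip(*rows)]
                          for label, rows in _group(X, y).items()}
        return self

    def predict_proba(self, x):
        return _normalize({k: -math.dist(x, c) for k, c in self.centroids.items()})

def co_train(labeled, unlabeled, rounds=5, threshold=0.9):
    """labeled: list of (features, label); unlabeled: list of features."""
    train_a, train_b = list(labeled), list(labeled)
    pool = list(unlabeled)
    clf_a, clf_b = NearestCentroid(), GaussianNaiveBayes()
    for _ in range(rounds):
        clf_a.fit(*zip(*train_a))
        clf_b.fit(*zip(*train_b))
        remaining = []
        for x in pool:
            pa, pb = clf_a.predict_proba(x), clf_b.predict_proba(x)
            best_a, best_b = max(pa, key=pa.get), max(pb, key=pb.get)
            taught = False
            if pa[best_a] >= threshold:   # A is confident: pseudo-label for B
                train_b.append((x, best_a))
                taught = True
            if pb[best_b] >= threshold:   # B is confident: pseudo-label for A
                train_a.append((x, best_b))
                taught = True
            if not taught:
                remaining.append(x)
        if len(remaining) == len(pool):   # no confident pseudo-labels left
            break
        pool = remaining
    # Refit so both classifiers see the pseudo-labels from the last round.
    clf_a.fit(*zip(*train_a))
    clf_b.fit(*zip(*train_b))
    return clf_a, clf_b

def predict(clf_a, clf_b, x):
    """Combine both classifiers by averaging probabilities (assumed rule)."""
    pa, pb = clf_a.predict_proba(x), clf_b.predict_proba(x)
    avg = {k: (pa[k] + pb[k]) / 2 for k in pa}
    return max(avg, key=avg.get)

# Invented two-dimensional feature vectors for two relationship classes.
labeled = [([1.0, 0.2], "friend"), ([1.2, 0.1], "friend"),
           ([0.1, 1.0], "colleague"), ([0.2, 1.1], "colleague")]
unlabeled = [[1.1, 0.3], [0.0, 0.9], [0.9, 0.0], [0.3, 1.2]]
clf_a, clf_b = co_train(labeled, unlabeled)
print(predict(clf_a, clf_b, [1.0, 0.0]))
```

Each classifier keeps its own training set, so a pseudo-label proposed by one learner only ever enlarges the other learner's data. That exchange is what distinguishes co-training from simple self-training.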

The rest of this paper is organized as follows. We review related work in the next section. In Section 3, we introduce the framework of the model for inferring relationships from mobile communication network data. In Section 4, we describe the model formally and detail the inference process. In Section 5, we test and verify the model on a real dataset and analyze the experimental results. Finally, in Section 6, we conclude our work and discuss future work.

Related work

The social relationship is the core construct of sociology. Relational ties in a social network are channels for the transfer and “flow” of resources [16]. However, the known relationships in a social network are sparse, and it is necessary to infer relationships from information observed in the social network [17], [18]. Traditionally, tie strength prediction [17], [19], [20], [21] and specific semantic relationship inference [22], [23], [15] have been two

Model overview

From the mobile communication network shown in Fig. 1(a), we can obtain the mobile phone call pattern, the message sent and received pattern, the encounter pattern (inferred from Bluetooth), the location pattern, and so on. In this paper, all these patterns are called the mobile communication pattern (MCP). From the MCP, we can infer the relationships between two users shown in Fig. 1(b). For example, if user a and user b stayed in the same office for a long time every day and the relationship between
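
As a rough illustration of how raw records might be turned into per-pair communication patterns, the sketch below aggregates an event log into directed features. The event tuple layout, the feature names (`call_count`, `call_total`, and so on), and the sample events are all hypothetical; the features actually used in the paper are not listed in this excerpt.

```python
from collections import defaultdict

def extract_mcp(events):
    """Aggregate (kind, src, dst, amount) events into directed pair features.

    Returns {(src, dst): {"<kind>_count": n, "<kind>_total": sum of amounts}}.
    """
    features = defaultdict(lambda: defaultdict(float))
    for kind, src, dst, amount in events:
        features[(src, dst)][f"{kind}_count"] += 1
        features[(src, dst)][f"{kind}_total"] += amount
    return {pair: dict(values) for pair, values in features.items()}

# Invented sample log: amounts are seconds for calls and Bluetooth encounters,
# and message counts for SMS.
events = [
    ("call", "a", "b", 120.0),
    ("call", "a", "b", 60.0),
    ("sms", "b", "a", 1.0),
    ("bluetooth", "a", "b", 3600.0),
]
mcp = extract_mcp(events)
print(mcp[("a", "b")])
```

Keying the result by the ordered pair `(src, dst)` keeps the two directions of communication between the same users separate.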

Formal description of the semi-supervised model

A mobile communication network can be represented as $G=(R_L, R_U, E, V)$, where $R_L$ is the set of known relationships and $R_U$ is the set of unknown relationships, which is the result we want to obtain. The edge set $E$ is the MCP feature set of the user set $V$ in this network. Because the MCP from user $i$ to user $j$ may differ from the MCP from user $j$ to user $i$, $e_{i,j}$ and $e_{j,i}$ are not necessarily equal in $E$. Each $e_{i,j}$ is a multidimensional vector in which every dimension denotes a communication feature between users $i$ and $j$, as some
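
One possible in-memory layout for the four-part network tuple described above is sketched here. The class name `MobileNetwork`, its method names, and the choice to key relationship labels by ordered user pair are assumptions made for illustration, not details from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class MobileNetwork:
    users: set = field(default_factory=set)      # V: the user set
    edges: dict = field(default_factory=dict)    # E: (i, j) -> feature vector
    labeled: dict = field(default_factory=dict)  # known relationships: (i, j) -> label

    def add_edge(self, i, j, vector, label=None):
        """Store the directed feature vector from i to j, with an optional label."""
        self.users.update((i, j))
        self.edges[(i, j)] = list(vector)
        if label is not None:
            self.labeled[(i, j)] = label

    @property
    def unlabeled(self):
        """Unknown relationships: edges with features but no label yet."""
        return [pair for pair in self.edges if pair not in self.labeled]

g = MobileNetwork()
g.add_edge("a", "b", [2, 180.0], label="friend")
g.add_edge("b", "a", [1, 30.0])  # the reverse direction has its own features
print(g.unlabeled)
```

Storing `("a", "b")` and `("b", "a")` as separate keys reflects the point that the feature vector from one user to another need not equal the vector in the opposite direction.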

Experiment and discussion

In this section, we describe the experimental setup and the performance of the proposed model on a real dataset.

Conclusion

In this paper, we modeled relationship prediction based on mobile phone data. First, we described the framework for relationship prediction based on mobile phone data; then we focused on the relationship prediction step and proposed an SVM+NB+B semi-supervised model to infer relationships. The model combines two classic classification methods; in co-training style, the two classifiers select confident pseudo-labeled data for each other. We added graph structure information when

Acknowledgments

The work is partly supported by NSFC (No. 61472149), the Fundamental Research Funds for the Central Universities (2015QN67), the Wuhan Youth Science and Technology Plan (2016070204010132) and the National 863 Hi-Tech Research and Development Program under grant (2015AA01A203).

References (37)

  • D. Yao et al., Human mobility synthesis using matrix and tensor factorizations, Inf. Fusion (2015)
  • Z. Jiang et al., A hybrid generative/discriminative method for semi-supervised classification, Knowl.-Based Syst. (2013)
  • T. Huynh, M. Fritz, B. Schiele, Discovery of activity patterns using topic models, in: Proceedings of the 10th...
  • J. Bonneau, J. Anderson, R. Anderson, F. Stajano, Eight friends are enough: Social graph approximation via public...
  • W. Tang et al., Learning to infer social ties in large networks
  • N. Eagle et al., Inferring friendship network structure by using mobile phone data, Proc. Natl. Acad. Sci. (2009)
  • D. Quercia, L. Capra, Friendsensing: recommending friends using mobile phones, in: Proceedings of the 2009 ACM...
  • R. Zhang, Y. Zhang, J. Sun, G. Yan, Fine-grained private matching for proximity-based mobile social networking, in:...
  • B. Guo et al., Cross-community sensing and mining, IEEE Commun. Mag. (2014)
  • C. Howden et al., Virtual vignettes: the acquisition, analysis, and presentation of social network data, Sci. China Inf. Sci. (2014)
  • R. Grob, M. Kuhn, R. Wattenhofer, M. Wirz, Cluestr: Mobile social networking for enhanced group communication, in:...
  • J.J. McAuley, J. Leskovec, Learning to discover social circles in ego networks, in: Proceedings of the 26th Annual...
  • B. Zou et al., The learning performance of support vector machine classification based on Markov sampling, Sci. China Inf. Sci. (2013)
  • S.A. Goldman, Y. Zhou, Enhancing supervised learning with unlabeled data, in: Proceedings of the 17th International...
  • D. Easley et al., Networks, Crowds, and Markets: Reasoning about a Highly Connected World (2010)
  • N. Eagle et al., Reality mining: sensing complex social systems, Pers. Ubiquitous Comput. (2006)
  • X. Ruan, E.G. Ochieng, A.D.F. Price, The evaluation of social network analysis applications in the UK construction...
  • R.H. Binstock et al., Handbook of Aging and the Social Sciences (2011)