Abstract
Peer-to-peer storage services are a cost-effective alternative for data backup. A basic question in the design of such systems is: on which peers should redundant data be stored? Choosing appropriate peers for data backup matters at a microscopic level, where end users expect good performance such as quick access and high availability, and at a macroscopic level, for system optimization and fairness. Existing systems select peers randomly, through a distributed hash table (DHT), or according to the peers' past availability patterns. In this paper, we propose an alternative: a contextual trust based data placement scheme for selecting suitable data holders. The scheme is designed for scenarios in which historical information about peers is inadequate, a common situation in large-scale systems. Specifically, it estimates the trustworthiness of a peer from stereotypes, formed by aggregating information about interactions with other (similar) peers. Simulation experiments show that our placement scheme outperforms not only random selection but also schemes that use historical information, in terms of both achieved data availability and the bandwidth overhead needed to sustain the system.






Notes
Peers leave the system permanently.
Note that for a backup storage application, typically only the data owner needs access to the data. However, if multiple users are to access the data around the clock, storing it at peers in different time zones is desirable [28]. The proposed approach can be applied to such a scenario as well, but then the function needs to be changed.
Dealing with and preventing such malicious behaviors will require other security mechanisms, possibly including conventional trust based approaches. That is a somewhat orthogonal issue, beyond the scope of the presented work, where we are using trust abstraction as an alternative to explicit multi-objective optimization of the system.
The data transfer rate is influenced by many factors such as bandwidth, connection type, ISP latency, etc. However, for simplicity, we assume that the data transfer rate is only determined by bandwidth in the simulation.
We define a session as the process in which a peer joins the system, contributes resources, and leaves the system.
Note that in a real PlanetLab trace, a node's online pattern is already known, while in a synthetic trace, a node's availability is generated artificially according to its time zone, time-of-day effects, and a unique failure model characterized by the inter-arrival time and session length (see Sect. 4.1.2). It is hence, again, known a priori for the experiments.
References
Christopher, B., Kenneth, B., Arvind, S., Stanley, T.: pStore: a secure peer-to-peer backup system. Technical Memo MIT-LCS-TM-632, Massachusetts Institute of Technology, Laboratory for Computer Science (2002)
Cox, L.P., Murray, C.D., Noble, B.D.: Pastiche: making backup cheap and easy. In: Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI), pp. 285–298 (2002)
Cox, L.P., Noble, B.D.: Samsara: honor among thieves in peer-to-peer storage. In: Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (SOSP), pp. 120–132 (2003)
Landers, M., Zhang, H., Tan, K.-L.: Peerstore: better performance by relaxing in peer-to-peer backup. In: Proceedings of the Fourth International Conference on Peer-to-Peer Computing (P2P), pp. 72–79 (2004)
Lillibridge, M., Elnikety, S., Birrell, A., Burrows, M., Isard, M.: A cooperative internet backup scheme. In: Proceedings of the Annual Conference on USENIX Annual Technical Conference (ATEC), pp. 29–41 (2003)
Xin, Q., Schwarz, T., Miller, E.L.: Availability in global peer-to-peer storage systems. In: Proceedings of the 6th Workshop on Distributed Data and Structures (WDAS) (2004)
Patterson, D.A., Gibson, G., Katz, R.H.: A case for redundant arrays of inexpensive disks (RAID). In: Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data (SIGMOD), pp. 109–116 (1988)
Bhagwan, R., Tati, K., Cheng, Y.-C., Savage, S., Voelker, G.M.: Total recall: system support for automated availability management. In: Proceedings of the 1st Conference on Symposium on Networked Systems Design and Implementation (NSDI), pp. 337–350 (2004)
Datta, A., Aberer, K.: Internet-scale storage systems under churn—a steady state analysis. In: The Sixth IEEE International Conference on Peer-to-Peer Computing (P2P), pp. 133–144 (2006)
Sit, E., Haeberlen, A., Dabek, F., Chun, B.-G., Weatherspoon, H., Morris, R., Kaashoek, M.F., Kubiatowicz, J.: Proactive replication for data durability. In: Proceedings of the International Workshop on Peer-to-Peer Systems (IPTPS) (2006)
Duminuco, A., Biersack, E., En-Najjary, T.: Proactive replication in distributed storage systems using machine availability estimation. In CoNEXT ’07: Proceedings of the 2007 ACM CoNEXT Conference, pp. 1–12 (2007)
Liu, X., Datta, A., Rzadca, K., Lim, E.-P.: Stereotrust: a group based personalized trust model. In: Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM), pp. 7–16 (2009)
Stoica, I., Morris, R., Karger, D., Kaashoek, M.F., Balakrishnan, H.: Chord: a scalable peer-to-peer lookup service for internet applications. In: Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM), pp. 149–160 (2001)
Rowstron, A.I.T., Druschel, P.: Pastry: scalable, decentralized object location, and routing for large-scale peer-to-peer systems. In: Proceedings of the IFIP/ACM International Conference on Distributed Systems Platforms Heidelberg (Middleware), pp. 329–350 (2001)
Tran, D.N., Chiang, F., Li, J.: Friendstore: cooperative online backup using trusted nodes. In: Proceedings of the 1st Workshop on Social Network Systems (SocialNets), pp. 37–42 (2008)
Chun, B.-G., Dabek, F., Haeberlen, A., Sit, E., Weatherspoon, H., Kaashoek, M.F., Kubiatowicz, J., Morris, R.: Efficient replica maintenance for distributed storage systems. In: Proceedings of the 3rd Conference on Networked Systems Design and Implementation (NSDI), pp. 45–58, Berkeley, CA, USA. USENIX Association (2006)
Blond, S., Fessant, F., Merrer, E.: Finding good partners in availability-aware p2p networks. In: SSS ’09: Proceedings of the 11th International Symposium on Stabilization, Safety, and Security of Distributed Systems, pp. 472–484. Springer-Verlag, Berlin (2009)
Mui, L., Mohtashem, M.: A computational model of trust and reputation. In: Proceedings of the 35th Annual Hawaii International Conference on System Sciences (HICSS), pp. 2431–2439 (2002)
Jøsang, A., Ismail, R.: The beta reputation system. In: The 15th Bled Electronic Commerce Conference (2002)
Xiong, L., Liu, L.: Peertrust: supporting reputation-based trust for peer-to-peer electronic communities. IEEE Trans. Knowl. Data Eng. 16, 843–857 (2004)
Kubiatowicz, J., Bindel, D., Chen, Y., Czerwinski, S., Eaton, P., Geels, D., Gummadi, R., Rhea, S., Weatherspoon, H., Weimer, W., Wells, C., Zhao, B.: OceanStore: an architecture for global-scale persistent storage. In: Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 190–201 (2000)
Diego, G.: Can We Trust Trust? Trust: Making and Breaking Cooperative Relations. Basil Blackwell, Oxford (1988)
Marsh, S.P.: Formalising Trust as a Computational Concept, PhD thesis, University of Stirling (1994)
Jøsang, A., Ismail, R., Boyd, C.: A survey of trust and reputation systems for online service provision. Decis. Support Syst. 43, 618–644 (2007)
Teacy, W.T., Patel, J., Jennings Nicholas, R., Luck, M.: Travos: trust and reputation in the context of inaccurate information sources. J. Auton. Agents Multi-Agent Syst. 12, 183–198 (2006)
Bhagwan, R., Savage, S., Voelker, G.: Understanding availability. In: Peer-to-Peer Systems II, volume 2735 of Lecture Notes in Computer Science, pp. 256–267. Springer, Berlin (2003)
Chandy, J.A.: Storage allocation in unreliable peer-to-peer systems. In: DSN ’06: Proceedings of the International Conference on Dependable Systems and Networks, pp. 227–236, Washington, DC, USA. IEEE Computer Society (2006)
Rzadca, K., Datta, A., Buchegger, S.: Replica placement in p2p storage: complexity and game theoretic analyses. In: The 30th International Conference on Distributed Computing Systems (ICDCS), pp. 599–609 (2010)
Godfrey, P.B., Shenker, S., Stoica, I.: Minimizing churn in distributed systems. SIGCOMM Comput. Commun. Rev. 36(4), 147–158 (2006)
Ahlswede, R., Cai, N., Li Shuo-Yen, R., Yeung, R.W.: Network information flow. IEEE Trans. Inf. Theory 46, 1204–1216 (2000)
IPAddressGuide.com. http://www.ipaddressguide.com/
Medina, A., Lakhina, A., Matta, I., Byers, J.: Brite: an approach to universal topology generation. In: Proceedings of the Ninth International Symposium in Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 346–353, Washington, DC, USA. IEEE Computer Society (2001)
Waxman, B.M.: Routing of multipoint connections. Sel. Areas Commun. IEEE J. 6(9), 1617–1622 (1988)
Stutzbach, D., Rejaie, R.: Understanding churn in peer-to-peer networks. In: Proceedings of the 6th ACM SIGCOMM conference on Internet measurement, pp. 189–202 (2006)
Liu, X., Datta, A.: Redundancy maintenance and garbage collection strategies in peer-to-peer storage systems. In: Proceedings of the 11th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS), pp. 515–530 (2009)
Tati, K., Voelker, G.M.: On object maintenance in peer-to-peer systems. In: The 5th International Workshop on Peer-to-Peer Systems (IPTPS) (2006)
Clarke, I., Sandberg, O., Wiley, B., Hong, T.W.: Freenet: a distributed anonymous information storage and retrieval system. In: International Workshop on Designing Privacy Enhancing Technologies, pp. 46–66 (2000)
Harvesf, C., Blough, D.M.: The effect of replica placement on routing robustness in distributed hash tables. In: the Sixth IEEE International Conference on Peer-to-Peer Computing (P2P), pp. 57–64 (2006)
Freedman, M.J., Lakshminarayanan, K., Rhea, S., Stoica, I.: Non-transitive connectivity and dhts. In: Proceedings of the 2nd Conference on Real, Large Distributed Systems (WORLDS), pp. 55–60 (2005)
Haiying, S., Chengzhong, X.: Locality-aware and churn-resilient load-balancing algorithms in structured peer-to-peer networks. IEEE Trans. Parallel Distrib. Syst. 18, 849–862 (2007)
Bernard, S., Le Fessant, F.: Optimizing peer-to-peer backup using lifetime estimations. In: EDBT/ICDT ’09: Proceedings of the 2009 EDBT/ICDT Workshops, pp. 26–33, New York, NY, USA. ACM (2009)
Kamvar, S.D., Schlosser, M.T., Garcia-Molina, H.: The eigentrust algorithm for reputation management in p2p networks. In: Proceedings of the 12th International Conference on World Wide Web (WWW), pp. 640–651 (2003)
Runfang, Z., Kai, H.: Powertrust: a robust and scalable reputation system for trusted peer-to-peer computing. IEEE Trans. Parallel Distrib. Syst. 18, 460–473 (2007)
Yang, Z., Tian, J., Dai, Y.: Towards a more accurate availability evaluation. In: Proceedings of the 2006 International Workshop on Networking, Architecture, and Storages (IWNAS), pp. 73–80 (2006)
Acknowledgments
The authors wish to thank Krzysztof Rzadca for his valuable comments. This work has been supported in part by AcRF Tier 1 Grant RG 29/09 for the CrowdStore project.
Appendix
1.1 Trust Modeling
Modeling trust with the beta distribution is a very popular approach. In our work, a transaction is binary, i.e., successful or unsuccessful, and we model the transactions between $p_o$ and $p_h$ as observations of independent Bernoulli trials. In each trial, the success probability, that is, the trust p, is modeled by a Beta distribution. Equation 4 shows the probability density function of the Beta distribution:
$$f(p \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, p^{\alpha - 1} (1 - p)^{\beta - 1}, \qquad 0 \le p \le 1 \tag{4}$$
α and β are the shape parameters, which reflect the numbers of successful and unsuccessful transactions between a pair of peers. We start with α = β = 1, which translates into complete uncertainty about the distribution of the parameter, modeled by the uniform distribution: Beta(1, 1) = U(0, 1). After observing s successes in n trials, the posterior density of p is Beta(α + s, β + n − s). Figure 7 shows three examples of the beta pdf with different parameters. The curves express the relative likelihood of the probability that the target peer is trustworthy in a future transaction. When α > β (there are more successful transactions), the target peer is trustworthy with a higher probability; otherwise, it is trustworthy with a lower probability.
So trust is modeled as a function and not as a single value. In this way, we can understand various aspects of trust like its expectation, confidence, etc., by studying the function.
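The Bayesian update described above can be illustrated with a minimal Python sketch. It assumes the standard Beta density and the posterior rule Beta(α + s, β + n − s) stated in the text; the function names `beta_pdf` and `update` and the example counts (8 successes in 10 transactions) are hypothetical and chosen only for illustration.

```python
import math


def beta_pdf(p, alpha, beta):
    """Density of Beta(alpha, beta) at p, for 0 < p < 1."""
    # Work in log space so that large transaction counts do not overflow.
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(
        log_norm + (alpha - 1) * math.log(p) + (beta - 1) * math.log(1 - p)
    )


def update(alpha, beta, successes, trials):
    """Posterior shape parameters after observing `successes` in `trials`."""
    return alpha + successes, beta + trials - successes


# Start from complete uncertainty: Beta(1, 1) is the uniform distribution.
prior = (1, 1)

# Hypothetical history: 8 successful transactions out of 10.
posterior = update(*prior, successes=8, trials=10)

# With alpha > beta, high trust values are more likely than low ones.
high = beta_pdf(0.8, *posterior)
low = beta_pdf(0.2, *posterior)
```

Here `posterior` is the pair of shape parameters after the update, and comparing `high` with `low` shows how the density shifts toward larger trust values once successes dominate.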
The following definition specifies the trust function between entities (an individual peer or a group of peers) based on a beta function. By $E_t$, we denote the entity participating in the trust calculation.
Definition 1
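(Trust Function) Entity $E_1$ evaluates entity $E_2$. From the viewpoint of $E_1$, $S_{E_1,E_2}$ and $U_{E_1,E_2}$ represent, respectively, the number of successful transactions and unsuccessful transactions between $E_1$ and $E_2$ ($S_{E_1,E_2} \ge 0$ and $U_{E_1,E_2} \ge 0$). The trust function $T_{E_1,E_2}(p \mid S_{E_1,E_2}, U_{E_1,E_2})$, mapping a trust rating p (0 ≤ p ≤ 1) to its probability, is defined by:
$$T_{E_1,E_2}(p \mid S_{E_1,E_2}, U_{E_1,E_2}) = \frac{\Gamma(S_{E_1,E_2} + U_{E_1,E_2} + 2)}{\Gamma(S_{E_1,E_2} + 1)\,\Gamma(U_{E_1,E_2} + 1)}\, p^{S_{E_1,E_2}} (1 - p)^{U_{E_1,E_2}}$$
The expected value of the trust function is equal to:
$$E\left[T_{E_1,E_2}\right] = \frac{S_{E_1,E_2} + 1}{S_{E_1,E_2} + U_{E_1,E_2} + 2}$$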
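As a rough check of the definition, the sketch below assumes that the trust function is the Beta(S + 1, U + 1) density, which follows from the α = β = 1 starting point described earlier, so that its mean is (S + 1)/(S + U + 2). The names `trust_pdf`, `expected_trust`, and `numeric_mean`, as well as the transaction counts, are hypothetical. The closed-form mean is compared against a midpoint-rule integration of p · T(p).

```python
import math


def trust_pdf(p, s, u):
    """Assumed trust function: the Beta(s + 1, u + 1) density at p."""
    norm = math.exp(
        math.lgamma(s + u + 2) - math.lgamma(s + 1) - math.lgamma(u + 1)
    )
    return norm * p**s * (1 - p) ** u


def expected_trust(s, u):
    """Closed-form mean of Beta(s + 1, u + 1)."""
    return (s + 1) / (s + u + 2)


def numeric_mean(s, u, steps=10_000):
    """Midpoint-rule approximation of the integral of p * T(p) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        total += p * trust_pdf(p, s, u)
    return total * h


# Hypothetical history: 8 successful and 2 unsuccessful transactions.
closed_form = expected_trust(8, 2)
approximation = numeric_mean(8, 2)
```

With no transaction history (S = U = 0), `expected_trust` returns 0.5, matching the uniform prior; each additional success or failure moves the expectation toward 1 or 0, respectively.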
Cite this article
Liu, X., Datta, A. Contextual Trust Aided Enhancement of Data Availability in Peer-to-Peer Backup Storage Systems. J Netw Syst Manage 20, 200–225 (2012). https://doi.org/10.1007/s10922-011-9198-9
DOI: https://doi.org/10.1007/s10922-011-9198-9