Introduction

Privacy preserves human dignity, provides autonomy, and lowers barriers to communication. Privacy is at risk of erosion when our online activities create a trove of information that can easily be captured, stored, and disseminated at scale through technological means. This risk permeates every facet of online activity, and online learning is no exception. Professor Jim Greer was one of the earliest thought leaders to be concerned about privacy in online learning as well as information privacy in general (Anwar and Greer 2006, 2008a, b, 2009, 2011, 2012; Kettel et al. 2004; Richardson 2005; Anwar et al. 2006; Anwar 2008).

The global e-learning market is projected to reach $325 billion by 2025 (McCue 2018). In 2017, 77% of online companies used online learning to speed up employee training, and in 2015, 49% of students reported having taken an online course in the preceding 12 months (Chernev 2019). From identity credentials to learning activities to transcripts, everything is stored online in the cloud. As a result, online learning environments have become prospective targets of data leakage attacks, and the consequences of privacy breaches in online learning are significant. Increasing privacy concerns limit users’ trust in online learning systems as well as their level of disclosure, which is a prerequisite for personalized learning. Therefore, online learning platforms that support privacy, trust, and personalization are urgently needed.

Privacy is a nebulous concept: the need for privacy is elastic, and both privacy and disclosure are desirable under different communicative contexts. Some of the early definitions of privacy, such as the “right to be let alone” (Warren and Brandeis 1890), are more fitting to physical privacy. With the advancement of Internet and Web technology, the online world has become an essential part of our lives, and online privacy issues have started to overwhelm us. Three major privacy theories that we find especially relevant to today’s online privacy concerns are limitation theory, control theory, and contextual integrity theory. This article surveys these theories together with Greer and Anwar’s approach to information privacy in online learning environments.

After surveying the notion of privacy, we discuss the need for information privacy in online learning environments. We highlight the need for suitable privacy policies and survey the privacy mechanisms introduced by Anwar and Greer. Specifically, we seek answers to the following research questions:

  1. To what extent are privacy, trust, and personalization desired in online learning?

  2. How can privacy, trust, and personalization be supported in online learning?

The rest of this article is organized as follows. Section “Definition and Theories” discusses definitions and theories related to privacy, trust, and personalization. Section “Privacy, Trust, and Personalization in Online Learning” highlights the interactions among privacy, trust, and personalization in online learning. Section “Mechanisms for Privacy, Trust, and Personalization” discusses the mechanisms that Anwar and Greer employed to support privacy, trust, and personalization in online learning. Section “Discussions” discusses open problems, and Section “Conclusion” concludes the article.

Definition and Theories

Privacy

The notion of privacy has been expressed in many different ways across a myriad of online environments. This article responds to the privacy concerns of online learning. Therefore, we deconstruct three prominent theories of privacy in the context of online learning: limitation/restriction theory (Allen 1988), control theory (Altman 1975), and contextual integrity theory (Nissenbaum 2009). The limitation theory of privacy recognizes the need to limit others’ access to one’s personal information. The control theory of privacy recognizes the need for individuals to have control over their own information. The main criticism of control theory is that it is unreasonable to expect control over all information, especially when our online presence exposes us to different observers. Contextual integrity theory defines privacy in terms of context-relative informational norms (Nissenbaum 2009). Informational norms have the following key parameters: actors (sender, recipient, subject), attributes (types of information), and transmission principles (constraints under which information flows).
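To make the parameters of an informational norm concrete, the following minimal Python sketch encodes a norm as a tuple of sender role, recipient role, attribute, and transmission principle, and checks whether a given flow matches a norm of its context. The class names, the function `respects_context`, and the example norms are hypothetical illustrations, not part of Nissenbaum’s framework or of Anwar and Greer’s systems.

```python
from typing import NamedTuple


class Norm(NamedTuple):
    """One context-relative informational norm (hypothetical encoding)."""
    sender_role: str
    recipient_role: str
    attribute: str   # type of information, e.g. "grade"
    principle: str   # transmission principle, e.g. "confidential"


class Flow(NamedTuple):
    """A concrete information flow about some subject."""
    sender_role: str
    recipient_role: str
    subject: str
    attribute: str
    principle: str


# Illustrative norms for a course context; these values are assumptions.
COURSE_NORMS = {
    Norm("instructor", "learner", "grade", "confidential"),
    Norm("learner", "instructor", "assignment", "for-assessment"),
}


def respects_context(flow: Flow, norms: set) -> bool:
    """A flow is appropriate only if some norm of the context permits it."""
    candidate = Norm(flow.sender_role, flow.recipient_role,
                     flow.attribute, flow.principle)
    return candidate in norms


ok = Flow("instructor", "learner", "alice", "grade", "confidential")
leak = Flow("instructor", "advertiser", "alice", "grade", "confidential")
print(respects_context(ok, COURSE_NORMS))    # True
print(respects_context(leak, COURSE_NORMS))  # False
```

In this simplified reading, the same attribute (a grade) is appropriate in one flow and a violation in another solely because the recipient changes, which is the core intuition of contextual integrity.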

Greer’s work covers various aspects of all three major theories. Additionally, his work provides a unique perspective by tying identity and presentation to privacy. In his work, privacy is defined as the user’s capacity to control the conditions under which their identity information will be presented. This notion of privacy is inspired by Erving Goffman’s “presentation of self” theory (Goffman 1978). Goffman argued that the elements of human interaction depend upon time, place, and audience. Greer and Anwar argued that time, place, and audience represent an information context, and that users conceal and reveal information based on that context (Anwar and Greer 2011). Concealing information is, in essence, limiting access. The ability to conceal and reveal requires users’ control over their information.

Like Nissenbaum (2009), Anwar and Greer also defined contexts and the contextual flow of information in online learning environments. They developed mechanisms to enforce the contextual boundaries of information through identity management. For privacy preservation, Anwar and Greer saw the need for different partial identities to be presented to different audiences. Through partial identity boundaries, individuals should be able to (1) exert control over data collection and (2) prevent data collected for one purpose from being used for another purpose.

Trust

Trust is defined as a mental state comprising the expectancy of a specific behavior from a trustee, the belief that the expected behavior will occur, and the willingness to take a risk on that belief (Huang and Nicol 2010). In the context of online learning, a user (e.g., a learner) holds this expectancy toward other users (e.g., co-learners and instructors) as well as toward the environment (e.g., the LMS). For example, Wang (2014) observed that prospective students rely on trust when deciding whether to enroll in online courses.

Trust is a precondition for self-disclosure. Trust reduces the perceived risks involved in the disclosure of sensitive information. Trust determines the relevance or justification of a purpose for seeking data in a given context. Reputation is more of a social notion of trust (Golbeck and Hendler 2004). In our lives, we each maintain a set of reputations for the people we know. Anwar and Greer presented a model for facilitating trust that integrates reputation with policies (Anwar and Greer 2011).

Personalization

Bol et al. (2018) define personalization as “the strategic creation, modification, and adaptation of content and distribution to optimize the fit with personal characteristics, interests, preferences, communication styles, and behaviors.” Online activities generate a large amount of user information that companies use to tailor online services to individuals’ interests, behaviors, and needs. Personalization is an essential element of today’s data-driven economy. Consequently, companies are more driven to accumulate data and less concerned about the protection and proper use of that data, putting users at risk of privacy breaches. Kobsa and Schreck have described the risks to privacy posed by personalization (Kobsa and Schreck 2003b).

Privacy, Trust, and Personalization in Online Learning

Anwar observed that the crux of privacy concerns lies in the fact that a user has inadequate control over the flow (with whom information is to be shared), boundary (acceptable usage of personal information), and persistence (duration of use) of information (Anwar 2008). Anwar and Greer further investigated the need for privacy in online learning (Anwar and Greer 2011). We revisit their findings to answer the first research question: To what extent are privacy, trust, and personalization desired in online learning? Privacy promotes safe learning. Therefore, privacy is required in the following popular learning activities: peer-tutoring, peer-reviewing, learning object selection, collaboration, group learning, evaluation, role-playing, and personalization.

Trust is a crucial enabler for meaningful and mutually beneficial interactions that build and sustain collaboration (e.g., collaborative learning). However, in most online learning environments, the (possibly pseudonymous) users are strangers whose interactions are limited mostly to written communication. Identity management (in the form of various degrees of anonymity) is one technology-based approach to protecting privacy. However, privacy-enhancing identity management (PIM) impedes the assessment of an actor’s trustworthiness.

Anwar and Greer observed that privacy and trust are equally desirable and that each influences the other (Anwar and Greer 2012). Privacy promotes safe learning, while trust promotes collaboration and healthy competition. As part of the solution to privacy, Anwar and Greer also focused on trust relationships through identity management models. Because online actors lack bodily presence, it is hard to establish trust relationships among them. Table 1 summarizes the observations of Anwar and Greer about the need for privacy and trust in popular learning activities.

Table 1 Online learning activities and associated privacy/trust issues (Anwar and Greer 2011)

Personalization is an essential component of today’s online learning. Because learners are continuously monitored for personalized learning, and their data are amassed into large databases to generate learner analytics, privacy threats are real.

Personalization of learning objects can increase the motivation and interest of learners (Bates and Wiest 2004). As a result, we have recently witnessed an increasing volume of research and development efforts to offer personalized e-learning. Trust has been identified as both a prerequisite (Kobsa and Schreck 2003a) and a consequence (Karat et al. 2004) of good personalization practice. Anwar et al. define the key characteristics of an e-learning environment that offers personalization together with trust and privacy (Anwar et al. 2006).

Mechanisms for Privacy, Trust, and Personalization

In this section, we explore the second research question: “How can privacy, trust, and personalization be supported in online learning?” To protect the privacy of learners, Anwar and Greer proposed mechanisms for privacy preferences (Kettel et al. 2004), identity management (IM) (Anwar and Greer 2008a, 2009, 2012; Richardson 2005), contextual flow of information (Anwar and Greer 2012; Kettel et al. 2004), trust (Anwar and Greer 2006, 2008b, 2011), and personalization (Anwar et al. 2006). Identity management provides a form of privacy (the protection of personal information) through users’ anonymity or pseudonymity. Users are allowed to operate multiple identities or adopt new pseudonymous personas as long as contextual norms are maintained and the integrity of the reputation system is not undermined. Accordingly, they proposed a reliable and trustworthy mechanism for reputation transfer from one partial identity to another. This reputation transfer system prevents linkability of learners’ identities and personas. Where concealment is needed, privacy concerns can be mitigated through anonymization. An identity management system can assist users in negotiating the calculus of trust and privacy.

Privacy Preferences

Information about learners is diverse and may cover a wide range of activities and assessments, such as quiz results, assignment marks, submission times, how often and when a learner accessed course materials, postings made and read on web-based discussion boards, ratings of posting quality by participants, chat interaction logs, and opinions held about the learner by others. Kettel, Brooks, and Greer raise the question: with deep, meaningful information shared about learners, who is protecting the privacy rights and desires of the learners? (Kettel et al. 2004). In a complex, multi-role, multi-user environment, it is hard to share the personal information needed for learning activities such as group building, social navigation, and locating an appropriate helper while protecting users’ privacy. Building on semantic web and web service technologies, they proposed an implementation of a privacy filter approach.

To allow users to control their own level of privacy, they designed a mechanism to control access to the stream of events and information that flows through learning environments. Every time a user interacts with the learning environment, an event is triggered within the system and passes through the system’s Event Stream. It is assumed that each user has their own personal agent and that users configure their agent with their own individual privacy preferences. The personal agent captures a user’s level of sensitivity about their information and seeks privacy accordingly. The user describes which kinds of information can be passed on to different types of users (e.g., grades to my teachers and interest indicators to my friends), and in which format (e.g., identified by name, by alias, or anonymously). However, there is a knowledge gap about how well users understand the consequences of the decisions they make when they configure their agent.
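The privacy filter idea can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the names `Event`, `PersonalAgent`, and `filter`, the preference encoding, and the default-deny behavior are hypothetical and are not taken from the implementation of Kettel et al. (2004).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    user: str      # real name of the learner who triggered the event
    alias: str     # learner-chosen alias
    kind: str      # kind of information, e.g. "grade" or "interest"
    payload: str


@dataclass
class PersonalAgent:
    # (kind of information, audience) -> "name" | "alias" | "anonymous"
    preferences: dict = field(default_factory=dict)

    def filter(self, event: Event, audience: str) -> Optional[dict]:
        """Return the view of `event` that `audience` may see, or None."""
        fmt = self.preferences.get((event.kind, audience))
        if fmt is None:
            # Assumption: anything not explicitly allowed is withheld.
            return None
        shown_as = {"name": event.user, "alias": event.alias,
                    "anonymous": None}[fmt]
        return {"who": shown_as, "kind": event.kind, "payload": event.payload}


# Preferences mirroring the example in the text: grades to teachers by name,
# interest indicators to friends under an alias.
agent = PersonalAgent({
    ("grade", "teacher"): "name",
    ("interest", "friend"): "alias",
})
grade = Event("Alice", "owl42", "grade", "87%")
interest = Event("Alice", "owl42", "interest", "graph theory")

print(agent.filter(grade, "teacher"))    # identified by name
print(agent.filter(interest, "friend"))  # identified by alias
print(agent.filter(grade, "friend"))     # None: withheld
```

Withholding by default is one possible design choice; a real agent might instead prompt the user when no preference applies, which would also help address the knowledge gap noted above.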

Identity Management (IM) for Privacy

As proponents of more user control over personal information, Richardson and Greer proposed an identity management architecture that allows users to decide, on a per-business basis, what personal information is provided (Richardson 2005). The architecture helps users manage personal information, improves a user’s awareness of and access to his or her personal information, and helps businesses comply more easily with privacy legislation. Persona and identity are the two main components of the identity management system. A persona is essentially an identity; however, multiple personas can each use the same personal information from a single identity. A user may create two identities with two sets of personal information. Alternatively, a user may create two personas with the same identity information to maintain separate relationships with two businesses. In essence, identities and personas help create contextual boundaries.
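The relationship between identities and personas can be pictured with the following minimal Python sketch. The classes, field names, and the idea of a per-persona `disclosed` set are assumptions made for illustration; Richardson (2005) should be consulted for the actual architecture.

```python
from dataclasses import dataclass


@dataclass
class Identity:
    attributes: dict  # one set of personal information


@dataclass
class Persona:
    identity: Identity    # the identity this persona draws its data from
    business: str         # the relationship this persona is used for
    disclosed: frozenset  # attribute names the user chose to provide

    def view(self) -> dict:
        """Personal information this business actually receives."""
        return {k: v for k, v in self.identity.attributes.items()
                if k in self.disclosed}


student = Identity({"name": "Alice", "email": "alice@example.org",
                    "student_id": "S123"})

# Two personas share one identity but keep two relationships separate.
bookstore = Persona(student, "bookstore", frozenset({"name", "email"}))
lms = Persona(student, "lms", frozenset({"student_id"}))

print(bookstore.view())  # {'name': 'Alice', 'email': 'alice@example.org'}
print(lms.view())        # {'student_id': 'S123'}
```

Each persona thus acts as a contextual boundary: the bookstore never sees the student number, and the learning system never sees the email address, even though both draw on the same underlying identity.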

Another important identity management-based privacy solution proposed by Anwar & Greer is role- and relationship-based identity management (RRIM) (Anwar and Greer 2009, 2012). The processes associated with RRIM are described in Table 2.

Table 2 The exemplar tasks and processes associated with role- and relationship-based identity management for privacy-preserving yet accountable online learning activities

Contextual Information Flow for Privacy

Anwar and Greer defined communicative context through roles and relationships (Anwar and Greer 2008a). In online learning, there are well-structured roles such as instructor, grader, teaching assistant, and learner, and relationships such as one-to-one (e.g., mentor-mentee), one-to-many (e.g., instructor-class), and hierarchical (e.g., instructor-grader) are relatively predictable. In their approach, there are two kinds of identities: role-based and relationship-based. A role-based identity hides an actor in the crowd of actors with the same role, and a relationship-based identity confines communication to a specific relationship. Moreover, they assigned guarantor privileges to public roles to sanction foul acting and to facilitate usage control.

Supporting Trust

Reputation, which is a longitudinal social evaluation of a person’s actions, can be used as a measure of the trustworthiness of an actor’s future behavior. A good reputation is the return on a long-term investment in good behavior. Anwar and Greer proposed a reputation transfer model that allows reputation to be transferred among multiple pseudo-identities (e.g., pseudonyms) without letting anyone associate these pseudo-identities with one another (Anwar and Greer 2006, 2008b). As a result, this model facilitates both privacy and trust.

The assumption is that the transferring and receiving identities are two pseudo-identities of one entity. A pseudonymous entity can update the reputation of one pseudonym by transferring its reputation from another pseudonym. A guarantor vouches for an entity in two ways: (i) by responding to queries about a pseudonym and (ii) by responding to the entity’s request to transfer reputation from one pseudonym to another. Since both the transferring and receiving pseudonyms are registered with the guarantor, any bad acting can be traced and verified. All communication between an entity and the guarantor is protected using each other’s public keys. Moreover, the integrity of the reputation can be checked using the reputation digest.
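The guarantor’s role can be sketched in Python as follows. This is a simplified illustration, not the protocol of Anwar and Greer: the class and method names are hypothetical, the reputation digest is assumed here to be a keyed hash (HMAC-SHA256) over the pseudonym and score, the transferred score is assumed to be added to the target and removed from the source, and the public-key encryption of messages is omitted entirely.

```python
import hashlib
import hmac
import secrets


class Guarantor:
    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)  # guarantor's secret digest key
        self._owner = {}       # pseudonym -> entity; never disclosed
        self._reputation = {}  # pseudonym -> score

    def register(self, entity: str, pseudonym: str, score: int = 0) -> None:
        self._owner[pseudonym] = entity
        self._reputation[pseudonym] = score

    def _digest(self, pseudonym: str, score: int) -> str:
        msg = f"{pseudonym}:{score}".encode()
        return hmac.new(self._key, msg, hashlib.sha256).hexdigest()

    def query(self, pseudonym: str):
        """Vouch for a pseudonym: return its score and a reputation digest."""
        score = self._reputation[pseudonym]
        return score, self._digest(pseudonym, score)

    def verify(self, pseudonym: str, score: int, digest: str) -> bool:
        """Check that a claimed score has not been tampered with."""
        return hmac.compare_digest(self._digest(pseudonym, score), digest)

    def transfer(self, source: str, target: str):
        """Move reputation between two pseudonyms of the same entity."""
        owner = self._owner.get(source)
        if owner is None or owner != self._owner.get(target):
            raise PermissionError("pseudonyms do not belong to one entity")
        self._reputation[target] += self._reputation[source]
        self._reputation[source] = 0
        return self.query(target)


g = Guarantor()
g.register("alice", "tutor_77", score=10)
g.register("alice", "reviewer_x", score=2)
g.register("bob", "helper_b", score=5)

score, digest = g.transfer("tutor_77", "reviewer_x")
print(score)                                   # 12
print(g.verify("reviewer_x", score, digest))   # True
```

Only the guarantor can see that the two pseudonyms share an owner; other parties receive a score and a digest. The sketch does not address inference from the timing or size of score changes, which a real design would need to consider.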

Supporting Personalization

In the context of online learning, personalization refers to the selection or customization of learning content as well as sequences of learning activities to meet the needs of individual learners (O’Keeffe et al. 2012). One of the primary uses of learning analytics is to support personalization. Greer’s research shows that personalization is possible without compromising the personal space of learners. Calling for proper policy and mechanism (a technical framework), Anwar et al. (2006) made recommendations and implemented a privacy-enhanced learning environment that also supports content personalization through pseudonymization of learner identifiers. For personalization, a pseudonym allows datasets from one learner to be correlated. In addition, the system allows anonymization for total non-disclosure and can de-pseudonymize the learner if accountability is warranted.
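A minimal Python sketch of this idea follows. It assumes a keyed hash (HMAC-SHA256) as the pseudonym function and an escrow table for de-pseudonymization; both choices, together with the names `Pseudonymizer`, `pseudonym`, `anonymize`, and `reveal`, are illustrative assumptions and not the implementation described by Anwar et al. (2006).

```python
import hashlib
import hmac


class Pseudonymizer:
    def __init__(self, key: bytes) -> None:
        self._key = key
        self._escrow = {}  # pseudonym -> learner id, used only for accountability

    def pseudonym(self, learner_id: str) -> str:
        """Stable pseudonym: the same learner always maps to the same value."""
        digest = hmac.new(self._key, learner_id.encode(), hashlib.sha256)
        p = digest.hexdigest()[:16]
        self._escrow[p] = learner_id
        return p

    @staticmethod
    def anonymize(record: dict) -> dict:
        """Total non-disclosure: drop the learner identifier altogether."""
        return {k: v for k, v in record.items() if k != "learner"}

    def reveal(self, pseudonym: str, authorized: bool) -> str:
        """De-pseudonymize only when accountability is warranted."""
        if not authorized:
            raise PermissionError("de-pseudonymization not authorized")
        return self._escrow[pseudonym]


px = Pseudonymizer(key=b"k" * 32)  # demo key; a real key must be kept secret
quiz = {"learner": px.pseudonym("alice"), "quiz": 1, "score": 0.8}
forum = {"learner": px.pseudonym("alice"), "posts": 4}

# Records from the same learner can be correlated without knowing who it is.
print(quiz["learner"] == forum["learner"])  # True
print(px.anonymize(quiz))                   # {'quiz': 1, 'score': 0.8}
```

The stable pseudonym is what makes personalization possible, since a recommender can join the quiz and forum records, while the escrow table confines re-identification to an explicitly authorized accountability procedure.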

Learning analytics are used to understand and improve the learning experience of learners. Pardo and Siemens (2014) observe that, in the context of learning analytics, privacy is tightly connected with trust and transparency. Learning analytics can be used to create a personalized learning environment that provides recommendations to learners concerning peers and learning objects. Potts et al. (2018) proposed an open-source course-level recommendation platform that provides reciprocal peer recommendations for learning purposes. Troussas et al. (2020) developed a multi-module model that identifies learners’ cognitive states and predicts their behavior in order to provide learning object recommendations and curriculum improvements that support personalized instruction.

Table 3 highlights some privacy features of a learning environment that can ease the issue of trusting the learning environment.

Table 3 Recommended features of a privacy-protecting personalized online learning environment

Discussions

With the wide adoption and acceptance of online learning, privacy has become a critical issue. However, the need for privacy cannot be described in absolute terms. Privacy is strongly connected with the issues of trust and the value of personalization. Therefore, any policy or mechanism for privacy also needs to take trust and personalization into consideration.

The notion of context is ambiguous and the onus is on users to understand the context

Privacy, trust, and personalization all depend on context. Therefore, context is an essential element in privacy protection. However, it is hard to operationalize context. Maintaining contextual boundaries while sharing information also places a cognitive burden on users. The environment has to facilitate contextual awareness or suggest contexts to users. The online learning environment needs to help users maintain the contextual flow of information.

Privacy matters to all actors

Privacy solutions should not focus only on learners. Privacy matters to all the actors in online learning (instructors, administrators, tutors, etc.). Besides, with the increasingly participatory and contributory nature of the web, there is no fixed consumer or producer of content in social learning portals or communities of practice. Therefore, privacy solutions have to include all actors in the system.

A partial identity of a user needs to have a temporal dimension

A partial identity or an identifier needs to have a temporal dimension. Identities must have a well-defined lifetime, and any information anchored on an identity also needs to expire.

Balancing of privacy and accountability is needed

In online learning, privacy is important along with accountability. We need to preserve privacy while supporting community building, for which accountability is very important.

Privacy in the age of learning analytics

Research shows that users are willing to share personal information for benefits as long as they trust the context for sharing and have control over their data. Slade et al. (2019) and Prinsloo and Slade (2016) found that students greatly trust their university to use their data appropriately and ethically. On the other hand, a Pew Research study found that most internet users (86%) would like to be anonymous and a significant 59% of them have taken steps to avoid observation by people, organizations, or governments (Rainie et al. 2013). Ivanova et al. (2019) underline the need for effective mechanisms and digital competence for the responsible use and sharing of one’s own and others’ private data in personal learning environments (PLEs) such as informal Web 2.0 / social media PLEs, mobile PLEs, and ePortfolio-based PLEs.

Ethical perspectives of privacy

Pardo and Siemens (2014) discussed ethical issues arising from the capture and use of personal information in online learning. Drachsler and Greller (2016) discuss the issues of privacy and the ethical use of learning analytics. Although the scope of this paper is limited to privacy and its relationship with personalization and trust, the ethical side of privacy is significant and should guide the practices of data collection and dissemination. The developers of online learning systems need to grapple with the ethical questions of surveilling users and of using and disseminating their information without allowing the users to make an informed choice. Lately, privacy-by-design principles have received wide recognition. However, Verhagen et al. (2016) observe that clashes between values are unavoidable and that collaboration between ethicists and software designers at an early stage of design is required to implement privacy-by-design.

Legal perspectives of privacy

In recent times, privacy laws have emerged to protect user privacy. In the United States, there are several privacy laws that address different aspects of privacy. For example, the Health Insurance Portability and Accountability Act (HIPAA) (https://www.govinfo.gov/content/pkg/PLAW-104publ191/pdf/PLAW-104publ191.pdf) protects personal health information. The Fair and Accurate Credit Transactions Act (FACTA) protects against theft of credit information. The Children’s Online Privacy Protection Act (COPPA) protects the privacy of those under the age of 13. The US Department of Education (2014) recommends checking whether student information used in online educational services is protected by the Family Educational Rights and Privacy Act (FERPA). On the other hand, the core of Europe’s digital privacy legislation is the General Data Protection Regulation (GDPR) (Voigt and Von dem Bussche 2017), which governs data protection and privacy in the European Union and the European Economic Area. These privacy laws provide some framework for protecting privacy, but they are not adequate to ensure control or transparency. For example, Schaub et al. (2017) observed that regulatory compliance produces long and complex privacy policies but not transparency for users. Greer and colleagues strove to develop technical means to empower users and provide transparency for privacy.

Privacy in the age of data

Greer’s more recent works (e.g., Brooks et al. 2014) focused on data-assisted approaches to developing intelligent technology-enhanced learning environments in which learner activities are modeled with actors (learners, instructors, instruction assistants), artefacts (videos, tests, webpages), and interaction behaviors (watching, answering, clicking). Increasingly, data-driven approaches with disruptive technologies such as various mobile and virtual reality technologies will be prevalent in online learning. As these new technologies provide novel ways to make learning personalized, informal, and life-long, learners need to make decisions on the trade-offs of sharing versus withholding personal information. As a result, the future success of online learning will lie at the intersection of privacy, trust, and personalization.

In summary, privacy in online learning is an interesting microcosm of the broader issues of privacy in online communities. As a result, all the issues discussed here are very relevant to online privacy in general. In this age of information and data analytics, users gain some short-term benefits from sharing personal information without fully grasping the long-term cost to their privacy. This paper surveyed the work of Greer and others on building privacy-preserving online learning systems that support users’ regulation of contextual boundaries for information disclosure and dissemination. The future of the information society is strongly tied to its citizens’ ability to maintain the contextual boundaries of their personal information through privacy-protecting technologies, policies, ethics, and laws.

Conclusion

Privacy provides protection from misuse of information, is a prerequisite to relationships such as collaboration, and encourages free thinking. As a result, privacy is critical for online learning. Because learning is one of the fastest growing online activities, breaches of privacy can significantly damage online learning platforms. Due to the lack of bodily presence, trust is not only harder to establish online, but misplaced trust can also have severe consequences. Privacy can help build trust. Realizing the need for privacy and trust and the relationship between them, Greer and his coauthors proposed various methods to provide privacy and trust in online learning. In addition to supporting the theories of limitation, control, and context, Greer et al. utilized a unique aspect of identity management for privacy. In the future, the protection of privacy will increasingly be challenged by powerful technologies to capture, store, and disseminate data. Online service providers need to maintain a strong privacy posture as well as take on a larger role in supporting privacy in their respective environments.