1 Introduction

In recent years, microblogging platforms such as Twitter have gained popularity. Features supported by these platforms, such as message posts, replies, comments, retweets, and hashtags, explicitly define potential user interactions. While such interaction-defining features have a direct and immediate impact on user behavior, microblogging platforms also support features that are less visible but still confine and shape user interactions. Such features include character limits for posted messages (e.g., Twitter), support for anonymity (e.g., Kik and Yik Yak), and the use of geo-data (e.g., NearbyFeed and Yik Yak). Yet few studies have investigated how these features affect user interactions. In this work, we explore microblogging user behaviors on an anonymous social networking platform, Yik Yak (Footnote 1).

Yik Yak is a location-based, anonymous social media platform that is used primarily on smartphones. It was first introduced in November 2013 and gained instant popularity among college students [9]. Yik Yak groups users based on their geographical proximity. While Yik Yak mobile users are allowed to ‘peek’ into different locations (colleges and cities) to view Yik Yak posts, the platform allows only users belonging to the same local messaging group to post anonymous messages, called “yaks,” on the local messaging board. Yik Yak users can post yaks, reply to specific yaks, and up-vote or down-vote yaks and replies.

In recent years, Yik Yak, an anonymous microblogging/social networking platform, has raised concerns about cyber-bullying and hate crimes, and has at times provoked strong aversion from educators and parents (e.g., [1, 7, 12]). Citing several anonymous threats, including a mass shooting threat made on Yik Yak, Alblow, for instance, rightfully raises his concerns and states that “Yik Yak is most dangerous app” he has ever seen [1]. Threats of sex and hate crimes have been reported numerous times. Racist, xenophobic, homophobic and misogynist yaks have created controversy at many US-based colleges [5].

Compared to the large body of research on Twitter, another online social networking platform created in 2006, only a handful of studies on Yik Yak exist. McKenzie et al. conducted a thematic difference analysis using latent Dirichlet allocation (LDA) to compare Twitter and Yik Yak datasets [6]. Results of qualitative analyses of yak data have also been reported [2, 8, 11]. Saveski et al. investigated the usage patterns of vulgarity, and reported that yaks containing offensive and vulgar terms are more likely to be down-voted [11]. Black et al. developed a coding scheme consisting of 11 codes to conduct a thematic analysis of over 4000 yaks, and reported that most postings were benign, arguing “whether Yik Yak creates and promulgates a negative culture remains debatable [2].”

In this research, we aim to learn how Yik Yak users communicate on the platform, understand the role of anonymity in computer-mediated communication, and inform future communication-technology designs.

2 Methods

2.1 Data Collection

During Spring Semester 2016, we collected yaks from two US-based universities (Virginia State University and Louisiana State University) using a web-based yak scraping tool developed in-house. We used Selenium (Footnote 2), a web automation framework, to create a data collection tool specifically designed to extract yaks from the Yik Yak website, and used the tool to collect yak data periodically for 4 months. Because Yik Yak, unlike Twitter, has yet to provide its own application programming interface (API) for automated yak data extraction, we used Web automation to mimic manual data scraping and collect yaks as well as the replies/comments added to the original posts.

Unlike Black et al. [2], who collected yak data by manually peeking into different locations and then screen-capturing iOS Yik Yak application screens, we chose to automate the data collection process. As a consequence, however, we had to restrict the data collection locations to Virginia State University and Louisiana State University, because the Yik Yak web interface did not support “peeking” and our yak scraping tool worked only on PCs and Macs, not on mobile devices.

During the time of data collection, Yik Yak changed their web document object model (DOM) more than once, and we had to update our tool to reflect the changes made on the Yik Yak website.

On each run, our yak scraping tool collected 100 yaks, the maximum number of yaks displayed on the Yik Yak website at any given moment, and stored the yak posts, replies, and up-vote and down-vote numbers for both yaks and replies in a text file. During the four-month period, we used the tool to generate 150 daily yak collection files:

  • LSU yak collection dates: 3/28/2016–7/27/2016 (4 months): 137 text files

  • VSU yak collection dates: 3/18/2016–4/19/2016 (13 days): 13 text files

We then merged the files from each university into a single collection file per university, using Python scripts to remove empty lines and duplicated yaks and replies. We kept emoticons and picture links. The final merged LSU yak file contained 4,670 yak threads, and the merged VSU yak file held 767 yak threads.
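The merging step can be illustrated with a minimal Python sketch. It is not the authors' script: the function name `merge_entries` is hypothetical, and the sketch assumes that each daily file has already been read into a list of text entries (one yak or reply per entry) and that duplicates are identified by exact text match after trimming whitespace.

```python
def merge_entries(daily_batches):
    """Merge daily batches of entries into one list.

    Assumption: each batch is a list of strings, one yak or reply per string.
    Empty entries are dropped, and an entry is kept only the first time its
    (whitespace-trimmed) text is seen, so original order is preserved.
    """
    seen = set()
    merged = []
    for batch in daily_batches:
        for raw in batch:
            entry = raw.strip()
            if not entry:
                continue  # skip empty lines
            if entry in seen:
                continue  # skip yaks/replies already collected on an earlier run
            seen.add(entry)
            merged.append(entry)
    return merged


if __name__ == "__main__":
    day1 = ["yak one", "", "  reply  "]
    day2 = ["reply", "yak two", "yak one"]
    print(merge_entries([day1, day2]))
```

Because emoticons and picture links are ordinary text in this representation, they pass through unchanged, which matches the decision to keep them.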

Our findings in this paper draw from both the qualitative analysis and the thread structure analysis conducted on the yak dataset we collected over a four-month period at two locations.

2.2 Coding: Two-Layered Hierarchical Coding Scheme

The data analysis was done in multiple iterations. During the initial analysis, the third author randomly selected 200 yaks from each location (400 yaks in total) and used the dataset to develop an initial coding scheme. The third author went through the posted messages and conducted open coding [3, 10] to categorize various kinds of messages posted on the platform. Our initial coding scheme was built on the yak coding scheme developed by Black et al. [2], but we further refined the coding scheme as we analyzed our yak data.

After developing the first-round coding scheme, the first and fourth authors refined the coding categories again and finalized the coding scheme by conducting open and axial coding [3, 10] on the 200 most up-voted and the 200 most down-voted yaks from each data collection site (800 yaks in total). The researchers then revisited the dataset and annotated the collected yaks with the finalized thematic codes to develop a fuller understanding of the various kinds of message posting behaviors. While many yaks were expressed in social networking lingo and slang foreign to the first author, the fourth author, a 24-year-old undergraduate student, was able to translate most of the yaks into common English.
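The selection of the most up-voted and most down-voted yaks described above might be scripted roughly as follows. This is a sketch under stated assumptions, not the procedure the authors used: it assumes each yak is a dictionary with a hypothetical `score` field holding its net vote count (up-votes minus down-votes), and the function name `select_extremes` is made up for illustration.

```python
def select_extremes(yaks, k=200):
    """Return (most_up_voted, most_down_voted), each with at most k yaks.

    Assumption: every yak is a dict with a numeric "score" key
    (net votes). Both lists are ordered from most to least extreme.
    """
    ranked = sorted(yaks, key=lambda yak: yak["score"], reverse=True)
    most_up = ranked[:k]
    # Take the k lowest scores, then reverse so the most down-voted comes first.
    most_down = ranked[max(len(ranked) - k, 0):][::-1]
    return most_up, most_down


if __name__ == "__main__":
    sample = [
        {"text": "a", "score": 5},
        {"text": "b", "score": -3},
        {"text": "c", "score": 0},
        {"text": "d", "score": 10},
    ]
    up, down = select_extremes(sample, k=1)
    print(up[0]["text"], down[0]["text"])
```

Note that when a site has fewer than 2k yaks, the two lists overlap; a real script would need to decide how to handle that case.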

The final coding scheme consisted of 20 top-level categories and 51 second-level categories. The authors used affinity diagramming [4] to organize and visualize the coded data.

2.3 Thread Structure Analysis

For clarification, we first define the terms used in the analysis:

  • Yak: a social media message posted on Yik Yak

  • Thread: a combined group of messages containing an original yak post and the replies to it

  • Thread Length: number of replies +1 (original yak post)

  • Frequency: a count of threads that have the same length

Fig. 1. Yik Yak thread: (a) an example Yik Yak thread on a smartphone GUI; (b) extracted users of posted yaks and replies; and (c) metrics for thread structures.

To understand the yak posting behaviors of users, we analyzed the structures of yak threads (Fig. 1). The first step was to examine the frequency of threads by grouping them by length and then counting the number of threads in each length group. This gave us a general idea of the scale of communication in the Yik Yak community. The second step was to identify the number of unique participants in the threads. Thread lengths ranged from 1 (i.e., the original poster, or OP, posted only a single yak) to 102 (i.e., many communications by multiple participants), and the average number of unique participants at each length may give us insight into whether a thread was dominated by just a handful of participating users or by many. Based on this analysis, we developed a metric, Participant Diversity, as shown below:

Participant Diversity = (number of unique participants in a thread) / (thread length)

This metric indicates how many distinct users participate in a thread relative to the number of yak communications occurring in the thread (Fig. 1(c)).
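The definitions in this section can be made concrete with a short Python sketch. It assumes, purely for illustration, that a thread is represented as a list of participant labels, one per message, with the original yak first; the labels and function names here are hypothetical and stand in for whatever per-thread user markers were extracted (Fig. 1(b)).

```python
from collections import Counter


def thread_length(thread):
    """Thread length = number of replies + 1 (the original yak post)."""
    return len(thread)


def participant_diversity(thread):
    """Number of unique participants divided by thread length."""
    return len(set(thread)) / len(thread)


def length_frequencies(threads):
    """Frequency: count of threads that share the same length."""
    return Counter(thread_length(thread) for thread in threads)


if __name__ == "__main__":
    # One message per label; "OP" is the original poster.
    example = ["OP", "A", "OP", "B"]
    print(thread_length(example))          # 4 messages in total
    print(participant_diversity(example))  # 3 unique participants / 4 messages
    print(length_frequencies([["OP"], ["OP"], example]))
```

Under this definition, a thread in which every message comes from a different user scores 1.0, and the score approaches 0 as a few users exchange many messages.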

3 Findings

3.1 Primary Codes

This section describes 20 primary codes, lists our rationales for creating the codes, and presents illustrative examples. In addition, by providing and discussing sample yaks, we highlight Yik Yak usage patterns that weren’t previously reported.

Sex. Black et al. used “dating/sex/sexuality” to code posts that “explicitly mention sex, sexual fantasies/urges, and romantic relationships with a partner [2].” However, while analyzing our yak data, we noticed that many yaks mention friendships and relationships without showing clear romantic intent. For that reason, we dedicated “sex” to denote posts that either explicitly mention sex and sexual intent or at least show unequivocal sexual innuendo. For instance, the fourth author identified “any ladies want their cat ate?” not as “consuming cat meat”, but as “finding casual sex partners.” Fourteen percent of our yak data belonged to this category.

Friendship/Partnership/Dating. There were as many yaks that mentioned relationships without direct reference to sex as there were yaks that mentioned sexual relationships. For instance, we coded “I want company while i do homework...ugh” and “I need to find some people to play online with on the Xbox one because I have no friends” under this category. Sometimes the distinction between this category and Sex wasn’t as clear. We thought, for example, “Roommate’s leaving this weekend, any ladies want to come through later” could possibly be implying sex. Yet when we did not see any direct evidence to put yaks under Sex, we put them under this category.

Threats/Crimes. Even though threats of sex and hate crimes made on Yik Yak including a mass shooting threat and revenge pornography circulations have been previously reported [9], Black et al.’s coding scheme did not include a code for categorizing such crimes. We devised this category and a Hate Speech category to mark posts that show intent to inflict harm on others or posts that mention illegal/criminal activities with the exception of drug consumption. Drug related posts were coded under Drug. Although we did not expect/hope to see any posts coded under this category in our dataset, we had to create this code because we believed that even one criminal activity related post was too critical to overlook.

In our dataset, we ended up coding three posts under this category. The first post described a rape incident or a possible rape attempt. A user posted “I liked this guy and thought he was so cool until he forced me to have sex” and received three down-votes. Another post mentioned bartering sex for homework. The last one stated, “If I was a serial killer I’d pretend to be willing to suck dick on yik yak and then shoot you when we meet up. Or I’d just give you my AIDS.” While we hoped this statement was a hypothetical one meant to warn users not to engage in certain activities, we coded the post under this category.

Hate Speech. We used this code to categorize posts that include racist, sexist, and xenophobic statements or comments. While some number of yaks that we collected from Virginia State University, an HBCU (Historically Black College and University), contained variations of an ethnic slur directed at African Americans, the fourth author who is himself African American mentioned that the use of this term in some of the posts should not be seen as racist. For instance, we coded “Thou Shall Not Throw Shade If Thou Can Not Throw Hands .. - Niggalations 17:38” under Humor/Joke/Rapping instead of Hate speech.

Drug. This category includes posts that reference illegal drugs. Unlike Black et al. [2], who used “Drug/Alcohol” to code both illegal drugs and legal intoxicants such as alcohol, we dedicated this category to posts that mention illegal drugs such as marijuana, and coded posts that mention alcohol consumption under Food, Sports and Life.

Profanity/Obscenity. Some posts included obscenities and profanities. However, most yaks that contained terms generally considered offensive and vulgar, such as ‘fuck’ and ‘shit’, used these terms to decorate sentences and convey nuance rather than to communicate the literal meanings of the words. In such cases, we coded these posts under other categories that were contextually more appropriate. For instance, the post “When you’re new to Yik Yak and are wondering what the fuck you’ve gotten yourself into...” was coded under Yik Yak Related instead of Profanity/Obscenity, while “Yall bitches aint shit !” was coded under this category.

Information Sharing. To the authors’ surprise, a good number of yak posts contained informational contents. For instance, a yak collected on 03/18/2016 at VSU was used to announce an official school-wide event (Town Hall meeting at 3:30 in Foster), and a yak collected at LSU on 4/11/2016 was used to inform the availability of exam grades (PSA: Hopkins exam 3 grades are posted). Although we do not have direct evidence to show why users decided to use Yik Yak to post such messages instead of using non-anonymous platforms (e.g., Twitter or Facebook), nor do we know if users posted the same messages on multiple platforms including non-anonymous ones, it would be plausible to conjecture that this kind of yak posting behavior is motivated by the location-based posting mechanism Yik Yak provides. The platform is both anonymous and location-based, and the location-based postings might be a convenient way to guarantee that the messages would be seen by the target community regardless of the anonymity. In such cases, Yik Yak is not so different from the traditional flyer boards that can be easily found on most US University campuses.

Information Seeking. While analyzing our data, we also found that more yaks were used to query information (9%) than to announce or broadcast informational messages (2%). This category included posts such as “What time cookout close?” or “need a job that pays well :( the struggle is real. If y’all know anything that’s hiring please let me know.”

As with Information Sharing, questions such as “is there a good hair salon for natural hair around here?” can be viewed as posts motivated by the location-based nature of Yik Yak rather than by its anonymity. However, we also think that there were questions that users might deem more appropriate to ask anonymously. While the question “Easiest art gen ed for engineer majors?” is very location relevant, users might not want to ask such a question on non-anonymous platforms, since it might give other people the impression that the poster only wants to take the easiest class and is therefore academically less motivated.

Identity Sharing. A common perception about using other people’s names on anonymous sites, whether the names are real or fictitious, is that these names are used to bully other people, as Black et al. previously noted [2]. Yet, while coding our data, we noticed that some Yik Yak users chose to reveal their own identities or left trails that could lead someone to identify them. Many yaks included either inquiries about other users’ online IDs for different social media platforms/instant messengers, or statements (either in yaks or in replies) revealing their own online IDs. Multiple yaks asked for other users’ Kik (Footnote 3) IDs, and users voluntarily shared what seemed to be their Kik IDs on Yik Yak. For instance, in response to the post “GF applications? Kik me! (Kik inside)”, one user replied “My kik is ******” (the actual ID used in this reply has been anonymized by the researchers). We have no way of knowing whether what looked like a Kik ID in this reply was the user’s own or somebody else’s, real or fake. Nor do we have any way of knowing whether the original poster of the yak had actually intended to solicit other users’ Kik IDs or just posted the yak as a joke. However, the fact that some Yik Yak users do mention and/or are willing to share what look like online IDs on Yik Yak, an anonymous platform, is noteworthy.

Seeing users of one anonymous platform (Yik Yak) sharing their online identities from another, more or less anonymous platform was quite unexpected. And realizing that the IDs shared on Yik Yak were mostly for one-on-one chat/video chat applications such as Snap Chat, Glide and Kik was even more surprising. While any social media platform accounts can be created with pseudo-identities, users sharing non-anonymous platform IDs on an anonymous platform is a type of behavior that warrants further investigation.

In addition, based on the evidence we saw in multiple yaks, we think it is safe to assume that at least some Yik Yak users do use the application to connect with others online and possibly, in some cases, to meet other users offline. Since Yik Yak users are all located in close proximity, it is easier for them than for users of other online platforms to take online interactions offline and meet other users in person. This is also a type of behavior that requires further exploration.

Personal Experience. We coded posts that describe personal experiences or tell personal stories under this category.

Expressing Emotion. Posts that express one’s current state of mind and feelings such as anger, loneliness, stress, boredom, despair, and gratitude were coded under this category.

Campus Life. This is another code that we borrowed from Black et al. [2]. Black et al. used this code to denote school-related postings. However, unlike Black et al., who used a separate code, Greek, to mark fraternity/sorority-related postings, we included course-related, grade-related, and campus-life-related postings as well as fraternity/sorority-related postings under this category.

Food, Sports and Life. We used this category to capture postings that mention events, objects and sentiments related to everyday lives except school-related ones.

Pop Culture. Posts that mention issues, events, products, or ideas related to pop culture.

Humor, Joke and Rapping. This category included explicit jokes and funny stories. We had a post that sounded very much like “rapping.” The post said “We living two to a dorm, ain’t out of the norm, got noodles and oddles, AKAs and Poodles, swipes at the door, but wait there’s more ... ,” and we decided to categorize the post under this category. Hence we named this category Humor, Joke and Rapping.

Yik Yak Related. A number of posts directly mentioned Yik Yak. Soliciting up-votes was the most common type of post in this category. For instance, we coded posts such as “Story time! 10 upvotes and I’ll post story #4!!” under this category.

Religious Statements. Biblical and religious posts were coded under this category.

Political Statements. Yaks that include political statements or references to any political issues and events were put under this code. Activists’ statements such as anti-racism, anti-sexism, feminist, anti-drug and remarks that showed political consciousness were also coded under this category.

Fig. 2. Relative frequency of primary codes

Announcement. We borrowed this category from Black et al. They defined this category to include “posts that are making a statement or imparting information or wisdom. This category included posts that were not able to be contextually understood or otherwise categorized [2]”.

Not Codeable. Some yaks were not at all decipherable. Whenever we could not understand the meanings of yaks, we put them under this category instead of Announcement.

As shown in Fig. 2, Announcement was the largest code, covering 21% of collected yaks, followed by Sex (14%), Expressing Emotion (13%), Friendship/Partnership/Dating (12%), Information Seeking (9%), Campus Life (8%) and Trading (5%). The remaining 14 categories together accounted for less than 20% of the data. (Drug accounted for 1.5% of yaks, while Threats/Crimes and Hate Speech accounted for 0.75% and 0.25%, respectively.)

3.2 Thread Structure Analysis

Figure 3 illustrates the power-law relationship between thread lengths and their frequencies, which is not unusual in online communities. For example, there were 1,915 threads of length 1 in our dataset, meaning that the original poster (OP) posted a yak and did not receive any replies. In contrast, only 2 threads reached the maximum length of 102.

Fig. 3. Frequency of threads by thread lengths
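One rough way to check whether a length-frequency table looks like a power law is to fit a straight line to the points on a log-log scale: a power law f = C · L^(−α) becomes log f = log C − α log L, so the slope estimates −α. The sketch below is only an illustrative diagnostic, not an analysis reported in this paper; the function name `loglog_slope` is hypothetical, and an ordinary least-squares fit on log-log data is known to be a crude estimator.

```python
import math


def loglog_slope(freq_by_length):
    """Least-squares slope of log(frequency) against log(thread length).

    Assumption: freq_by_length maps a positive thread length to a positive
    count of threads with that length. For an exact power law
    f = C * L**(-alpha), the returned slope equals -alpha.
    """
    points = [
        (math.log(length), math.log(freq))
        for length, freq in freq_by_length.items()
        if length > 0 and freq > 0
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic data following f = 1000 * L**-2, used only to exercise the code.
    synthetic = {length: 1000 * length ** -2 for length in range(1, 11)}
    print(loglog_slope(synthetic))
```

A clearly negative slope with points lying close to the fitted line would be consistent with the heavy-tailed shape described for Fig. 3.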

On social media platforms and in online communities, having diverse participants in discussion threads might be essential for the sustainability of the community, since contributions from different people bring a broad range of new ideas and information into the community. To understand such user participation behavior in Yik Yak threads, we computed the average number of unique participants for threads of each length, as shown in Fig. 4. On average, there were 4.52 unique participants in discussion threads, and the standard deviation was 1.14. Where values were missing (e.g., there were no threads with length 63), we simply used the mean of the neighboring existing values (i.e., the average participants in threads with lengths 62 and 64) to connect the graph line.

Fig. 4. Average number of participants by thread lengths
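The averaging and gap-filling behind Fig. 4 can be sketched as follows. As before, the sketch assumes a thread is a list of participant labels, one per message, and the function names are hypothetical. For a gap spanning more than one missing length, the text specifies nothing, so this sketch simply assigns the same neighbor mean to every missing length in the gap; that choice is an assumption.

```python
from collections import defaultdict


def average_participants_by_length(threads):
    """Map each observed thread length to the mean number of unique participants."""
    unique_counts = defaultdict(list)
    for thread in threads:
        unique_counts[len(thread)].append(len(set(thread)))
    return {
        length: sum(counts) / len(counts)
        for length, counts in unique_counts.items()
    }


def fill_gaps_with_neighbor_mean(avg_by_length):
    """Fill unobserved lengths with the mean of the nearest observed neighbors.

    Example from the text: with no threads of length 63, the value plotted
    at 63 is the mean of the values at lengths 62 and 64.
    """
    observed = sorted(avg_by_length)
    filled = dict(avg_by_length)
    for lower, upper in zip(observed, observed[1:]):
        neighbor_mean = (avg_by_length[lower] + avg_by_length[upper]) / 2
        for missing in range(lower + 1, upper):
            filled[missing] = neighbor_mean
    return dict(sorted(filled.items()))


if __name__ == "__main__":
    threads = [["OP"], ["OP", "A"], ["OP", "OP"], ["OP", "A", "B", "A"]]
    averages = average_participants_by_length(threads)
    print(fill_gaps_with_neighbor_mean(averages))
```

The filled values exist only to draw a continuous line; they are not observations and should not enter summary statistics such as the overall mean.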

Figure 5 illustrates the average Participant Relevant Diversity by thread length. It conveys how many unique participants contributed to a yak thread relative to the number of yaks and replies in the thread (see Sect. 2.3, where the metric is introduced as Participant Diversity). The ‘elbow’ of the graph is found around Thread Length = 13, and the curve decreases steadily except for slight ripples around Thread Lengths 45 and 49. In other words, the number of unique participants in a yak thread does not grow as fast as the length of the thread. Thus, long yak threads generally result from frequent communication among a small group of participants.

Fig. 5. Average Participant Relevant Diversity by thread lengths

4 Discussion and Conclusion

A few points stood out when we analyzed our dataset. “Sex” and “Friendship/Partnership/Dating,” the two codes that we used to mark yaks that mention relationships, whether sexual or non-sexual, made up more than one quarter of our yak data. Some users posted yaks that included very private, sexual statements. Some posted yaks to seek such relationships. Some posted yaks simply to find people to hang out with. Another 13% of the yaks were marked as Expressing Emotion, under which yaks that expressed and revealed very personal feelings and emotions were coded. Some even shared their own personal stories (i.e., Personal Experience). Looking at these yaks, we could not help but think that these users, college students, may be in need of connecting with other people, real people.

Consistent with common preconceptions, there were yaks that included offensive and vulgar language as well as yaks that mentioned and described criminal activities. Yet these were only a fraction of the yaks we analyzed. We do not despise the invention of paper even though paper has been used to print pornographic materials. Following the same logic, we believe that misuse of Yik Yak should not be a reason to stigmatize the platform or its users. As we have seen in our dataset, many yaks were used to deliver informational content, and in some cases users might feel more comfortable posting their messages anonymously regardless of the language used in the post.

From the point of structural analysis, we found that 4.52 participants on average contributed to threads that had lengths between 1 and 102. This tells us that even for long threads (e.g., length = 102), only a handful of people (e.g., 4) participated in the thread by communicating with each other using multiple yaks and replies. Although the idea might be far-fetched, we surmise that this phenomenon might be telling us that the most effective size of a voluntary and interest-based online gathering, where its members can communicate with each other comfortably, is approximately 4–5.

Limitations in this study include the following factors:

  • (1) The data collection was conducted only from two locations—VSU and LSU. Although we collected yak data over a four-month period, and gathered over 5,000 yaks, we cannot say our dataset represents the entire Yik Yak community. This limits us from building a general understanding of user behaviors occurring in the broader Yik Yak community.

  • (2) In addition to (1), we ended up coding only a small portion of the collected data (800 coded vs. 5,437 collected). This further limits our ability to understand overall Yik Yak users’ behaviors.

  • (3) In some cases, we had to infer the meanings of the postings. Many yak messages included slang, emoticons and pictures. Some of these yaks were not easily understood by the authors. In order to code such texts, we consulted just one undergraduate student, the fourth author of this paper. Our coding therefore depended on one person’s interpretation of the yaks, and we were not able to measure inter-rater reliability (there was only one rater).

  • (4) Each code was applied to yaks in a mutually exclusive manner. That is, once we assigned a code to a yak, we were not able to add second and third codes. However, in many cases, putting a yak into just one category seemed limiting. If we had allowed multiple codes per yak, as if we were assigning tags to yaks rather than putting yaks into separate folders, the results might have been drastically different.