1 Introduction

A dominant stress factor for students today is being bullied at school. Bullying is defined as “any repeated negative activity or aggression intended to harm or bother someone who is perceived by peers as being less physically or psychologically powerful than the aggressor(s)” [22, p. 9]. Reports vary in the amount of bullying that takes place in the schools, in part because the rate of bullying varies by age and type of bullying, as well as by culture and subgroup. On an annual basis, it is estimated that somewhere between 20 % and 56 % of young people are involved in bullying [12], with involvement defined in terms of the roles of perpetrator, victim, or both (bully-victims). The most prevalent types of bullying, in descending order, are name calling, teasing, rumor-spreading, physical incidents, purposeful isolation, threats, stealing personal belongings, and sexual harassment [16]. Verbal bullying is more prevalent than physical aggression, and cyberbullying is the least prevalent [15]. Middle school children are more likely to be involved in bullying than are high school children [25], and certain subgroups are particularly at risk for bullying. For example, in one study 60 % of LGBT (Lesbian, Gay, Bisexual, and Transgender) youth reported victimization in a 30-day period compared with 28.8 % of heterosexual and cisgender youth [11].

The effects of bullying for all parties involved are devastating. Involvement in bullying is correlated with poor mental and physical health, including a higher incidence of psychosomatic illnesses [8]. Peer victimization is associated with high levels of stress and altered cortisol levels in stress tests [18]. Victims of bullying also have higher rates of absenteeism and receive lower grades than those who are not bullied [16]. Most troubling is the correlation between being bullied and depression, especially depression leading to suicidal ideation and behaviors, as well as other forms of self-harm [9, 12]. Suicidal ideation is reported for bullies as well, especially bully-victims [17]. Moreover, longitudinal research shows that bullies are at high risk for encounters with the law [23], and children who bully have trouble with relationships through adolescence into adulthood. Bullying can have lasting consequences not only for bullies and for victims [2, 16] but also for entire communities. Of those children who have perpetrated school shootings, 71 % were chronically bullied in their schools [29]. For all these reasons, bullying is considered a serious public health risk that must be addressed through interventions in schools [4].

Until fairly recently the prevention and intervention literature on bullying was relatively sparse [16]. In the last twenty years, a number of intervention studies have examined school programs, and the general consensus is that whole-school approaches are the best evidence-based methods for reducing school bullying [32]. Whole-school intervention programs generally include an anti-bullying policy, school-wide awareness raising, curriculum activities, a plan to deal with reported cases of bullying, and increased monitoring. Despite the international recognition of whole-school intervention programs, these programs reduce school bullying by only 20–30 %, according to a recent meta-analysis of cross-cultural studies on whole-school interventions [7]. Obviously, these programs leave much room for improvement.

One interesting finding in the aforementioned meta-analysis is that playground supervision is strongly related to program effectiveness. The need for schools to increase monitoring of students to prevent bullying has been widely recognized [32]. Most bullying takes place during unsupervised periods and in unsupervised areas at school [3]. According to a 2011 report from the U.S. Department of Justice on school crime [27], close to 50 % of bullying took place in the school hallway/stairwell, 22 % outside on the school ground, and 12 % in the bathroom/locker rooms. Observational studies have shown, however, that bullying is not always entirely covert. Peers are present in 85 % of bullying episodes, but peers rarely intervene to stop the bullying [26]. Some researchers have suggested that peers are pressured to be involved in the bullying for fear they will become victims themselves [26]. Moreover, peers rarely report the bullying after the episode [32].

Research also shows that increased monitoring is needed in the classroom. In the 2011 report cited above, approximately 33 % of bullying at school took place in the classroom, making it the second most frequent location for bullying [27]. As Atlas and Pepler have observed [3], when bullying occurs in the classroom it frequently goes undetected by teachers, and even when it is detected, teachers often fail to intervene.

Given this need for more effective monitoring, we propose a school-wide monitoring system that combines state-of-the-art machine learning systems with off-the-shelf technologies to detect bullying episodes. Some of the technologies utilized in our proposed system include smart ID badges, wearables with heart rate sensors for at-risk students, and surveillance cameras. A multimodal machine learning system located in the cloud processes the data produced by these technologies and alerts teachers and staff via mobile devices (or vibrating smart watches worn by teachers in the classroom) to potential bullying incidents.

Since each alert produced by the system would be logged and videos tagged, teachers and staff would be incentivized to implement interventions when bullying is detected. Moreover, reviews of the system logs and videos of detected bullying would allow school personnel to review their methods for handling bullying by providing more information about the locations, causes, and actors involved in bullying as well as teacher/staff response rates. In addition, false positives could be marked and fed back to the system for relearning and continuous detection improvement.

2 System Design

The basic idea of our bullying detection/alert system is illustrated in Fig. 1. This system combines smart school badges, wearables with heart rate (HR) monitors, surveillance cameras, multimodal machine learning, cloud computing, and mobile devices and is intended to tag videos and alert staff when bullying is detected.

Fig. 1. Bullying detection/alert system

As illustrated in Fig. 2, the system identifies potential bullying in three ways: (1) by tracking and assessing the proximity of known bullies to known students at risk for bullying; (2) by monitoring stress levels of students via HR analysis; and (3) by recognizing actions, emotions, and crowd formations associated with bullying. The methods in system 1 focus on students who are known to have trouble with bullying. The methods in system 2 can monitor either at-risk students specifically or all students. The methods in system 3 are aimed at detecting any incident of bullying and involve the integration of several machine learning systems. Although some of these component systems could be employed alone for monitoring bullying (the tracking system, for instance), our vision is to combine all three systems into a larger bullying detection/alert system.

Fig. 2. Three general systems that together will detect bullying

2.1 Tracking System

A tracking system is necessary for monitoring the locations of known bullies and students at risk for bullying. A known bully in a restroom, for example, with a known student at risk might be reason enough to issue an alert. As noted in Fig. 2, three methods could be employed (or combined) for tracking students. Two methods depend on surveillance cameras and one on smart chips in student IDs.

The two tracking methods that utilize video images make use of state-of-the-art face tracking and person re-identification algorithms. Face tracking combines face recognition/detection methods with a tracking mechanism and is a non-intrusive natural method for tracking people. Some recently proposed algorithms that might form the foundation of our face tracking system include [14, 31]. The basic idea of this system would be to track students through school buildings by recognizing their faces as they move from location to location. This technology would work best in close quarters where multiple cameras are available.

Person re-identification is defined as the task of recognizing a given person across any number of views (some non-overlapping) in a distributed network of cameras. Person re-identification is not dependent on face recognition/detection or other biometric markers, such as gait, but rather tracks the dominant colors in clothing and would be the technology of choice for tracking students across school grounds. Person re-identification would probably not work well, however, in schools requiring uniforms.

Person re-identification is challenging because the system must correctly identify the same person despite changes in illumination, pose, background, occlusions, and variability in camera resolutions and viewpoints. Because many surveillance cameras have low resolutions and cover large areas, an image of a student might be only a small number of pixels high by a small number of pixels wide. Despite these challenges, person re-identification is a fairly mature technology, with some of the most powerful algorithms having been developed in the last couple of years [19, 21, 28].
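To make the dominant-color idea concrete, the following minimal sketch compares person crops by coarse color histograms and histogram intersection. It is an illustration only and is not one of the algorithms in [19, 21, 28]; the pixel representation (lists of RGB tuples), the bin count, and all function names are assumptions introduced here.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized coarse RGB histogram for a list of (r, g, b) pixels."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical coarse color distributions."""
    return sum(min(p, h2.get(bin_id, 0.0)) for bin_id, p in h1.items())


def best_match(query_pixels, gallery):
    """Return (student_id, similarity) for the gallery crop closest to the query.

    `gallery` maps a hypothetical student ID to the pixels of a stored crop.
    """
    query = color_histogram(query_pixels)
    scored = (
        (student_id, histogram_intersection(query, color_histogram(pixels)))
        for student_id, pixels in gallery.items()
    )
    return max(scored, key=lambda item: item[1])
```

A histogram of this kind ignores where colors appear on the body, which is one reason such a scheme would struggle in schools requiring uniforms.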

Another method for tracking students would be to utilize student IDs containing devices offering some means, such as radio frequency identification (RFID), for tracking people in real time. Unlike person re-identification and face tracking, which require surveillance cameras, smart IDs would be able to provide the location of students even in areas where cameras are socially prohibited, such as restrooms, which, as noted in the introduction, account for some 12 % of bullying in the schools [27]. Unfortunately, this technology is still in its infancy, and commercial systems that track items and persons are extremely expensive. Moreover, IDs are easily lost, misplaced, and left behind in bags and other belongings.

Despite the advantages smart IDs would offer, they are not a requirement. Depending on the saturation of surveillance cameras, face recognition and re-identification systems could also be employed to log students as they enter and leave school restrooms and other locations where cameras are prohibited, thereby reducing the need for this method of tracking.
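Whichever tracking method supplies the locations, the proximity rule described at the start of this section could be sketched roughly as follows. This is a minimal illustration under stated assumptions: the `Sighting` record, the zone names, and the 60-second window are hypothetical and are not part of the proposed design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    student_id: str
    zone: str         # e.g. "restroom-2F"; zone names are hypothetical
    timestamp: float  # seconds on a clock shared by all trackers


def proximity_alerts(sightings, known_bullies, at_risk, window_s=60.0):
    """Return sorted (bully, at_risk_student, zone) triples for co-located pairs.

    Two students count as co-located when they are seen in the same zone
    within `window_s` seconds of each other (an assumed threshold).
    """
    alerts = set()
    for a in sightings:
        if a.student_id not in known_bullies:
            continue
        for b in sightings:
            if (
                b.student_id in at_risk
                and b.student_id != a.student_id
                and b.zone == a.zone
                and abs(a.timestamp - b.timestamp) <= window_s
            ):
                alerts.add((a.student_id, b.student_id, a.zone))
    return sorted(alerts)
```

The sketch treats sightings from cameras and from smart IDs identically, so either source (or both) could feed the same rule.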

2.2 Stress Monitoring System

Bullying produces stress in all parties involved: victims, bullies, and bystanders [3]. Since HR is correlated with stress and is a reliable measure of it [8], detecting HR is important. There are two methods our system could use for detecting HR: surveillance cameras and HR sensors hidden in the clothing of students at risk for bullying.

The least obtrusive method for monitoring HR is through video magnification of skin pigmentation [24, 31], which is illustrated in Fig. 3. Video magnification reveals temporal variations that are impossible to detect with the naked eye. In [31] the authors propose an elegant real-time method for measuring pulse by magnifying the subtle variations in skin pigmentation as blood flows through the skin. Video magnification can also measure breathing rate by revealing low-amplitude motions [31], which might also be useful for detecting stress and anxiety. With video magnification it would be possible to detect the stress levels of any number of students.
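The sketch below illustrates how a pulse rate might be read from such color variations. It is not the magnification method of [31]; it substitutes a much simpler technique, picking the strongest frequency of an averaged skin-color signal with a plain discrete Fourier transform. The input (one average color value per video frame for a skin region), the 0.7 to 4.0 Hz search band, and the function name are assumptions.

```python
import math


def estimate_bpm(samples, fps, lo_hz=0.7, hi_hz=4.0):
    """Estimate pulse in beats per minute from a per-frame skin-color signal.

    `samples` holds one averaged color value per frame and `fps` is the
    camera frame rate. Only frequencies in the assumed band
    [lo_hz, hi_hz] (42 to 240 bpm) are considered.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the constant skin tone

    best_freq, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * fps / n
        if not (lo_hz <= freq <= hi_hz):
            continue
        angle = 2 * math.pi * k / n
        re = sum(c * math.cos(angle * i) for i, c in enumerate(centered))
        im = sum(c * math.sin(angle * i) for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power

    if best_freq is None:
        raise ValueError("signal too short for the requested frequency band")
    return best_freq * 60.0
```

The frequency resolution is `fps / len(samples)`, so a ten-second clip at 30 frames per second resolves pulse in steps of 6 bpm; longer windows give finer estimates at the cost of slower response.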

Fig. 3. Illustration of color amplification revealing blood flow as the heart beats

One way to handle locations where surveillance cameras are unable to detect pulse rates is to have students at risk for bullying wear clothing that places HR monitors close to an area of the body where HR can best be detected while remaining discreet. HR is easily detected in the groin area, the temple area, and on the chest [5, 13]. As illustrated in Fig. 4, we are designing kits to hide off-the-shelf HR chest monitors in upper body foundation garments. Using any HR monitor with Bluetooth capability and an API, we could write software that would send HR information to our proposed system via the internet connection of the child’s smartphone.

Fig. 4. Steps for concealing the DualTRNr monitor in teen’s upper foundation clothing

Figure 4 demonstrates how the DualTRNr HR monitor (using a kit containing one Velcro fastener and two snappable soft pads) might be concealed in upper foundation clothing for females (ages 8–17). The same idea would apply to boys (ages 5–17) using t-shirts. Instructions for using this kit for girls appear in rows 1–3 of Fig. 4. In row 1 the DualTRNr HR sensor is unsnapped from the original strap. In row 2, one side of the Velcro strip is applied to the front of the HR monitor, and the snaps on the HR monitor are covered with the soft pads (since that side will be pressing into the skin). In row 3 the other side of the Velcro strip is attached to the underside of the garment. The DualTRNr HR sensor is then attached so that it fits next to the skin and registers the student’s HR. Ideally, the HR monitor selected would come in three basic colors (white, black, and beige), which would help camouflage the HR sensors when worn with any upper body foundation garments.

2.3 Bullying Detection System

In this section we focus on three additional machine learning systems that could aid in the detection of bullying: (1) emotion classification, (2) action classification, and (3) crowd detection. Although none of these systems has been designed to detect the specific emotional displays, actions, and crowd formations indicative of bullying, we hypothesize that each of these technologies is capable of doing so.

Emotion recognition is an important area in HCI and has applications not only in affective computing but also in telecommunications, behavioral science, computer animations, etc. Because this technology plays an important role in so many domains, facial expression recognition is an active and mature area of research. For a recent review of some of the best approaches see [5]. For our purposes, an emotion classification system would work best with cameras that provide good views of faces, such as those located in hallways/stairwells and classrooms, two locations where approximately 83 % of reported bullying in the schools takes place [27].

Anger, contempt, and fear are some of the basic facial expressions that are most likely associated with bullying. Subtle differences in other emotional displays might also be associated with bullying. For example, bullies, victims, and onlookers might all smile, but each of these smiles would differ from each other and from smiles not associated with bullying. The bully might display a more sadistic smile, the victim a forced smile intended to appease the aggressor, and onlookers a smile suppressing fear or embarrassment, and each of these smiles would differ from smiles indicative of joy and delight. There is no reason to suppose that an emotion recognition system would be unable to discriminate subtle differences in emotional displays that are associated with bullying. In [13], for instance, a system is described that succeeds in distinguishing smiles indicative of delight from those that express normal social anxiety.

Crowd detection is a rather new area of research. In the last decade, research has focused on the automated analysis of higher level crowd characteristics, such as crowd configuration [1], abnormal crowd behavior detection [30], and real-time detection of violence [10], primarily by analyzing flow [10, 30] and recently through texture analysis [20]. Crowd detection applied to bullying would look at the way students group together to both witness and engage in bullying and would be very appropriate for video of the school grounds, where, as noted in the introduction, some 22 % of bullying occurs.

As noted in the introduction, physical incidents (i.e., behaviors such as hitting, kicking, throwing objects, or any form of overt violence toward another student) are the fourth most common form of bullying [16]. One way to detect actions associated with bullying would be to train classifiers to recognize them. Such a system would be useful in all school-wide locations.

Action classification, like facial emotion classification, is an active area of research but far more complicated. Some challenges specific to action classification include large variations in action performance produced by variations in people’s anatomy and spatial and temporal variations (including variations in the rate people perform actions) [6]. Distinguishing actions related to playfulness (such as wrestling and throwing objects at one another) from acts of bullying will certainly prove an additional challenge. Solutions to this problem might include training the system to ignore actions involved in playing common children’s games, adding social information to the system (e.g., a list of friends for each student), and integrating multiple systems (e.g., emotion classification with action classification).
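As a rough illustration of the last two solutions, the sketch below adjusts an action classifier's confidence using a friend list and an emotion label for the student on the receiving end. It is a hypothetical example: the action and emotion labels, the weights (0.5 and 0.3), and the function name are assumptions and would have to be learned or tuned in practice.

```python
ROUGH_ACTIONS = {"hitting", "kicking", "throwing", "wrestling"}  # hypothetical labels
CONCERNING_EMOTIONS = {"anger", "contempt", "fear"}  # expressions linked to bullying above


def bullying_score(action, action_conf, actor, target, friends, target_emotion):
    """Combine an action label with social and emotional context.

    `action_conf` is the action classifier's confidence in [0, 1];
    `friends` maps each student to a set of friends. Returns a score
    in [0, 1], where higher values suggest bullying rather than play.
    """
    if action not in ROUGH_ACTIONS:
        return 0.0
    score = action_conf
    if target in friends.get(actor, set()):
        score *= 0.5  # rough play between friends is more likely benign
    if target_emotion in CONCERNING_EMOTIONS:
        score = min(1.0, score + 0.3)  # a concerning expression raises the score
    return score
```

Under these assumed weights, wrestling between listed friends with a neutral or happy expression scores low, while the same action directed at a non-friend who shows fear scores high.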

Another possibility that should be mentioned is action classification using motion sensors, as proposed in [33], where researchers explored using the acceleration and gyro sensors found in some smart phones for detecting some actions associated with bullying. Distinguishing bullying from nonbullying activities using these sensors is difficult, and probably only the at-risk students would agree to use the bullying detection app on their phones. As with smart IDs, phones can be misplaced or left behind. However, this is an option that could be useful if used in tandem with some of the other systems utilizing video images.

2.4 System Integration

In designing our Bullying Detection/Alert system we worked with the idea of integrating as many of the systems listed above as possible. As already mentioned, some systems, such as the tracking system, could function alone to at least alert personnel of potential problems with students at risk. HR monitoring, in contrast, would most likely function as an additional variable in other systems since HR increases for many reasons having nothing to do with bullying.

As illustrated at the top of Fig. 1, the Bullying Detection/Alert system would combine the decisions of all systems, including a simple rule-based system that might fire an alert and tag appropriate video when the tracking system detects a known bully approaching a known victim. In other situations, the combined decisions of all the systems would be summed to determine when to send out an alert to personnel. Based on the decisions of the systems, probabilities could be calculated that fire different levels of alert that could be sent to different locations and personnel depending on the type of alert and the location and severity of the bullying episode. Figure 1, for example, shows an alert being sent to a playground monitor’s tablet as well as to a desktop located in some office. If bullying is detected in the classroom while the teacher is at the blackboard with her back to the class, the alert might trigger her smart watch to vibrate and even indicate, as displayed in Fig. 5, the general location in the classroom where the bullying activity is being detected. Interventions would depend upon school policy but could be as simple as a teacher turning her attention to the location indicated on her watch or a monitor approaching the location in the playground where bullying is being detected.
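One way this combination and routing might look is sketched below. It is a minimal illustration under stated assumptions, not the system's actual decision logic: subsystem scores are taken to lie in [0, 1], the sum is normalized to a mean so that thresholds do not depend on how many subsystems report, and the thresholds, level names, and device names are all hypothetical.

```python
ROUTES = {  # hypothetical mapping of location and alert level to devices
    "classroom": {"low": ["teacher_watch"], "high": ["teacher_watch", "office_desktop"]},
    "playground": {"low": ["monitor_tablet"], "high": ["monitor_tablet", "office_desktop"]},
}


def alert_level(scores, bully_near_at_risk=False, low=0.4, high=0.7):
    """Map subsystem scores to 'none', 'low', or 'high'.

    `scores` maps a subsystem name (e.g. "emotion") to a score in [0, 1].
    The tracking rule alone is enough to raise at least a 'low' alert.
    """
    combined = sum(scores.values()) / len(scores) if scores else 0.0
    if combined >= high:
        level = "high"
    elif combined >= low:
        level = "low"
    else:
        level = "none"
    if bully_near_at_risk and level == "none":
        level = "low"
    return level


def recipients(location, level):
    """Return the devices to notify; unknown locations fall back to the office."""
    if level == "none":
        return []
    return ROUTES.get(location, {}).get(level, ["office_desktop"])
```

For example, `recipients("playground", alert_level({"emotion": 0.9, "action": 0.8, "crowd": 0.7}))` would notify both the monitor's tablet and the office desktop under these assumed thresholds.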

Fig. 5. A smart watch notifying a teacher of two separate bullying incidents and their approximate locations in the classroom (back middle and front right).

Regardless of whether staff are able to intervene during a bullying episode or not, whenever bullying is detected, video from relevant cameras would be tagged and a list of bullying episodes and associated data (video and tracking information) archived. This information could be retrieved at a later time for review by school counselors and other personnel. Appropriate decisions could then be made regarding interventions (e.g., continuing to monitor the situation, changing seating arrangements in a classroom, speaking with those involved in a bullying episode, playing video back for a bully to watch and discuss, etc.).
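The archive described here, together with the false-positive feedback mentioned in the introduction, could be organized along the following lines. The sketch is illustrative only: the record fields, class names, and in-memory storage are assumptions, and a deployed system would need persistent, access-controlled storage.

```python
from dataclasses import dataclass, field


@dataclass
class BullyingEvent:
    event_id: int
    location: str
    timestamp: float
    level: str
    video_tags: list = field(default_factory=list)  # e.g. camera and time references
    false_positive: bool = False
    staff_responded: bool = False


class EventLog:
    """In-memory archive of detected episodes (hypothetical structure)."""

    def __init__(self):
        self._events = {}

    def record(self, location, timestamp, level, video_tags=()):
        event = BullyingEvent(len(self._events) + 1, location, timestamp,
                              level, list(video_tags))
        self._events[event.event_id] = event
        return event

    def mark_false_positive(self, event_id):
        self._events[event_id].false_positive = True

    def mark_responded(self, event_id):
        self._events[event_id].staff_responded = True

    def for_review(self):
        """Events not marked as false positives, oldest first."""
        kept = [e for e in self._events.values() if not e.false_positive]
        return sorted(kept, key=lambda e: e.timestamp)

    def retraining_negatives(self):
        """Events marked as false positives, for feeding back to the detectors."""
        return [e for e in self._events.values() if e.false_positive]

    def response_rate(self):
        """Fraction of reviewable events to which staff responded."""
        kept = self.for_review()
        return sum(e.staff_responded for e in kept) / len(kept) if kept else 0.0
```

Keeping false positives in the log rather than deleting them preserves them as labeled examples for relearning.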

3 Conclusion

In this paper we present a design for a bullying detection/alert system for school-wide intervention that combines wearables with HR sensors, surveillance cameras, machine learning systems, cloud computing, and mobile devices. The system alerts school personnel when potential bullying is detected, logs events, and tags video and other data for easy retrieval later by school personnel. The system identifies potential bullying by tracking and assessing the proximity of known players in bullying, by monitoring stress levels of students via HR analysis, and by recognizing actions, emotions, and crowd formations associated with bullying. Once bullying is detected, various levels of alerts produced by the system would be sent out to various devices, including smart watches, depending on both the level and the location of detected bullying events. Since all events would be logged and the videos and other data associated with the event would be tagged and archived, school personnel could review this material to implement appropriate interventions and assess the effectiveness of their whole-school strategies for reducing bullying.

Not addressed in this paper are security issues and some important ethical considerations in implementing a bullying detection system. Certainly the privacy of all associated with incidents of bullying would need to be safeguarded, and measures would need to be taken to ensure that students are allowed sufficient room for horseplay and appropriate expressions of anger and aggression. Some may fear that schools today are becoming more like prisons than institutions of learning with all the surveillance cameras and other security systems now being introduced throughout the educational system. Because the effects of violence and bullying are devastating for all parties involved, our communities may decide that the benefits of cameras and machine learning systems that detect violence and school bullying outweigh some of these concerns. These systems need not be oppressive. If handled carefully, they could prevent physical and psychological harm to our children while simultaneously allowing for healthy expressions of negative affect. Reducing bullying and violence in the schools would certainly make our schools a better place for children to learn.