1 Introduction

Wearable technology such as Google Glass has the capability to capture and deliver information to its user with an immediacy that surpasses the current generation of input/output devices. Combined with Google Now, Search, and a rich store of personalized, situation-sensitive data, the capability to swiftly display information enables a new genre of consumer grade human-computer interaction, where the computer ceases to be the focus, and instead becomes an inconspicuous assistant to ordinary activities. But this immediacy and amalgamation with the user carries a steep price when the bond is broken with an untimely interruption.

Delivering information at the wrong moment or delivering the wrong information for the user’s current situation can disrupt work or social interactions, and exacerbate the very problems that wearables such as Glass might solve. Muting all notifications is an option, but frequent manual muting and unmuting would itself be disruptive. Instead, imagine a “magic” knob that is driven by the moment-to-moment measured interruptibility of a user. With such a knob, Glass may attain the sweet spot of being actively informative without being overly obtrusive. It would deliver notifications to users precisely when they have the time and capacity to perceive them. For example, it would be capable of prioritizing incoming emails or social network messages to present the most important items first and defer others to a more opportune time; it could also summarize or suppress lower-level details, such as those on a map, when it detects that the user is busy. However, high-priority notifications that might be important regardless of state can bypass this filter and be shown to the user immediately. In a dual-task scenario, this system can let the user focus on a primary task and only interrupt the user to work on another task when it detects the user can handle the additional effort.

This paper introduces a physiological-based notification filtering system, Phylter, that sends pertinent notifications to a user only when the user is in the proper cognitive state to handle additional information. The system uses physiological sensing as a means to time, suppress, and modulate information streams in real time. We posit that functional near-infrared spectroscopy (fNIRS), a lightweight brain-monitoring technology, has promise to control this framework because of its access to measures of blood flow in the brain, an overarching barometer of the user’s level of cognitive workload – the degree to which present engagements have posed computational demands on short term working memory. Using machine learning algorithms trained to distinguish known instances of high cognitive workload and low cognitive workload exclusively from fNIRS data, our brain-augmented Glass prototype distinguishes the neural signature of its user’s state of short-term memory cognitive workload and applies this knowledge to capitalize on the most opportune moments to deliver information.

2 Related Work

2.1 Short-Term Memory Workload and Interruptions

In the age of mobile computing and social media, interruptions from e-mail [17], instant messages [7], and other services which are granted unsupervised access to the browser, cell phone, or wearable computer’s output threaten to destabilize a user’s ability to focus on a singular task. In one study by Bailey and Konstan, multitasking participants reported twice the anxiety, committed twice the number of errors, and required up to 25% longer time on a primary task when interruptions arrived during rather than in-between tasks [5]. To mitigate the costs of interruption, research has explored a breakpoint-based method for mediating notifications in which statistical models infer likely points of transition between tasks and schedule notifications for these moments [15]. But this method requires complete knowledge about all of the user’s concurrent tasks, and assumes that the sum of the user’s cognition can be understood within the digital environment. The second assumption, in particular, loses validity in a wearable computing context – where the interface is no longer the main object of the user’s attention. Horvitz and Apacible devised a mathematical model of the cost and utility of interruptions, but their model assumes knowledge of the attentional state of the user and the utility of interruption [13].

In cognitive science, working memory refers to the mental resources dedicated to storage, retrieval, and manipulation of information on a short timescale – measured in seconds, not minutes. It is involved in higher cognitive processes such as language, planning, learning and reasoning [3]. Some of the most popular models of working memory [29, 39] posit that the system operates under severe constraints, with the numerous tasks that might engage it at any moment competing for a limited pool of resources. A task pushing the upper bound of working memory’s phonological loop (the working memory component engaged by subvocal mental rehearsal) may not directly undermine the processing done by the visuospatial sketchpad (another component for visual simulation and recall), but because they draw from a common pool of computational resources, two simultaneous working memory tasks nonetheless limit overall performance. Cognitive workload is dependent on the characteristics of the task, of the operator, and of the environment [40]. Working memory and executive function engage areas in the prefrontal cortex, and the amount of activation increases as a function of the number of items held in working memory [23].

Research has explored the interruptibility of a user through physiological input such as heart rate variability and EEG [6], as well as pupil dilation [4, 16]. In these studies, the physiological sensor is calibrated to detect cognitive workload, as it has long been acknowledged that moments of low cognitive workload present the most opportune time for interruptions [21], in part since workload diminishes at task boundaries [4]. Tremoulet et al. found that by queuing questions and alerts until the user is in a state of low workload (measured by EEG, heart rate, and galvanic skin response), they could increase the number of tasks that the user could complete, reduce error rate, and also decrease decision-making time for the interrupting alert tasks [37].

2.2 Passive Brain-Computer Interfaces

Passive physiological interfaces portray the user’s present state of mind without continuously involved human effort. As such, they can supplement direct input with implicit input (derived from a physiological sensor attached to the user) and apply gleanable information to trigger adaptations that aid the user’s short-term or long-term goals. When the underlying physiological interface measures brain activity, these systems, known as passive or implicit brain-computer interfaces (BCIs), benefit the user by deducing state without additional effort on their part. In contrast to the much wider usage of brain sensors in active BCIs (where the user consciously manipulates mental activity in order to trigger an intended command) [41], passive BCIs support a practical defense mechanism towards inevitable physiological misclassifications and the small-but-not-negligible lag-time between the physical manifestation of a thought and the deliverable command [8].

In controlled experiments, real-time passive BCIs have proven to yield measurable improvements to users’ performance compared to static counterparts. Prinzel et al. used EEG signals to modulate levels of automation in simultaneous auditory and hand-eye coordination tasks [28] and Wilson and Russell used an EEG engagement index to decelerate UAVs or present alerts depending on what would most effectively sustain the user’s focus. Stripling et al. built a system where an operator could create rules for the user’s physiological state that triggered pre-recorded macros in order to manipulate a virtual environment when physiological conditions were met [36]. Recently, real-time adaptive systems have used passive fNIRS input to modify robot autonomy [33], control a movie recommendation engine [24], and modify the number of UAVs for an operator [1].

2.3 FNIRS

Propelled primarily by scientific and medical motives, brain monitors have improved dramatically in price-performance and resolution, and two devices, distinguished by their relative non-invasiveness and ease-of-use, have trickled into the field of Human-Computer Interaction, initially serving a small but significant community of disabled users. The more common tool for brain-computer interfacing, electroencephalography (EEG) provides a measurement of neuroelectrical firing in large populations of neurons situated by the scalp. EEG has high temporal resolution and is reliable for measuring responses to quick stimuli, but has low spatial resolution and suffers from motion artifacts as its reliability in depicting actual cognitive activity is undermined when the user is moving (Fig. 1).

Fig. 1. An fNIRS sensor with light sources and a detector.

As an alternative, functional near-infrared spectroscopy (fNIRS) measures blood-oxygenation levels in neural tissue as deep as 3 cm. The technique relies on the fact that infrared light penetrates bone and other tissues but is absorbed and scattered by oxygenated and deoxygenated hemoglobin. Conveniently, the optical properties of oxygenated and deoxygenated hemoglobin differ, and so the relative proportion of the two can be deduced from the infrared light returned to the detector [38]. It measures the same blood-oxygenation level-dependent signal as fMRI [35], but only measures the part of the brain where the sensor is applied. FNIRS has high spatial resolution, but because the changes in blood flow take several seconds to reach the brain, fNIRS is not suitable for direct input. Instead, fNIRS can portray more stable trends in the user’s mental state, and can be used to distinguish workload levels [9, 12] or multitasking [2, 33].

In many cases, fNIRS continues to provide moderate descriptions of its wearer’s brain even when the user is in motion. Head movements, heartbeats, and respiration can be corrected with filters, and standard computer interactions like typing and clicking do not interfere with the signal [10, 19, 32]. FNIRS sensors consist primarily of multiple infrared light sources (at two wavelengths to detect oxygenated and deoxygenated hemoglobin) and detectors, usually attached to a processing unit by fiber-optic cables. With advancements in signal processing, microelectronics, and wireless communications in recent years, fNIRS has become portable, supporting light sources and detectors in a self-contained unit. These wireless devices can accurately measure activity while users are performing real-world tasks such as running [18] or bicycle riding [27], with only slightly higher error rates than a traditional clinical device [34]. Feature selection can be used to improve the efficiency and accuracy of the machine learning algorithms that translate fNIRS signals to classifications in real time [25].

3 System Design

Phylter is a software tool that uses physiological input to schedule the delivery of notifications. It attempts to solve one of the main challenges of a wearable device (the ease with which it can bother its wearer at any moment) by using one of its key affordances (the proximity to skin and state-indicative biological markers). In technical terms, Phylter is a server that communicates with clients that deliver packets of messages and physiologically-based classifications about the user’s present state. It bases its decision to deliver a message on the message’s specified importance (contained in the message) and on a prediction of the user’s interruptibility. Although Phylter generalizes to a variety of physiological sensors and to any wearable device that accepts TCP/IP or Bluetooth input, the current software is calibrated to receive messages from a Unity virtual environment and physiological input from an fNIRS-based classification scheme written in Matlab; it redirects messages that meet the changing threshold to a Google Glass (a protocol shown in Fig. 2). The source code can easily be modified for other devices. Phylter builds on the framework described by Shibata et al., incorporating its three major components: (1) subscribing to or receiving passive sensor information, (2) accumulating or holding implicit input, and (3) interpreting the wearer’s state [30].

Fig. 2. Framework of the Phylter system. Phylter processes a continuous stream of physiological data, and when it receives a notification, it decides based on user state if it should send the notification to the user.

3.1 Physiological Input

Because individuals have different values of raw physiological signals, a back-end engine creates a machine learning classifier for each individual using our system, and then feeds real-time data into this model. We used the online fNIRS analysis and classification (OFAC) tool, shown to produce real-time classifications of fNIRS data with high accuracy [11]. Participants first complete trials of a task that stimulates known cognitive states – reference points that can later be used to determine the user’s state when the ground truth is otherwise impossible to gauge. For example, participants might complete trials of the n-back task [22], generating multiple labeled time-series which, after being described in terms of appropriate statistical features, serve as instances to the open source machine learning library LIBSVM. Trained on both high workload and low workload instances, LIBSVM ultimately allows for rapid binary classification on a moving window of time-segments in real-time. This system has been used to adapt a scenario where an interactive human-robot system changed its state of autonomy based on whether it detected a particular state of multitasking [33], measure preference signals to control a movie recommendation engine [24], and expand the motor space of high-priority targets in a visual search task [2].
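To make the training step concrete, the following Python sketch turns labeled time series into instances in LIBSVM's sparse text format (`<label> <index>:<value> ...`). The three features (mean, linear slope, and range) and the +1/-1 encoding of high and low workload are assumptions chosen for illustration; the text above does not specify which statistical features OFAC computes.

```python
from statistics import fmean
from typing import List, Sequence


def linear_slope(values: Sequence[float]) -> float:
    """Least-squares slope of the values against their sample index."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = fmean(values)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def extract_features(trial: Sequence[float]) -> List[float]:
    """Hypothetical feature set: mean, slope, and range of one time series."""
    return [fmean(trial), linear_slope(trial), max(trial) - min(trial)]


def to_libsvm_line(label: int, features: Sequence[float]) -> str:
    """Format one instance as '<label> 1:<v1> 2:<v2> ...' (indices start at 1)."""
    pairs = " ".join(f"{i}:{v:.6g}" for i, v in enumerate(features, start=1))
    return f"{label:+d} {pairs}"


# Assumed encoding: +1 = high workload trial, -1 = low workload trial.
high_trial = [0.0, 1.0, 2.0, 3.0]
low_trial = [1.0, 1.0, 1.0, 1.0]
training_lines = [
    to_libsvm_line(+1, extract_features(high_trial)),
    to_libsvm_line(-1, extract_features(low_trial)),
]
```

Lines in this form can be written to a file and passed to LIBSVM's training tools; at run time the same feature extraction would be applied to each window of the incoming signal.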

The client receives a continuous stream of machine learning classifications in a string format. Each classification comes in the form of a colon-delimited string containing the first letter of the most probable prediction, its associated confidence value, and the confidence values of the other (potentially numerous) possibilities. Based on Afergan et al.’s [1] method of triggering adaptations from a moving window of the most recent confidence values, we store a running confidence of each classification over a user-defined period of time (typically 5–20 s). Less sensitive to erratic swings in classification, the sliding window provides a more conservative estimate of the user’s state, as a small number of misclassifications will not necessarily provoke incorrect adaptations, an important design principle to mitigate negative effects of BCIs [31].
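The sliding window can be sketched in Python as follows. The exact wire format is not given above, so the example assumes a hypothetical string such as `h:0.82:0.18`, where the first field is the predicted label and the remaining fields are confidences in a fixed label order; `parse_classification` and `RunningConfidence` are illustrative names, not part of Phylter.

```python
from collections import deque
from typing import Deque, Dict, Sequence, Tuple


def parse_classification(
    line: str, labels: Sequence[str] = ("h", "l")
) -> Tuple[str, Dict[str, float]]:
    """Parse an assumed 'h:0.82:0.18' string into (prediction, confidences)."""
    fields = line.strip().split(":")
    predicted = fields[0]
    values = [float(v) for v in fields[1:]]
    if len(values) != len(labels):
        raise ValueError(f"expected {len(labels)} confidences, got {len(values)}")
    return predicted, dict(zip(labels, values))


class RunningConfidence:
    """Mean confidence per label over the last `window_seconds` of input."""

    def __init__(self, window_seconds: float = 10.0,
                 labels: Sequence[str] = ("h", "l")) -> None:
        self.window_seconds = window_seconds
        self.labels = tuple(labels)
        self._samples: Deque[Tuple[float, Dict[str, float]]] = deque()

    def add(self, timestamp: float, line: str) -> None:
        _, confidences = parse_classification(line, self.labels)
        self._samples.append((timestamp, confidences))
        # Drop samples that have fallen out of the time window.
        while timestamp - self._samples[0][0] > self.window_seconds:
            self._samples.popleft()

    def mean(self) -> Dict[str, float]:
        if not self._samples:
            return {label: 0.0 for label in self.labels}
        count = len(self._samples)
        return {
            label: sum(conf[label] for _, conf in self._samples) / count
            for label in self.labels
        }
```

With a 10 s window, a single high-workload reading among several low-workload readings raises the running mean for `h` only modestly, which is the conservative behavior described above.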

Phylter accepts data from any number of physiological sensors without confusing the origin of any given stream, enabling sensor fusion, a popular method for merging data (or already processed predictions) from multiple devices in the hope of arriving at more accurate estimates of a particular state [14, 20]. The system is configurable to create complex rules using Boolean logic to combine the different sensor input (Fig. 3).
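One way to express such Boolean rules is as small composable predicates over the running confidences of each sensor. The Python sketch below is an illustration only: the sensor names (`fnirs`, `heart_rate`), the thresholds, and the helper functions are hypothetical and do not reflect Phylter's actual rule syntax.

```python
from typing import Callable, Dict

# Sensor name -> label -> running confidence for that label.
SensorState = Dict[str, Dict[str, float]]
Rule = Callable[[SensorState], bool]


def above(sensor: str, label: str, threshold: float) -> Rule:
    """True when the sensor's running confidence for `label` exceeds `threshold`."""
    return lambda state: state.get(sensor, {}).get(label, 0.0) > threshold


def all_of(*rules: Rule) -> Rule:
    """Boolean AND over rules."""
    return lambda state: all(rule(state) for rule in rules)


def any_of(*rules: Rule) -> Rule:
    """Boolean OR over rules."""
    return lambda state: any(rule(state) for rule in rules)


def negate(rule: Rule) -> Rule:
    """Boolean NOT of a rule."""
    return lambda state: not rule(state)


# Example rule: interruptible when fNIRS indicates low workload AND a second
# (hypothetical) heart-rate classifier does not indicate high workload.
interruptible = all_of(
    above("fnirs", "l", 0.6),
    negate(above("heart_rate", "h", 0.7)),
)
```

Keeping each stream under its own sensor name mirrors the requirement that the origin of a stream is never confused, while the combinators allow rules of arbitrary depth.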

Fig. 3. Phylter screenshot. The left panel displays a stream of physiological classifications (‘l’ for low workload, ‘h’ for high workload). The center panel shows a log of notifications, if the notifications were sent, and the running average of the user’s physiological state at the time. The right panel displays a log of the notifications sent to the wearable device.

3.2 Notification Input and Output

Phylter can process notifications from an email server, a messaging service, or a custom application as long as it adheres to a basic string or XML format and includes a marker at the beginning of the packet specifying the level of notification. It handles three levels of notifications: never send (only useful for archival or experimental purposes), always send for high-priority notifications, and adaptively send for physiological-based filtering. It displays and logs when it receives notifications so that whatever system utilizes the service knows whether or not the user has received a message.
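The three notification levels map naturally onto a small decision function. In the Python sketch below, the marker characters (`N`, `A`, `P`) and the `|` separator are assumptions made for illustration; the description above only states that a level marker appears at the beginning of the packet.

```python
from enum import Enum
from typing import Tuple


class Level(Enum):
    NEVER = "N"      # archived or logged only, never shown
    ALWAYS = "A"     # high priority, bypasses the filter
    ADAPTIVE = "P"   # shown only when the physiological filter allows it


def parse_notification(packet: str) -> Tuple[Level, str]:
    """Split an assumed '<marker>|<message>' packet into level and body."""
    marker, _, body = packet.partition("|")
    # Level(...) raises ValueError for an unknown marker.
    return Level(marker.strip().upper()), body.strip()


def should_send(level: Level, user_interruptible: bool) -> bool:
    """Decide delivery from the notification level and the user's state."""
    if level is Level.ALWAYS:
        return True
    if level is Level.NEVER:
        return False
    return user_interruptible
```

For example, `should_send(*[parse_notification("A|Fire alarm")[0], False])` is true regardless of the user's state, whereas an adaptive packet is delivered only when the user is judged interruptible.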

As a prototype of the wearable device that ultimately receives the message, we built a custom message handler for Google Glass that receives notifications and displays them for a set period of time before clearing the screen. If running as a background service, the application can turn on the screen to display the message, and then deactivate the screen once the notification ceases to be relevant. The message handler is built on a simple shell script that can be customized for the protocol of other wearable devices.

3.3 Server Architecture

The core functionality of Phylter relies on a client-server architecture. Phylter runs two concurrent threads to receive information over TCP/IP, opening separate ports for physiological input and notifications. It acts as the server in these connections so that it can handle multiple sources for each type of input. Every time Phylter receives a new physiological classification, it updates its running average of the physiological state by discarding the oldest classification and adding the new data point. When it receives a notification marked as adaptive, it checks the user’s physiological state and, if that state meets the threshold for interruptibility, sends the notification to the wearable device via an Android Debug Bridge (adb) communication channel triggered by a shell script.
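The two-listener design can be sketched with Python's standard `socketserver` module. Everything named here is hypothetical: `FilterState`, the newline-delimited protocol, the 0.5 threshold, and the `deliver` callback (which stands in for the shell script that forwards a message to the device). The choice to deliver when no physiological data has arrived yet is also an assumption.

```python
import socketserver
import threading
from collections import deque


class FilterState:
    """Shared state: the most recent `window` classifications."""

    def __init__(self, window=10, threshold=0.5, deliver=print):
        # A bounded deque discards the oldest entry when a new one is added.
        self._recent = deque(maxlen=window)
        self._lock = threading.Lock()
        self.threshold = threshold
        self.deliver = deliver

    def on_classification(self, label):
        with self._lock:
            self._recent.append(1.0 if label == "h" else 0.0)

    def high_workload_fraction(self):
        with self._lock:
            if not self._recent:
                return 0.0  # assumption: no data means no evidence of high workload
            return sum(self._recent) / len(self._recent)

    def on_adaptive_notification(self, message):
        """Deliver the message only if the running average is below threshold."""
        if self.high_workload_fraction() < self.threshold:
            self.deliver(message)
            return True
        return False


def make_handler(on_line):
    """Build a handler class that passes each received text line to `on_line`."""

    class LineHandler(socketserver.StreamRequestHandler):
        def handle(self):
            for raw in self.rfile:
                line = raw.decode("utf-8").strip()
                if line:
                    on_line(line)

    return LineHandler


def start_servers(state, host="127.0.0.1", physio_port=0, notify_port=0):
    """Start one listener thread per input type; port 0 lets the OS pick a port."""
    servers = []
    for port, callback in ((physio_port, state.on_classification),
                           (notify_port, state.on_adaptive_notification)):
        server = socketserver.ThreadingTCPServer((host, port), make_handler(callback))
        server.daemon_threads = True
        threading.Thread(target=server.serve_forever, daemon=True).start()
        servers.append(server)
    return servers
```

Because each listener is a threading server, several sensor clients or notification sources can connect at once, and the lock keeps the running average consistent across threads.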

3.4 Data Logs

Phylter records a detailed, timestamped log of its activity in plain text. It saves (in separate files) a record of all of the physiological input, as well as a list of notifications and what messages were ultimately sent to the user. This allows an operator to see the efficacy of a system and what information a user did and did not receive.
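A timestamped plain-text log of this kind takes only a few lines of Python. The tab-separated layout, the ISO 8601 timestamp, and the function and file names below are assumptions for illustration, not Phylter's actual log format.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def format_entry(event: str, now: Optional[datetime] = None) -> str:
    """Return one log line: ISO 8601 UTC timestamp, a tab, then the event text."""
    moment = now if now is not None else datetime.now(timezone.utc)
    return f"{moment.isoformat(timespec='milliseconds')}\t{event}\n"


def append_log(path: str, event: str, now: Optional[datetime] = None) -> str:
    """Append a timestamped entry to the plain-text log at `path`."""
    entry = format_entry(event, now)
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

Calling `append_log("physio.log", "h:0.82:0.18")` and `append_log("notifications.log", "SENT adaptive: New email")` would keep the two records in separate files, as described above.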

4 Discussion

As computing devices continue to battle for users’ attention, Phylter limits the less significant notifications that distract users from focusing on a single task. It serves as a framework to prevent information overload by modulating the display of notifications to the user. While our initial setup is designed for fNIRS brain data and Google Glass, it uses generic network protocols and a framework that can be extended to other input or output devices.

Phylter is composed of several self-contained systems which communicate with each other wirelessly. As these components shrink in size and their computational and power requirements fall, we envision that this system could become completely portable and run with commercial electronics in the near future. With future improvements to the system, Phylter could control not only the timing of notifications, but also the delivery mechanism, distributing notifications across multiple wearable devices or even between devices [26] to balance user-awareness with the cost of interruption.

This software is a step toward physiologically based notification filtering, and it rests on the premise that turning notifications on and off at the right moments can make a discernible difference. To assess the validity of this system, we plan to run a controlled laboratory experiment to see whether user performance does indeed improve when the user receives only pertinent notifications at opportune times.