
1 Introduction

Brain-computer interfaces (BCIs) create a new link between the brain and an external device, providing an alternative means of device control for people with severe paralysis [1–3]. The P300 is an endogenous component of the event-related potential (ERP) that reflects cognitive processes such as attention and working memory [4]. With some paradigms, P300 differences can be elicited without shifting gaze, because attention can be oriented separately from eye movement [5]. This is the reason that the classical speller is considered independent [6]. However, Treder and Blankertz (2010) tested this assumption by conducting experiments in both overt and covert attention conditions [7]. Brunner et al. (2010) also indicated that successful operation of the classical P300 speller depends substantially on gaze direction, and the accuracies of the subjects were also very low [8].

Acqualagna et al. introduced and evaluated a gaze-independent BCI based on rapid serial visual presentation (RSVP) with overt attention and feature attention conditions [9]. The RSVP speller is suitable for patients with deteriorated oculomotor control. Recently, Daly et al. reported that gaze-independent RSVP-based BCIs that used dummy face pictures as stimuli obtained high performance [10]. However, research on gaze-independent RSVP-based BCIs had not taken users’ fatigue seriously. Therefore, an improved paradigm that includes an assessment of subjective user factors was needed, one that could improve BCI performance and make users more relaxed. Itier et al. reported that face-specific effects were mediated by the eye region when faces were used as stimuli [11]. That is, compared with mouths and noses, the eyes play a more important role when human faces are used as stimuli in BCIs. So in terms of evoking distinct ERPs for BCIs, complete faces have no decisive advantage over face stimuli that show only the eye region [12].

In this paper, to further assess the performance of an online gaze-independent BCI system that uses the dummy face with eyes only approach, we explored two paradigms: the “colored circle pattern (CCP)” paradigm and the “dummy face with eyes only pattern (DFP)” paradigm. The two types of stimuli were colored circles and dummy faces showing only the eye region, respectively. We assessed online and offline performance, including subjective measures (via questionnaires) as well as objective results.

2 Materials and Methods

2.1 Subjects

Ten healthy subjects (8 male, aged 24–28 years, mean 26 ± 1.5) participated in this study, and are designated P1 … P10. All subjects signed a written consent form prior to this experiment and were paid for their participation.

2.2 Stimuli and Procedure

Before recording began, subjects were prepared for EEG recording. We recorded from 14 EEG electrode positions based on the extended International 10–20 system [13–15]. These electrodes were Cz, Pz, Oz, Fz, F3, F4, C3, C4, P3, P4, P7, P8, PO7 and PO8. The right mastoid electrode was used as the reference and the frontal electrode (FPz) was used as the ground. EEG signals were recorded with a g.USBamp and a g.EEGcap (Guger Technologies, Graz, Austria) with a sensitivity of 100 µV, and were band-pass filtered between 0.1 and 30 Hz [16–18].

After electrode preparation, we explained the task to the subjects. Their task was to silently count each time a specific “target” stimulus flashed. We explained that all stimuli would flash sequentially, always in the center of the monitor, and that subjects would sometimes have breaks during which the flashes stopped (see Fig. 2). During these breaks, subjects would be asked to report the number of flashes, and then the procedure would repeat with a new target (Fig. 1).

Fig. 1. One example of a flash in the DFP condition. The CCP condition was similar, but used colored circles without the cartoon eyes. (Color figure online)

This study had two conditions, which differed only in the stimuli used. One condition used the colored circle paradigm (CCP), while the other used the dummy face paradigm (DFP). Each condition used six different stimuli (see Fig. 2). The CCP condition used colored circles, while the DFP condition used cartoon faces. All twelve stimuli were the same size and brightness, and were created in Photoshop.

Fig. 2. The timing of each trial. DFP is the dummy face paradigm; CCP is the colored circle paradigm. One trial consists of 6 ‘image’ phases and 6 ‘interval’ phases, and one trial block for choosing one picture consists of several repetitions of the trial. The durations of the ‘image’ and ‘interval’ phases are 200 and 100 ms, respectively. The DFP and CCP conditions were identical in terms of timing. (Color figure online)

Throughout this paper, the term ‘flash’ refers to each brief stimulus presentation. Stimuli were flashed for 200 ms, with a 100 ms delay between flashes (see Fig. 2). Thus, each trial (meaning a sequence of six flashes) lasted 1.8 s. A trial block refers to a group of trials with the same target. During offline testing, there were 16 trials per trial block and each run consisted of 5 trial blocks, each of which used a different target. During online testing, the number of trials per trial block was adaptive, as described below.
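The timing arithmetic above can be checked with a short sketch. The durations and counts are taken from the text; the function and constant names are illustrative, and the totals cover only the flash sequences (they assume that the 3 s target cue and the breaks are excluded).

```python
# Timing constants from the text; names are illustrative, not from the study code.
IMAGE_MS = 200                 # each stimulus is shown for 200 ms
INTERVAL_MS = 100              # delay between consecutive flashes
FLASHES_PER_TRIAL = 6          # one flash per stimulus in a condition
OFFLINE_TRIALS_PER_BLOCK = 16  # fixed during offline testing
OFFLINE_BLOCKS_PER_RUN = 5     # one target per trial block


def trial_duration_ms():
    """Duration of one trial (a sequence of six flashes) in milliseconds."""
    return FLASHES_PER_TRIAL * (IMAGE_MS + INTERVAL_MS)


def block_duration_ms(trials):
    """Flash time of a trial block with the given number of trials."""
    return trials * trial_duration_ms()


def offline_run_flash_time_ms():
    """Total flash time of one offline run, excluding cues and breaks."""
    return OFFLINE_BLOCKS_PER_RUN * block_duration_ms(OFFLINE_TRIALS_PER_BLOCK)


if __name__ == "__main__":
    print(trial_duration_ms() / 1000, "s per trial")
    print(block_duration_ms(OFFLINE_TRIALS_PER_BLOCK) / 1000, "s per offline trial block")
    print(offline_run_flash_time_ms() / 1000, "s of flashes per offline run")
```

Under these assumptions, an offline trial block contains 28.8 s of flashes and an offline run contains 144 s.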

Prior to each trial block, one of the six stimuli used in that condition was presented in the center of the screen for 3 s, and subjects were told that this stimulus was the new target. Subjects had a 5 min break after each offline run. We first completed two offline runs, and then used the data to train two classifiers (one for each condition). Subjects then had a 3 min break, followed by an online experiment for one condition, then a 5 min break, and then an online experiment for the other condition. The order of the conditions was determined pseudorandomly. Subjects attempted to identify 24 targets during online testing.

2.3 Data Processing

The first 800 ms of EEG after each flash was used for feature extraction. A pre-stimulus interval of 100 ms was used for baseline correction of single trials. The raw feature matrix is 14 × 204 for each single flash, since there were 14 channels and 204 samples per epoch. No special signal pre-processing was applied beyond conventional filtering with a third-order Butterworth band-pass filter (1–30 Hz). Raw features from each channel were down-sampled from 256 Hz to 64 Hz by selecting every fourth sample from the filtered EEG. The size of the resulting feature matrix was 14 × 51 (14 channels by 51 time points). Classification relied on Bayesian linear discriminant analysis (BLDA). BLDA is an extension of Fisher’s linear discriminant analysis (FLDA) that avoids overfitting to possibly noisy datasets. The details of the algorithm can be found in [15]. The online runs used an adaptive strategy with a variable number of trials per average [18].
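The epoching steps described above can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes the EEG has already been band-pass filtered, that each channel is a plain list of samples at 256 Hz, and that the flash onset is given as a sample index. The function name `extract_features` and its arguments are hypothetical.

```python
FS_HZ = 256        # sampling rate stated in the text
EPOCH_MS = 800     # post-stimulus window used for features
BASELINE_MS = 100  # pre-stimulus interval used for baseline correction
DECIMATION = 4     # keep every fourth sample (256 Hz -> 64 Hz)


def extract_features(eeg, onset):
    """Return a channels x time-points feature matrix for one flash.

    eeg   -- list of channels, each a list of already filtered samples
    onset -- sample index of the flash (must leave room for the baseline)
    """
    n_epoch = int(EPOCH_MS * FS_HZ / 1000)    # 204 samples
    n_base = int(BASELINE_MS * FS_HZ / 1000)  # 25 samples
    features = []
    for channel in eeg:
        baseline = channel[onset - n_base:onset]
        offset = sum(baseline) / len(baseline)
        # Subtract the pre-stimulus mean, then keep every fourth sample.
        epoch = [x - offset for x in channel[onset:onset + n_epoch]]
        features.append(epoch[::DECIMATION])
    return features


if __name__ == "__main__":
    # Synthetic data only: 14 channels of a constant signal.
    demo = [[5.0] * 400 for _ in range(14)]
    feats = extract_features(demo, onset=50)
    print(len(feats), "channels x", len(feats[0]), "time points")
```

With 14 channels this yields the 14 × 51 matrix given in the text, since 204 samples decimated by four leave 51 time points.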

2.4 Subjective Report

After completing the last online run, each subject was asked two questions (in Chinese) about each of the two conditions. These questions could be answered on a 1–5 scale indicating strong disagreement, moderate disagreement, neutrality, moderate agreement, or strong agreement. The two questions were:

  1. Did this paradigm make you tired?

  2. Was this paradigm hard?

3 Results

3.1 Objective Results – ERPs

Figure 3 shows the amplitude differences for the target vs. non-target comparison for each of the ten subjects across the three peaks studied in both conditions. These differences are presented at the electrode sites selected for statistical analysis. Electrode site P7 was selected for the N200 analysis. Electrode site Pz was selected for the P300 analysis because the P300 is commonly largest at Pz. Electrode site Cz was selected for the N400 analysis because it typically shows the largest N400. The peak values of the difference ERPs (target minus non-target) within specific time windows were compared between the two conditions over all subjects using paired samples t-tests. The time windows were: N200 (150 to 280 ms), P300 (280 to 450 ms), and N400 (450 to 710 ms). At site P7, N200 amplitude (target minus non-target) showed no statistically significant difference between the two conditions (t = 0.62, p = 0.54). At site Pz, P300 amplitude (target minus non-target) was significantly larger (more positive) for DFP compared to CCP (t = 2.27, p = 0.04). At site Cz, N400 amplitude (target minus non-target) was significantly larger (more negative) for DFP compared to CCP (t = 2.29, p = 0.03).

Fig. 3. Upper panel: The amplitude difference of the N200 between target and non-target ERP amplitudes at electrode P7 across 10 subjects (µV); Middle panel: The amplitude difference of the P300 between target and non-target ERP amplitudes at electrode Pz across 10 subjects (µV); Lower panel: The amplitude difference of the N400 between target and non-target ERP amplitudes at electrode Cz across 10 subjects (µV). (Color figure online)
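The peak comparison described in this section can be sketched as below. The time windows are those given in the text; everything else is an assumption made for illustration: the difference ERP is taken to start at stimulus onset and to be sampled at 256 Hz, the peak is taken as the most positive value for the P300 and the most negative value for the N200 and N400, and the helper names are hypothetical. Only the t statistic and the degrees of freedom are computed; the p-value would come from a t distribution table or a statistics package.

```python
import math
from statistics import mean, stdev

# Time windows from the text, in milliseconds after stimulus onset.
WINDOWS_MS = {"N200": (150, 280), "P300": (280, 450), "N400": (450, 710)}


def peak_in_window(diff_erp, component, fs_hz=256):
    """Peak of a target-minus-non-target ERP within the component's window."""
    start_ms, end_ms = WINDOWS_MS[component]
    lo = int(start_ms * fs_hz / 1000)
    hi = int(end_ms * fs_hz / 1000)
    segment = diff_erp[lo:hi + 1]
    # Assumption: positive components peak at the maximum, negative at the minimum.
    return max(segment) if component.startswith("P") else min(segment)


def paired_t(a, b):
    """Paired samples t statistic and degrees of freedom for two conditions."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from this study.
    dfp = [4.1, 3.2, 5.0, 2.8]
    ccp = [3.0, 3.1, 3.9, 2.0]
    t, df = paired_t(dfp, ccp)
    print(f"t({df}) = {t:.2f}")
```

With ten subjects, each test has nine degrees of freedom.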

Table 1 shows online classification accuracy, information transfer rate, and number of trials per average for each condition. Paired samples t-tests were used to examine differences between the DFP and CCP. Classification accuracy and information transfer rate were significantly higher for the DFP condition than the CCP condition (t = 2.3, p = 0.03 and t = 3.4, p = 0.003 respectively). The DFP condition required significantly fewer trials per average than the CCP condition (t = 2.4, p = 0.04).

Table 1. Classification accuracy, raw bit rate and number of trials based on averaged data from online experiments. In this table, ‘Acc’ refers to classification accuracy, ‘RBR’ refers to raw bit rate, measured in bits/min, and ‘NT’ refers to number of trials. ‘DFP’ denotes the dummy face with eyes only region pattern and ‘CCP’ denotes colored circle pattern.
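The text does not state how the raw bit rate was computed. A common choice in the BCI literature is the Wolpaw formula, which gives the bits per selection as log2(N) + P·log2(P) + (1 − P)·log2((1 − P)/(N − 1)) for N possible targets and accuracy P. The sketch below assumes that formula, N = 6 stimuli, and a selection time equal to the number of trials times 1.8 s with no pauses between selections; the function names are illustrative.

```python
import math


def bits_per_selection(accuracy, n_choices=6):
    """Wolpaw bits per selection (assumed formula, not confirmed by the text)."""
    p, n = accuracy, n_choices
    bits = math.log2(n)
    if p > 0:
        bits += p * math.log2(p)
    if p < 1:
        bits += (1 - p) * math.log2((1 - p) / (n - 1))
    return bits


def raw_bit_rate(accuracy, trials_per_selection, n_choices=6, trial_s=1.8):
    """Bits per minute, counting only the flash time of each selection."""
    selection_min = trials_per_selection * trial_s / 60.0
    return bits_per_selection(accuracy, n_choices) / selection_min


if __name__ == "__main__":
    # Hypothetical operating point, not a result from this study.
    print(round(raw_bit_rate(accuracy=0.9, trials_per_selection=2.5), 1), "bits/min")
```

Under this formula, the bit rate rises with accuracy and falls with the number of trials per average, so a condition that is both more accurate and needs fewer trials gains on both counts.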

3.2 Objective Results – BCI Performance

Figure 4 shows the mean classification accuracies based on offline data from each subject using 15-fold cross-validation. The mean classification accuracy was calculated based on single-trial (not averaged) data. This analysis showed that classification accuracy was higher for the DFP than for the CCP (t = 3.86, p < 0.001). The overall mean classification accuracies of all subjects based on single-trial data were 73.5 % ± 10 % (DFP) and 51.6 % ± 12 % (CCP).

Fig. 4. Mean single-trial classification accuracies across 10 subjects. (Color figure online)
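The 15-fold cross-validation used for the offline accuracies can be sketched generically as follows. The text does not say how the folds were formed, so this sketch assumes contiguous, unshuffled folds of near-equal size; `train_fn` and `predict_fn` are hypothetical placeholders for any classifier (for example BLDA), and the function names are illustrative.

```python
def k_fold_indices(n_samples, k=15):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validated_accuracy(features, labels, train_fn, predict_fn, k=15):
    """Fraction of samples classified correctly when each fold is held out once."""
    correct = 0
    for train, test in k_fold_indices(len(labels), k):
        model = train_fn([features[i] for i in train], [labels[i] for i in train])
        correct += sum(predict_fn(model, features[i]) == labels[i] for i in test)
    return correct / len(labels)


if __name__ == "__main__":
    # Toy data and a trivial threshold rule, for illustration only.
    xs = [0.1, 0.9] * 15
    ys = [0, 1] * 15
    acc = cross_validated_accuracy(
        xs, ys,
        train_fn=lambda f, l: 0.5,                 # "model" is a fixed threshold
        predict_fn=lambda thr, x: int(x > thr),
    )
    print(acc)
```

Each sample is used for testing exactly once, so the reported accuracy never comes from data the classifier was trained on.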

3.3 Subjective Results – Questionnaire Replies

Table 2 shows subjects’ responses to the two questions. A paired-samples t-test was used to examine mean differences between the DFP and CCP. Subjects reported lower fatigue in the DFP condition than in the CCP condition (t = 4.98, p < 0.001). Subjects also reported that the DFP condition was less difficult than the CCP condition (t = 4.12, p < 0.001).

Table 2. Fatigue level and difficulty level of the two conditions, based on subjective report. In this table, ‘FL’ refers to fatigue level, and ‘DL’ refers to difficulty level. ‘DFP’ denotes the dummy face pattern, and ‘CCP’ denotes the circle pattern.

4 Discussions

For gaze-independent BCIs based on RSVP, users’ subjective reports had not been fully considered. Therefore, we assessed users’ fatigue and perceived task difficulty, as well as objective performance. Itier et al. showed that the large event-related potentials elicited by faces are mediated by the eye region [11]. That is, the eyes may be central to what makes faces so special [11]. We suggested that the dummy face with eyes only was better suited than a complete dummy face for a gaze-independent BCI based on RSVP, since the dummy face with eyes only approach could be simpler and less distracting. However, we had not explored this approach online.

The offline results showed that, relative to CCP, DFP elicited a larger P300 (target minus non-target, p < 0.05) at Pz and a larger N400 (target minus non-target, p < 0.05) at Cz. DFP had no significant advantage over CCP in terms of the N200 (target minus non-target, p > 0.05) at P7. DFP also had an advantage over CCP in terms of offline classification accuracy (p < 0.05). The online results showed that classification accuracy (p < 0.05) and information transfer rate (p < 0.05) were higher for DFP than for CCP. We also examined the number of trials required per selection in the online condition, and DFP required fewer trials per average (p < 0.05).

The main goal of this study was to identify the key characteristics of stimuli that can elicit large ERP differences for a BCI. However, the results are also interesting in their implications for human face processing. The two paradigms are not dramatically different in terms of the images presented, differing only in the inclusion of cartoon eyes in the DFP condition. Nonetheless, this small difference was enough to elicit significantly different ERP activity. Hence, this study supports the studies cited above, and numerous others, which show that face processing is a very important component of human cognition. The results also show that people are very good at recognizing abstract representations of eyes. We did not assess more realistic images of eyes (such as photographs), but even cartoon images such as those presented here are immediately and effortlessly recognizable.

5 Conclusions

An online gaze-independent BCI system with the “dummy face with eyes only” paradigm was presented in this paper and compared to a “colored circle” paradigm. The online results demonstrate that dummy faces are viable stimuli for a gaze-independent BCI based on RSVP. In addition to outperforming the colored circle paradigm in terms of objective measures, the dummy face approach was also superior in terms of subjective report.