1 Introduction

Eye movement is an emerging modality in human-computer interfaces. The wider availability of eye trackers has made it more popular, and attempts have been made to use gaze for multi-user, gaze-based interaction. The first problem that must be overcome when working with eye movements is the correct mapping from the output of an eye tracker to a gaze point. The accuracy of gaze point estimation is one of the main limiting factors in developing applications that utilize gaze input. This problem is especially pronounced with head-mounted eye trackers, where users may observe the screen from a non-centered position. In such setups a new type of error is introduced. This error is partially generated by screen-tracking algorithms that use IR diodes or QR codes and a head-mounted world camera to locate the screen in the user's field of view. Currently available systems provide high-quality algorithms and procedures for the internal eye-tracking calibration. They do not, however, allow for the estimation of the error introduced by mobile eye trackers and their screen position detection algorithms.

2 Background and Justification

Counteracting various aspects of calibration imprecision is an ongoing effort of the eye-tracking community. Efforts have been undertaken with respect to stationary eye trackers [1]. Attempts to deal with the calibration issue in mobile eye tracking have also been made [2,3,4,5], yet the problem of calibrating an eye tracker in a multi-user setup has hardly been addressed, especially for multiple eye trackers [3, 6]. Our method focuses solely on the correction of the constant offset between a perceived target and the gaze position reported by an eye tracker and does not address the issue of offset drift [5,6,7]. However, the proposed procedure does not require any assumptions concerning the real position of the targets perceived by the user after the calibration is completed [4]. As opposed to offline correction algorithms [8, 9], our method enables online facilitation of gaze-based human-computer interaction [4, 5, 10]. The proposed simple vector design enables the use of fast linear algebra libraries implemented in low-level programming languages. Thus, the solution is ready for efficient real-time operation even on high-resolution eye-tracking data.

3 Method

We propose a new method of improving gaze-based human-computer interaction in multiple-user scenarios, consisting of: (a) a procedure to estimate the error introduced by screen-tracking algorithms and a non-central user position (surface recalibration) and (b) the use of the obtained error data to transform the eye-tracking data in real time (data transformation).

The main part of the surface recalibration procedure is to gather gaze data for a series of targets presented on the screen. Each target point is shown for 3 s, during which the gaze position data is recorded. From these data the recalibration errors necessary for real-time processing are calculated. Each incoming gaze data point is then subjected to a two-step online transformation, described in the remainder of this section.
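A minimal sketch of the recalibration (error-estimation) phase is given below, assuming that the recalibration error for each point is the mean offset between the gaze samples recorded during its 3 s presentation and the known target position; the function and variable names are illustrative and not taken from the original implementation.

```python
import numpy as np

def estimate_errors(targets, gaze_samples):
    """Estimate per-point recalibration errors.

    targets      -- list of (x, y) screen positions of the n recalibration points
    gaze_samples -- gaze_samples[j] is an array of (x, y) gaze samples recorded
                    while target j was displayed for 3 s
    """
    c = np.asarray(targets, dtype=float)                        # shape (n, 2)
    mean_gaze = np.array([np.mean(s, axis=0) for s in gaze_samples])
    e = mean_gaze - c                                           # per-point offsets, shape (n, 2)
    # transpose so that row i holds the i-th coordinate of every point,
    # matching the per-dimension coordinate and error vectors defined below
    return c.T, e.T
```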

In the first step of the transformation, a vector of weights of length \( n \) (where \( n \) equals the number of recalibration points) is calculated for each of the surface dimensions \( (X, Y) \). Let \( \mathbf{c}_i \) represent a vector of the first \( (i = 1) \) or second \( (i = 2) \) coordinates of the \( n \) recalibration points and let \( \mathbf{e}_i \) represent a vector of the \( n \) recalibration errors with respect to the recalibration points:

$$ \mathbf{c}_i = \left( c_{i1} \; c_{i2} \; \ldots \; c_{in} \right) , \tag{1} $$
$$ \mathbf{e}_i = \left( e_{i1} \; e_{i2} \; \ldots \; e_{in} \right) . \tag{2} $$

Let \( g_{ti} \) denote the \( i \)-th coordinate of the gaze position at time point \( t \) and let \( \mathbf{p}_{ti} \) represent a vector of the reciprocals of the absolute values of the one-dimensional distances between the gaze position and the recalibration points at time point \( t \):

$$ \mathbf{p}_{ti} = \left( p_{ti1} \; p_{ti2} \; \ldots \; p_{tin} \right) , \tag{3} $$
$$ p_{tij} = \frac{1}{\left| g_{ti} - c_{ij} \right|} . \tag{4} $$

Next, the vector \( \mathbf{w}_{ti} \), comprising \( n \) weights, one for each recalibration point, is calculated based on the vector \( \mathbf{p}_{ti} \):

$$ \mathbf{w}_{ti} = \left( w_{ti1} \; w_{ti2} \; \ldots \; w_{tin} \right) , \tag{5} $$
$$ w_{tij} = \frac{p_{tij}}{\sum_{l = 1}^{n} p_{til}} . \tag{6} $$

If there exists an index \( k \) for which \( g_{ti} - c_{ik} = 0 \), then:

$$ w_{tij} = \begin{cases} 1, & j = k \\ 0, & j \ne k \end{cases} . \tag{7} $$

In the second step, the gaze position is corrected by the sum of the \( n \) weighted recalibration errors. This operation is equivalent to calculating the rounded difference between the gaze position coordinate and the dot product of two vectors: the vector of consecutive recalibration errors and the vector of weights obtained in the previous step:

$$ r_{ti} = \left\lfloor g_{ti} - \mathbf{e}_i \cdot \mathbf{w}_{ti} \right\rceil . \tag{8} $$
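The sketch below illustrates how this two-step transformation (Eqs. 3-8) could be implemented with NumPy for a single gaze sample. It is a minimal illustration of the formulas above under our own variable names, not the authors' implementation.

```python
import numpy as np

def correct_gaze(g_t, c, e):
    """Correct one raw gaze sample g_t = (x, y).

    c[i] -- the n recalibration-point coordinates for dimension i (Eq. 1)
    e[i] -- the n recalibration errors for dimension i            (Eq. 2)
    """
    r_t = []
    for i in range(2):                            # i = 0 (X), i = 1 (Y)
        d = g_t[i] - np.asarray(c[i], dtype=float)
        if np.any(d == 0):                        # gaze exactly on a recalibration point (Eq. 7)
            w = (d == 0).astype(float)
            w /= w.sum()
        else:
            p = 1.0 / np.abs(d)                   # reciprocal one-dimensional distances (Eqs. 3-4)
            w = p / p.sum()                       # normalised weights (Eqs. 5-6)
        # rounded difference between gaze coordinate and weighted errors (Eq. 8)
        r_t.append(int(round(g_t[i] - np.asarray(e[i], dtype=float) @ w)))
    return tuple(r_t)

# Illustrative use with three recalibration points (coordinates and errors in px):
c = [[100.0, 960.0, 1820.0], [100.0, 600.0, 1100.0]]   # X and Y coordinates
e = [[ 12.0,   5.0,   -8.0], [ -4.0,   9.0,    6.0]]   # X and Y errors
print(correct_gaze((640, 512), c, e))
```

Because the correction reduces to a dot product per dimension, it can be applied to every incoming sample in real time, as noted in Sect. 2.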

4 Empirical Pilot Study

In order to test the developed method we conducted an empirical study comparing the effectiveness of gaze-based interaction with and without the recalibration function for a simple pointing task. We hypothesized that the effectiveness of gaze-based interaction with the recalibration function would be higher, i.e. that the time spent on the task would be lower for trials completed with recalibration.

4.1 Apparatus

The experiment was conducted using the Pupil [11] mobile eye tracker running at a frame rate of 60 Hz. The experiment started with the calibration of the eye tracker using a 9-point calibration procedure. After that, the recalibration procedure was executed using 3 points placed in the top-left corner, the center and the bottom-right corner of the screen.

4.2 Experimental Task Description

During the experiment participants pointed to a series of targets appearing on the screen. Targets were presented on a 24″ monitor with a resolution of 1920 × 1200 px, positioned 60 cm in front of the user. Markers (5 × 5, similar to QR codes) were placed in the corners of the screen to facilitate the automatic screen recognition algorithms of the Pupil Capture software. Circular targets (60 px in diameter) were placed on the perimeter of a circle (580 px in diameter) centered on the screen. Each circle consisted of 11 targets. In order to complete a target (highlighted in red), the cursor had to stay on the target for 0.2 s. After that, the next target, placed on the opposite side of the circle, was highlighted. When one circle of targets was completed, i.e. all targets were pointed at successfully, the next circle of targets appeared. Five circles of targets were presented for each cursor control method (Fig. 1).

Fig. 1. Example of one experimental trial.
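A minimal sketch of the target layout described above is given below; the exact visiting order is an assumption consistent with the description (each next target lies roughly opposite the current one), and all names are illustrative.

```python
import math

N_TARGETS, CIRCLE_D, SCREEN_W, SCREEN_H = 11, 580, 1920, 1200
cx, cy, r = SCREEN_W / 2, SCREEN_H / 2, CIRCLE_D / 2

# 11 targets evenly spaced on the perimeter of a 580 px circle centred on the screen
positions = [(cx + r * math.cos(2 * math.pi * k / N_TARGETS),
              cy + r * math.sin(2 * math.pi * k / N_TARGETS))
             for k in range(N_TARGETS)]

# visiting order: each step jumps roughly half-way around the circle,
# so the next highlighted target lies on the opposite side of the previous one
order = [(k * (N_TARGETS // 2 + 1)) % N_TARGETS for k in range(N_TARGETS)]
```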

4.3 Cursor Control Method

The movement of the cursor was controlled using two variants of gaze control:

  • with the recalibration function - all incoming data was transformed in real time using the error estimates obtained during the recalibration procedure,

  • without the recalibration function - all incoming data was used raw, without any transformation.

4.4 Reaction Time (Target Acquisition Time)

For each trial we measured the reaction time, that is, the time between the appearance of the target and the moment when it was pointed at with the cursor.

5 Results

During the experimental task we gathered data for 660 trials (330 each with and without recalibration). In the first step the data was cleaned according to the typical procedure for reaction time data [12]: reaction times longer than 2 standard deviations from the mean were excluded from the analyses. This left 313 observations in the recalibration group and 310 observations in the non-recalibration group. We then compared the means of the two groups using a t-test. The analysis showed that the trial completion time was significantly lower (t(621) = 4.18, p = 0.0001) for trials completed with recalibration (M = 1.27, SD = 0.82) than for trials completed without it (M = 1.52, SD = 0.65) (see Fig. 2).
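A minimal sketch of this analysis is shown below, assuming SciPy and per-condition arrays of reaction times; the trimming rule and the independent-samples t-test follow the description above, while the function names are illustrative.

```python
import numpy as np
from scipy import stats

def trim_long_rts(rt, k=2.0):
    """Drop reaction times more than k standard deviations above the mean."""
    rt = np.asarray(rt, dtype=float)
    return rt[rt <= rt.mean() + k * rt.std()]

def compare_conditions(rt_with_recal, rt_without_recal):
    """Independent-samples t-test on the trimmed reaction times."""
    a, b = trim_long_rts(rt_with_recal), trim_long_rts(rt_without_recal)
    return stats.ttest_ind(a, b)
```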

Fig. 2. Effect of recalibration on the target acquisition time. Whiskers represent the boundaries of the confidence interval for p < 0.05.

6 Discussion

In this paper we presented a novel method of real-time transformation of eye-tracking data which may improve gaze-based interaction, especially for mobile eye trackers and in multi-user scenarios. The initial feasibility of the method was confirmed in the pilot study. The results of this study (shorter target acquisition times for tasks with recalibration enabled) indicate that the tested method may effectively facilitate gaze-based human-computer interaction. However, further testing is necessary in order to determine the optimal parameters of the recalibration procedure. The areas that need further investigation include, e.g., the influence of the number of recalibration points used in the error estimation phase, central vs. non-central user position, and simultaneous registration of multiple users.

In future studies we plan to test this new recalibration method with stationary eye trackers, as it can be an effective way of facilitating gaze-based interaction by counteracting calibration errors that would traditionally render a gaze-based system unusable.