Abstract
Examining user reactions via the unobtrusive method of eye tracking is becoming increasingly popular in user experience studies. A major focus of this type of research is accurately capturing user attention to stimuli, which is typically established by translating raw eye movement signals into fixations, that is, ocular events characterized by relatively stable gaze over a specific stimulus. Grounded in the argument that inner-density of gaze points within a fixation represents focused attention, a recent study has developed the fixation-inner-density (FID) methodology, which identifies fixations based on the compactness of individual gaze points. In this study we compare the FID filter with a widely used method of fixation identification, namely the I-VT filter. To do so we use a set of measures that investigate the distribution of gaze points at a micro-level, that is, the patterns of individual gaze points within each fixation. Our results show that in general fixations identified by the FID filter are significantly denser and more compact around their fixation center. They are also more likely to have randomly distributed gaze points within the square box that spatially bounds a fixation. Our results also show that fixation duration is significantly different between the two methods. Because fixation is a major unit of analysis in behavioral studies and fixation duration is a major representation of the intensity of attention, awareness, and effort, our results suggest that the FID filter is likely to increase the sensitivity of such eye tracking investigations into behavior.
1 Introduction
The study of eye movements in user experience research is becoming increasingly popular because eye tracking technology enables capturing the focus of a person’s gaze on a visual display at any given time. Human gaze serves as a reliable indicator of attention because it represents effort in maintaining the eyes relatively steady to take foveal snapshots of an object for subsequent processing by the brain [1]. Hence, extracting relatively stable gaze points that are close in both space and time, that is, translating the raw gaze data into fixations, is essential in many eye tracking studies [2, 3]. One primary method for identifying fixations in a stream of raw eye movement data is the Velocity-Threshold Identification (I-VT) algorithm. The I-VT filter uses a fixed velocity threshold to decide whether an individual gaze point qualifies as a fixation point or a saccade point.
Because a fixation is a collection of gaze points that are near to one another in both time and space, a denser collection of gaze points within a fixation represents a higher level of focused attention, and thus a higher level of cognitive processing [4]. Accordingly, a recent study [5] proposes a new way to group gaze points into fixations based on their inner-density property. Similar to the I-VT filter, this new Fixation Inner-Density (FID) filter first uses a velocity threshold to identify a candidate set of gaze points that are slow enough to form a fixation. It then uses optimization-based techniques to identify the densest fixation of gaze points among all candidate points. Identifying fixations using the FID filter naturally eliminates those gaze points that lie close to the tolerance settings. How gaze points are dispersed in a fixation affects fixation metrics such as the duration and center location, and there is evidence that the FID filter reduces the possibility of skewing these metrics [5].
In this paper we translate raw gaze data into fixations using the I-VT and FID filters. We demonstrate that fixations processed by the FID filter are superior to those processed by the I-VT filter in terms of three key fixation micro-patterns. First, they are denser. Second, the extent to which points are dispersed within a fixation is smaller. Third, the points within a fixation are more likely to be uniformly distributed. This investigation is important because the compactness and the patterns of distribution of gaze points can directly affect fixation metrics, such as fixation duration and fixation center position, that are commonly used in eye-tracking studies to assess viewing behavior. This study is the first to investigate such fixation micro-patterns, that is, properties of the distribution of gaze points within an individual fixation.
2 Background
Raw gaze data is a sequence of \( \left( {x,\,y,\,t} \right) \) triplets, where \( \left( {x,\,y} \right) \) represents the measured location of user gaze, and \( t \) is the time stamp. Common sampling rates in eye trackers range from 30 Hz to 1,000 Hz, and a gaze sequence can easily contain tens of thousands of triplets. Gaze data is often categorized into two common types: fixations and saccades. Fixations are pauses over informative regions during eye movement; in gaze data, a fixation is where gaze point triplets aggregate together. Fixation identification methods cluster these densely grouped gaze points into fixations to represent focused attention and cognitive effort in eye tracking research [4].
One popular fixation identification algorithm is the I-VT filter. It identifies fixations by gaze point velocity. If the velocity exceeds the predefined threshold \( V \), the corresponding gaze point is identified as a saccade point; otherwise it is categorized as a fixation point. The I-VT filter is efficient and practical; however, it has the drawback of ignoring the information about the spatial arrangement of individual gaze points within a distinct fixation. Some fixation metrics can express the distribution of points within a fixation. One such metric is fixation inner-density, which was introduced by [4] and further refined in [5]. Fixation inner-density represents user focus, and [4] has validated that fixation inner-density is correlated with normalized fixation duration and average pupil dilation variation during fixation. The FID filter uses optimization-based techniques to optimize for inner-density, which means that it selects a set of candidate gaze points that guarantees there is no better set with respect to the objective function of maximizing fixation inner-density. Fixation inner-density improves upon previous fixation identification methods because it combines both the temporal and the spatial aspects of the fixation into a single metric that evaluates the compactness of a fixation.
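The velocity-threshold rule described above can be illustrated with a minimal sketch. It assumes that gaze coordinates are already expressed in degrees of visual angle and time stamps in seconds, and it uses a simple point-to-point velocity; the function name `classify_ivt` is hypothetical, and commercial I-VT implementations include further steps (such as noise reduction) that are not modeled here.

```python
import math

def classify_ivt(samples, velocity_threshold=30.0):
    """Label each gaze sample as 'fixation' or 'saccade' by its velocity.

    samples: list of (x, y, t) triplets; x and y are assumed to be in
    degrees of visual angle and t in seconds.
    velocity_threshold: threshold V in degrees per second.
    """
    if len(samples) < 2:
        return ["fixation"] * len(samples)
    velocities = []
    for (x0, y0, t0), (x1, y1, t1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("time stamps must be strictly increasing")
        # Point-to-point angular velocity between consecutive samples.
        velocities.append(math.hypot(x1 - x0, y1 - y0) / dt)
    # The first sample has no predecessor, so it reuses the first velocity.
    velocities.insert(0, velocities[0])
    return ["saccade" if v > velocity_threshold else "fixation"
            for v in velocities]
```

With a 300 Hz recording, a jump of several degrees between two consecutive samples exceeds 30°/s by a wide margin and is labeled a saccade point, whereas small jitter around a location stays below the threshold.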
As the problem of fixation identification is a type of time-series clustering, it shares the commonality that interpreting clustering results is somewhat subjective in nature. Hence, the choice of an appropriate metric will directly affect the formation of the clusters. While density and dispersion properties can be measured in various ways, they are inherently positively related to the number of gaze points in a fixation, and negatively related to the area occupied by the constituent points. We next discuss some important metrics to evaluate density and dispersion properties within fixations.
3 Methodology
We consider two representative ways of measuring fixation inner-density, both of which are advocated in [5]. Suppose a fixation identification algorithm locates fixations in a gaze data sequence with \( T \) gaze points. For any given fixation \( f \), let \( n_{f} \) denote the count of points inside \( f \), and let \( i \), \( j \) be any two points in \( f \). We denote the Euclidean distance between \( i \) and \( j \) as \( d_{ij} \), the area of the minimum-area square box that spatially bounds the fixation as \( A_{sq} \), and the area of the minimum-area rectangular box that spatially bounds the fixation as \( A_{rt} \). The first density metric (\( D_{1} \)) is the average pairwise distance between points within a fixation.
The second density metric (\( D_{2} \)) is the area of the minimum-area square bounding box surrounding the fixation divided by the number of fixation points it contains:
\( D_{2} = \frac{A_{sq}}{n_{f}} \)
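As a rough illustration of the two density metrics, the sketch below computes both for a list of \( (x, y) \) points. Two assumptions are made that the text does not spell out: the average in \( D_{1} \) is taken over all unordered pairs of points, and the bounding square is axis-aligned, so its side equals the larger of the horizontal and vertical extents. The function names are hypothetical.

```python
import math
from itertools import combinations

def density_d1(points):
    """Average Euclidean distance over all unordered pairs of points."""
    pairs = list(combinations(points, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def density_d2(points):
    """Area of the smallest axis-aligned bounding square, per point."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # The square must cover the larger of the two extents.
    side = max(max(xs) - min(xs), max(ys) - min(ys))
    return side * side / len(points)
```

For the three points (0, 0), (3, 0) and (0, 4), the pairwise distances are 3, 4 and 5, so `density_d1` gives 4, and the bounding square has side 4, so `density_d2` gives 16/3.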
For both the \( D_{1} \) and \( D_{2} \) density metrics, small values imply greater density. A third metric, Standard Distance (\( SD \)), measures the dispersion of gaze points around the fixation center. \( SD \) is a common metric in the Geographic Information System (GIS) literature that evaluates how points are distributed around a center [6]. Similar to standard deviation, \( SD \) quantifies the dispersion of a set of data values. Hence, the \( SD \) score is a summary statistic representing the compactness of point distribution. Smaller \( SD \) values correspond to gaze points that are more concentrated around the center \( \left( {\overline{X}_{f} ,\, \overline{Y}_{f} } \right) \) of fixation \( f \), expressed as (Fig. 1):
\( \overline{X}_{f} = \frac{\sum_{i \in f} x_{i}}{n_{f}}, \quad \overline{Y}_{f} = \frac{\sum_{i \in f} y_{i}}{n_{f}} \)
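The standard distance of fixation \( f \), \( SD_{f} \), is:
\( SD_{f} = \sqrt{ \frac{\sum_{i \in f} \left( x_{i} - \overline{X}_{f} \right)^{2}}{n_{f}} + \frac{\sum_{i \in f} \left( y_{i} - \overline{Y}_{f} \right)^{2}}{n_{f}} } \)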
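A minimal sketch of the standard distance computation follows, using the usual GIS definition (mean center, then the root of the mean squared distance to that center). The function names are illustrative only.

```python
import math

def mean_center(points):
    """Return the mean center (X_bar, Y_bar) of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def standard_distance(points):
    """Root of the mean squared distance of the points to their mean center."""
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)
```

For the four corners of a 2 × 2 square, the mean center is (1, 1) and every point lies at squared distance 2 from it, so the standard distance is the square root of 2.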
Spatial pattern analysis can also be used to measure the distribution pattern of gaze points within a fixation. The Average Nearest Neighbor \( \left( {ANN} \right) \) ratio [6] measures the degree to which fixation gaze points are clustered, versus randomly distributed, within a fixation bounding area. A fixation resulting from focused gaze toward a single area of interest would tend to exhibit a more uniformly distributed pattern, with greater \( ANN \) values. The \( ANN \) ratio is calculated as the average distance between each point and its nearest neighbor, divided by the expected average distance between points if a random pattern is assumed. \( ANN \) values greater than one imply that the fixation gaze points are dispersed; as this ratio decreases, fixation gaze points increasingly exhibit clustering (Fig. 2).
The four metrics \( D_{1} , \,D_{2} \), \( SD \) and \( ANN \) will be used to evaluate three aspects of inner fixation patterns: fixation inner-density, the dispersion of fixation points, and their distribution. We expect fixations identified with the FID filter to be denser and more uniformly distributed than those identified with the I-VT filter. Our density assertion, which stems from the method of fixation identification, helps to test whether the FID filter does indeed more accurately group individual gaze points into focused attention. Our assertion that gaze points identified with the FID filter are more randomly distributed stems from the argument that if a fixation is compact, that is, it has high inner-density, it is more likely to have a more uniform distribution around its center.
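The ratio can be sketched as follows. The expected mean nearest-neighbor distance under a random pattern is taken here as \( 0.5 / \sqrt{n / A} \), the form commonly used in GIS software, where \( n \) is the number of points and \( A \) is the area of the bounding region; treating this as the exact expression used in this study is an assumption, and the function name is hypothetical.

```python
import math

def ann_ratio(points, area):
    """Observed mean nearest-neighbor distance over its expected value
    under a random pattern within a region of the given area."""
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")
    # Mean distance from each point to its closest other point.
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    # Expected mean nearest-neighbor distance for a random pattern.
    expected = 0.5 / math.sqrt(n / area)
    return observed / expected
```

Four points at the corners of a unit square give a ratio of 4 (dispersed), whereas the same four points squeezed into a small corner of that region give a ratio well below one (clustered).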
In addition to the above assertions, we also examine the impact of FID and I-VT filters on fixation duration and center location.
4 Experimental Evaluation
We begin this section by describing the specific context of our eye tracking datasets and experiments. We then compare the I-VT and FID filters with the aforementioned four metrics, and discuss our findings.
4.1 Dataset and Equipment
We performed our experiments on eye movement datasets obtained from a total of 28 university students who were assigned to read a text passage shown on a standard desktop computer monitor. Prior to the experiment, each participant completed a brief eye-calibration process lasting less than one minute. We used the Tobii X300 eye tracker [7] to collect participants’ eye movements. The software version was 3.2.3 and the sampling rate was set to 300 Hz.
The 28 recordings were further analyzed using an Intel Core i7-6700MQ computer at 3.40 GHz with 16.0 GB RAM running 64-bit Windows 10. Matlab 2016a and Python 2.7 were used for additional data analysis and processing.
4.2 Data Processing
For each eye tracking record, we used the Tobii Studio I-VT filter [8] to generate I-VT fixation identification results. The velocity threshold \( V \) was set to 30°/s, which is the recommended threshold in [8]. The minimum fixation duration was set to 100 ms, which is the theoretical minimum fixation duration suggested by other eye tracking studies [9, 10].
We further used the results of the I-VT fixation identification as the input data chunks for the mixed integer programming (MIP) formulation for minimizing the square area of fixations from [5]. The Gurobi Optimizer 7.5.1 [11] was used as the solver. The FID filter is parametrized by a manually assigned constant α that enables decision-makers to have fine-tuned control over the density. We varied α from 0 to 1 in steps of 0.1 on one randomly selected eye tracking record and examined the fixation identification results manually. When \( \upalpha\, = \,0.1 \), the clustering result appeared the most reasonable, and averaging \( D_{2} \) values over all fixations yielded the smallest value, suggesting that the algorithm finds the densest fixations on average at \( \upalpha\, = \,0.1 \) compared with the other α levels. Therefore, we set \( \upalpha\, = \,0.1 \) when running the FID filter on the other 27 records. In the following evaluations, we discard the record used for selecting \( \upalpha \) to avoid data snooping.
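To illustrate how a minimum fixation duration interacts with point-level labels, the sketch below groups consecutive fixation points into fixations and discards groups shorter than 100 ms. It is a simplified stand-in, not the Tobii Studio procedure: it assumes labels of the form `"fixation"` / `"saccade"`, time stamps in seconds, and a duration measured from the first to the last sample of a group. The function name is hypothetical.

```python
def group_fixations(samples, labels, min_duration=0.100):
    """Group consecutive fixation-labeled samples and drop short groups.

    samples: list of (x, y, t) triplets with t in seconds.
    labels: list of 'fixation' / 'saccade' strings, one per sample.
    min_duration: minimum fixation duration in seconds.
    """
    fixations, current = [], []
    for sample, label in zip(samples, labels):
        if label == "fixation":
            current.append(sample)
            continue
        # A saccade point closes the fixation that was being built, if any.
        if current:
            fixations.append(current)
        current = []
    if current:
        fixations.append(current)
    # Duration is taken as last time stamp minus first time stamp.
    return [f for f in fixations if f[-1][2] - f[0][2] >= min_duration]
```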
4.3 Experimental Results
After discarding the single record above, in this section we first report our statistical analyses from the point of view of a single record. Subsequently, we expand it to all 27 of the (remaining) records in our dataset.
4.4 Comparing I-VT and FID Filters for a Single Record
Fixation inner-density and the distribution of gaze points within an individual fixation are micro-patterns in gaze data. Such patterns are relatively difficult to evaluate by averaging over all eye tracking records. To more thoroughly investigate micro-patterns, we first illustrate the comparison results on the eye tracking record of one randomly selected participant. Toward the end of this section, the comparison summary over all recordings is also included.
For this gaze data record, there are 9,788 gaze points and 110 fixations. We calculated fixation inner-density metrics \( D_{1} \) and \( D_{2} \) on each individual fixation. The resulting averages of both \( D_{1} \) and \( D_{2} \) from the I-VT filter are larger than those of FID, which indicates that fixations from the FID filter are denser than those from the I-VT filter. We performed a paired t-test with the following hypotheses:
- \( H_{0} : \overline{D}_{I - VT} \le \overline{D}_{FID} \),
- \( H_{a} : \overline{D}_{I - VT} > \overline{D}_{FID} \).
The t-test on both \( D_{1} \) and \( D_{2} \) returns a \( p \)-value smaller than 0.05, so at a 95% confidence level we reject \( H_{0} \), which implies \( \overline{D}_{I - VT} \) is statistically larger than \( \overline{D}_{FID} \) (Table 1).
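For readers who wish to reproduce this kind of comparison, a paired t statistic can be computed as in the sketch below, where each pair holds the metric value of the same fixation under the two filters. Only the statistic and its degrees of freedom are computed; a p-value would be read from the t distribution with \( n - 1 \) degrees of freedom (for instance, SciPy's `scipy.stats.ttest_rel` accepts an `alternative="greater"` argument for the one-sided case). The function name is hypothetical.

```python
import math
import statistics

def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b.

    A large positive t supports the alternative that mean(a) > mean(b).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1
```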
The \( SD \) metric measures the dispersion of fixation points around their center. Table 2 reveals that the \( SD \) mean and standard deviation for the I-VT filter are larger than those of the FID filter. We also performed a paired t-test when comparing the \( SD \) metric. The hypotheses are:
- \( H_{0} : \overline{SD}_{I - VT} \le \overline{SD}_{FID} \),
- \( H_{a} : \overline{SD}_{I - VT} > \overline{SD}_{FID} \).
With the same 95% confidence level as the previous test, the t-test rejects \( H_{0} \). This indicates that the FID filter tends to identify fixations whose points are less dispersed around the center. It further demonstrates that identifying fixations by optimizing for fixation inner-density yields fixations with more compact regions.
Finally, we perform a hypothesis test using the \( ANN \) ratio [6] to see if the gaze points are randomly distributed in a fixation region:
- \( H_{0} \): gaze points are randomly distributed within the fixation region,
- \( H_{a} \): gaze points are not randomly distributed within the fixation region.
If the hypothesis test results in a small \( p \)-value, we would reject the \( H_{0} \) because of the small probability that the fixation gaze points are randomly distributed in their fixation region.
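A sketch of such a test is given below. It assumes the z-score form of the average nearest neighbor test commonly used in GIS software, with expected distance \( 0.5 / \sqrt{n / A} \) and standard error \( 0.26136 / \sqrt{n^{2} / A} \), and a two-sided p-value from the standard normal distribution; whether this matches the exact procedure used in this study is an assumption. The function name is hypothetical.

```python
import math

def ann_z_test(points, area):
    """Return (z, two_sided_p) for the average nearest neighbor test."""
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n * n / area)
    z = (observed - expected) / std_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

A negative z with a small p-value suggests clustering relative to the chosen region, and a positive z with a small p-value suggests dispersion. Because the region area enters both the expected distance and the standard error, the outcome depends on the choice of bounding region.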
The \( ANN \) hypothesis test is rather sensitive to the bounding region used to cover all fixation points in an individual fixation. Therefore, we run the test twice, using \( A_{sq} \) and \( A_{rt} \), respectively, to represent the fixation area. Table 3 reports the count of fixations (out of 110) for which \( H_{0} \) is rejected at a 95% confidence level, implying that there is statistical evidence that fixation points are not randomly distributed. Table 3 reveals that, under both fixation regions, more fixations appear not to be randomly distributed when using the I-VT filter. Moreover, the difference between the I-VT and FID filters is greater under the \( A_{sq} \) region. This may be due to \( A_{sq} \) typically being larger than \( A_{rt} \), as the FID filter specifically minimizes the square area of fixations.
We now compare fixation duration and fixation center for the I-VT and FID filters. Fixation duration (\( FD \)) is a commonly used metric in eye tracking research. We compare the average fixation duration of the I-VT and FID filters with the following hypotheses:
- \( H_{0} : \overline{FD}_{I - VT} \le \overline{FD}_{FID} \),
- \( H_{a} : \overline{FD}_{I - VT} > \overline{FD}_{FID} \).
The paired t-test result shows that \( \overline{FD}_{FID} \) is significantly smaller than \( \overline{FD}_{I - VT} \) at a 95% confidence level. This outcome may be due to the FID filter eliminating fixation points and refining the fixation region of each of the fixation chunks from the I-VT filter (Table 4).
Fixation center is also a basic feature that represents fixation location and is used to depict the scan path of eye movement. We introduce the center shift, which is the Euclidean distance between the fixation center of the I-VT filter and that of the FID filter. The 110 fixations within the eye tracking record generate the mean and standard deviation (STD) of the center shift data reported in Table 5.
When examining the mean and STD of center shift, it may be inferred that the difference in fixation center is negligible. The bivariate distribution of center shift depicted in Fig. 3 displays a long-tailed distribution along both the \( x \) and \( y \) axes. The 90% quantiles of \( x \) and \( y \) are 0.922 and 1.308, respectively. This shows that while the refined results of the FID filter can skew some I-VT fixation centers, most of the time the center shift remains in a fairly small range.
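The center shift summary can be sketched as follows, assuming the fixation centers from the two filters are given as two equally long lists of \( (x, y) \) pairs in matching order. The function name is hypothetical.

```python
import math
import statistics

def center_shift_summary(centers_ivt, centers_fid):
    """Return (mean, sample standard deviation) of the Euclidean distance
    between paired I-VT and FID fixation centers."""
    if len(centers_ivt) != len(centers_fid) or len(centers_ivt) < 2:
        raise ValueError("need two equally sized lists with at least 2 centers")
    shifts = [math.dist(a, b) for a, b in zip(centers_ivt, centers_fid)]
    return statistics.mean(shifts), statistics.stdev(shifts)
```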
4.5 Comparing I-VT and FID Filters for all 27 Remaining Records
The results reported above were for a single eye tracking record. The average number of gaze points for all remaining 27 records is 10,959, and the average number of fixations is 127.7. Table 6 reports the results of the corresponding hypothesis tests for \( D_{1} \), \( D_{2} \), \( SD \) and fixation duration on all 27 eye tracking records. We find that no record fails to reject the corresponding \( H_{0} \) in the t-test for \( D_{1} \), \( SD \) and fixation duration, and that two records fail to reject it for \( D_{2} \). This analysis shows that the finding that the FID filter identifies denser and more compact fixations than the I-VT filter holds for most of the eye tracking records in our dataset in terms of \( D_{1} \), \( D_{2} \) and \( SD \).
We calculate the center shift between all I-VT and FID filter fixation pairs; the resulting bivariate distribution is shown in Fig. 4. The distribution in both the \( x \) and \( y \) directions is again long-tailed. The 90% quantiles of \( x \) and \( y \) are 2.095 and 2.411, respectively. Figure 4 shows only a few points that are far away from the origin, indicating that the FID filter identification results can indeed change the fixation center location, though this occurred relatively infrequently in our dataset.
We also run the \( ANN \) hypothesis test on each recording and calculate the count of fixations (\( FC \)) for which the \( ANN \) hypothesis test \( H_{0} \) is rejected (\( FC - ANN \)) over all recordings. The average is reported in Table 7. Both the mean and the standard deviation resulting from the FID filter are smaller than those of the I-VT filter.
We compare the \( FC \) results from the I-VT and FID filters using a paired t-test at a 95% confidence level with the following hypotheses:
- \( H_{0} : \overline{FC}_{I - VT} \le \overline{FC}_{FID} \),
- \( H_{a} : \overline{FC}_{I - VT} > \overline{FC}_{FID} \).
The first row in Table 7 shows that when bounding the fixation region by \( A_{sq} \), \( \overline{FC}_{FID} \) is significantly smaller than \( \overline{FC}_{I - VT} \). This indicates the general trend that the inner gaze points of fixations resulting from the FID filter tend to be randomly distributed. As for \( A_{rt} \), the t-test also rejects \( H_{0} \), implying that the same conclusion can be drawn for \( A_{rt} \).
5 Conclusions
Our results show that the FID filter, as compared to the I-VT filter, does indeed identify fixations that are denser, more compact around the center, and more uniformly distributed within their bounding regions. These properties have major implications for two important fixation metrics that are widely used in eye tracking analysis: fixation duration and location. Our results show that the two filters tend to result in significantly different fixation durations. The results displayed in Figs. 3 and 4 provide evidence that in some cases the FID filter can result in quite different fixation centers compared to the I-VT filter. It is important to note that the data used in our study was gathered when users were reading an online text passage, which typically generates more focused fixations. Future investigations using different stimuli are needed to extend the generalizability of these results and to see whether the micro-level differences, including fixation duration and center location, observed in this study between the FID and I-VT filters change for different tasks (e.g., reading more challenging text passages, viewing a picture, or browsing a website). For example, in this study we used a reading task, which typically results in compact fixations. Using a browsing task may result in much larger differences in fixation center location, because gaze points within fixations in browsing tasks tend to be more dispersed [5]. The metrics introduced in this study to compare fixations at a micro level serve to refine the analysis of eye movements to a deeper level. Future studies, however, are needed to validate and extend our findings.
The results of this study contribute in two ways to eye tracking studies that examine user behavior. First, they show that researchers can identify focused attention with the FID filter and thereby improve the sensitivity of their analysis with regard to duration and center location of intense attention. Second, the micro-analysis introduced in this study provides a new way to compare gaze points within a fixation. This is important because it allows researchers to examine relationships between eye movements and behavior at a much smaller unit of analysis, namely fixation micro-patterns.
References
1. Djamasbi, S.: Eye tracking and web experience. AIS Trans. Hum.-Comput. Interact. 6(2), 37–54 (2014)
2. Nyström, M., Holmqvist, K.: An adaptive algorithm for fixation, saccade, and glissade detection in eyetracking data. Behav. Res. Methods 42(1), 188–204 (2010)
3. Salvucci, D.D., Goldberg, J.H.: Identifying fixations and saccades in eye-tracking protocols. In: Proceedings of the 2000 Symposium on Eye Tracking Research & Applications, pp. 71–78. ACM, November 2000
4. Shojaeizadeh, M., Djamasbi, S., Trapp, A.C.: Density of gaze points within a fixation and information processing behavior. In: Antona, M., Stephanidis, C. (eds.) UAHCI 2016. LNCS, vol. 9737, pp. 465–471. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-40250-5_44
5. Trapp, A.C., Liu, W., Djamasbi, S.: New density-based optimization formulations and algorithms to identify fixations in gaze data. In: Presented in INFORMS Annual Meeting, Houston, TX, INFORMS, Hanover, MD (2017)
6. Mitchell, A.: The ESRI Guide to GIS Analysis, Volume 2: Spatial Measurements and Statistics. ESRI Guide to GIS analysis (2005)
7. Tobii: Tobii technology (2017). http://www.tobii.com. Accessed 18 Dec 2017
8. Olsen, A.: The Tobii I-VT fixation filter. Tobii Technology (2012)
9. Blignaut, P.: Fixation identification: The optimum threshold for a dispersion algorithm. Atten. Percept. Psychophys. 71(4), 881–895 (2009)
10. Komogortsev, O.V., Gobert, D.V., Jayarathna, S., Koh, D.H., Gowda, S.M.: Standardization of automated analyses of oculomotor fixation and saccadic behaviors. IEEE Trans. Biomed. Eng. 57(11), 2635–2645 (2010)
11. Gurobi Optimization, Inc.: Gurobi Optimizer 7.5.1 Reference Manual (2017)
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Liu, W., Djamasbi, S., Trapp, A.C., Shojaeizadeh, M. (2018). Measuring Focused Attention Using Fixation Inner-Density. In: Schmorrow, D., Fidopiastis, C. (eds) Augmented Cognition: Users and Contexts. AC 2018. Lecture Notes in Computer Science(), vol 10916. Springer, Cham. https://doi.org/10.1007/978-3-319-91467-1_9
DOI: https://doi.org/10.1007/978-3-319-91467-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-91466-4
Online ISBN: 978-3-319-91467-1
eBook Packages: Computer Science, Computer Science (R0)