
1 Introduction

The use of PINs (personal identification numbers) as passwords for authentication is ubiquitous nowadays. This is especially true for banking applications, where the combination of a token (e.g., a bank card) and the user's secret PIN is commonly used to authenticate transactions. In financial applications PINs are typically four-digit numbers, resulting in 10,000 possible combinations. The security of the system relies on the fact that an attacker is unlikely to guess the correct PIN and that the systems (e.g., automated teller machines) limit the user to a few attempts (e.g., 3) at entering the correct PIN. As most applications that use PINs for authentication operate in a public setting, a common attack is to observe and record a user's PIN entry (shoulder-surfing). These security problems have been recognized for a long time, and researchers have proposed a number of schemes to minimize the risk of PIN entry observation. One proposed alternative PIN entry method requires the user to input information derived from a combination of the actual PIN and additional information displayed by the system, instead of the PIN itself [1]. Another approach uses elaborate hardware to make PIN entry resilient to observation attacks [2]. However, these methods have not been adopted in practice, because users would have to be retrained to use a completely different approach to PIN entry and because of the significant additional cost of the hardware setup.

In the SafetyPIN project our goal is to prevent observation attacks during PIN entry while retaining the workflow that users are already familiar with, at minimal additional hardware cost. Our setup can be easily integrated into existing designs of automated teller machines (ATMs) and point of sale systems. To avoid shoulder-surfing attacks and enable users to enter their PIN without fear of being observed, we have developed a system that employs an eye tracking device. With SafetyPIN, users select PIN digits with their eyes by simply focusing on digits displayed on a screen. Since no physical key-press is used for the PIN entry, no information about the entered digit is given away to an attacker through visual observation. Attacks based on fake keypad overlays are also rendered ineffective.

The rest of the paper is structured as follows. In Sect. 2, we review previous efforts related to preventing shoulder-surfing and to eye-based interaction. In Sect. 3, the conceptual approach behind SafetyPIN is explained. Section 4 details the implementation. In Sect. 5, the initial evaluation and its results are discussed. Section 6 concludes the paper.

2 Related Work

Researchers have long evaluated user gaze as an interaction method using eye tracking devices. In 1987, Ware et al. [3] evaluated two methods of interacting with computers using the eyes as input: dwell gaze and look and shoot. The dwell gaze method relies on the user looking at a region of interest on the screen for a certain amount of time. In the look and shoot method the user looks at the region of interest and then physically presses a predefined button on the keypad to activate the region. The dwell gaze method needs more time for activation than look and shoot, as it requires the dwell time to avoid spurious activations. The look and shoot method, though quicker, gives away more information to a potential shoulder-surfer via the button-click feedback. Both methods require calibration for the individual user.

Kumar et al. [4] evaluated the above two methods for ATM password entry to avoid shoulder-surfing. They used the Tobii 1750 eye tracker and a QWERTY alphanumeric keyboard for this purpose. Their evaluation suggested that these methods are capable of deterring shoulder-surfers while taking comparable time for password entry compared to a conventional keypad. They also suggested that the calibration data for a user could be stored on the ATM card itself, so that calibration need not be performed on every use.

De Luca et al. [5], in addition to the methods above, introduced a gaze-gesture method of password entry and compared it with the other two methods. In the gaze-gesture method the user is required to remember a graphical pattern and then input that pattern via eye gestures [6, 7]. The advantage of this method is that it requires no calibration, as it depends on the relative rather than the absolute position of the eye. However, it suffers on the usability front, as users need many retries to get the pattern right.

Other such efforts can be seen in [8, 9]. In SafetyPIN, we have implemented a new activation method called blinking, along with the other two methods. In this method, unlike look and shoot, the user looks at the region he/she wants to activate and then, instead of pressing a key, blinks to activate the region. This is more secure than look and shoot, since the feedback given to a shoulder-surfer via the physical button press is avoided entirely. It is also less error prone than the gaze method, since spurious activations are less likely.

Fig. 1. 9-digit visual key-pad displayed on the screen, with the Tobii EyeX eye tracker mounted below the screen

3 Conceptual Approach

Our initial prototype runs on a standard Windows PC and uses the Tobii EyeX low-cost eye tracker. The eye tracker consists of a small bar that can be attached to a display screen and could be incorporated into an ATM at a later stage. The sensor bar contains micro-projectors that project distinct patterns of infrared light at the user's eyes. The reflections of these patterns are recorded by infrared cameras in the sensor bar. Through image processing the user's eyes are detected and their movements tracked, which is used to determine the user's gaze point: the point on the screen that the user's view currently focuses on. For the PIN input we display the possible digits on the screen (see Figs. 1 and 2).

Fig. 2. SafetyPIN hardware components for retrofitting into existing point of sale terminals

From the practical perspective a key advantage of the SafetyPIN approach is that only minimal additional hardware is required, which fits easily into existing terminals. Because the sensor bar is small and placed directly below the screen, it could be integrated into the screen housing of existing terminals, simplifying the development of new versions with integrated SafetyPIN entry and providing the opportunity to retrofit existing terminals.

4 Implementation

The prototype mocks up a typical ATM PIN entry screen and allows the user to enter the PIN using three different interaction methods: look and shoot, gaze activation and blink activation. The GUI is a 1680 × 720 pixel window with buttons labeled 0–9 along with ',' and '.'. The GUI buttons are 160 × 100 pixels in size, with a spacing of 50 pixels between them. The user is asked to enter a predefined sequence of digits as his/her PIN. The software then checks the accuracy and speed of the entered PIN for each of the three entry methods mentioned above. The software has been developed in Visual C++.
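Given these dimensions, mapping a gaze point to a button reduces to a simple grid hit test. The following sketch illustrates the idea; the grid origin, the 4 × 3 arrangement and the function name are assumptions for illustration, not taken from the prototype.

```cpp
#include <cassert>

// Layout constants from the prototype description: 160x100 px buttons
// with 50 px spacing. Grid origin and 4x3 arrangement are assumptions.
const int BTN_W = 160, BTN_H = 100, GAP = 50;
const int ORIGIN_X = 400, ORIGIN_Y = 150;  // assumed top-left of the grid
const int COLS = 4, ROWS = 3;              // 12 buttons: 0-9, ',' and '.'

// Map a gaze point to a button index (0..11), or -1 if the point falls
// in the spacing between buttons or outside the grid.
int hitTest(int x, int y) {
    int relX = x - ORIGIN_X, relY = y - ORIGIN_Y;
    if (relX < 0 || relY < 0) return -1;
    int col = relX / (BTN_W + GAP), row = relY / (BTN_H + GAP);
    if (col >= COLS || row >= ROWS) return -1;
    if (relX % (BTN_W + GAP) >= BTN_W) return -1;  // horizontal gap
    if (relY % (BTN_H + GAP) >= BTN_H) return -1;  // vertical gap
    return row * COLS + col;
}
```

Rejecting gaze samples that land in the spacing between buttons helps avoid ambiguous activations near button edges.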

Fig. 3. Sequence diagram for the look and shoot activation method

Fig. 4. Sequence diagram for the gaze activation

The Tobii SDK [10] provides drivers for the eye tracking device along with a C/C++ engine library whose API is used for interfacing with the device. This API provides functionality at a higher level than the raw eye-position data from the device. The engine offers two kinds of high-level operations. On the one hand, the application program can inform the engine about the boundaries of the regions on which it wants to be activated. The engine then notifies the application program whenever the user looks at one of the specified regions. This scheme relieves the application program from having to poll the incoming raw position data from the device. On the other hand, the application program can register for any of a number of events about which it would like to be notified. For example, when the user looks at a region for more than half a second a gaze event can be raised, and when the device does not see the user's eyes for more than a second an absence event can be raised, depending on whether the application program has registered for the event. These notifications are used in our GUI application program.
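The region-plus-callback pattern described above can be sketched with a minimal stand-in engine class. All names here (EngineStub, addRegion, on, feedSample) are illustrative placeholders, not the real EyeX API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A rectangular region that the application wants activation events for.
struct Rect {
    int x, y, w, h;
    bool contains(int px, int py) const {
        return px >= x && px < x + w && py >= y && py < y + h;
    }
};

// Stand-in for the engine: the application registers regions and event
// handlers instead of polling raw gaze samples itself.
class EngineStub {
    std::vector<std::pair<int, Rect>> regions_;              // id -> bounds
    std::map<std::string, std::function<void(int)>> handlers_;
public:
    void addRegion(int id, Rect r) { regions_.push_back({id, r}); }
    void on(const std::string& event, std::function<void(int)> cb) {
        handlers_[event] = cb;
    }
    // Simulated gaze sample: raise a "gaze" event with the id of the
    // region that was hit, if any handler is registered.
    void feedSample(int x, int y) {
        for (auto& [id, r] : regions_)
            if (r.contains(x, y) && handlers_.count("gaze"))
                handlers_["gaze"](id);
    }
};
```

A real integration would register one region per on-screen button and additionally subscribe to the gaze, absence and presence events described above.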

The three different methods of activation are described below with the help of pseudo-code and sequence diagrams.

4.1 Look and Shoot

The sequence diagram in Fig. 3 shows the three major modules of the software and gives an overview of their interactions. The GUI interacts with the EyeX Engine through an EyeX Interface module. The GUI initializes the window and button components and sends the coordinates of the buttons to the EyeX Interface, which in turn requests periodic eye position updates from the EyeX Engine. The EyeX Engine communicates directly with the tracker device. Upon receiving position coordinates from the EyeX Engine, the EyeX Interface checks whether the position the user is currently looking at falls within the bounds of any button. If so, it requests the EyeX Engine to notify it when the predefined activation button, the right control key in this case, is pressed. Once this activation event is received, the EyeX Interface sends a message to the GUI with the number of the button to be activated. The GUI, upon receiving this message, stores the activated digit. Algorithm 1 depicts this process in pseudo-code.
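The essence of this flow can be sketched as a small state holder: the button currently under the gaze is tracked, and a digit is committed only when the activation key arrives. Class and method names are illustrative, and the digit-to-character mapping assumes buttons 0–9.

```cpp
#include <cassert>
#include <string>

// Sketch of the look-and-shoot flow: track the button under the gaze and
// commit it only when the activation key (right control) is pressed.
class LookAndShoot {
    int current_ = -1;   // button under gaze, -1 = none
    std::string pin_;    // digits entered so far
public:
    void onGaze(int buttonId) { current_ = buttonId; }  // bounds check upstream
    void onGazeLost()         { current_ = -1; }
    void onActivationKey() {                            // right-Ctrl pressed
        if (current_ >= 0)
            pin_ += static_cast<char>('0' + current_);
    }
    const std::string& pin() const { return pin_; }
};
```

Note that a key press while no button is under the gaze is simply ignored, which matches the bounds check performed by the EyeX Interface before requesting the activation event.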

4.2 Gaze Activation

Figure 4 shows the interaction for the gaze activation method. The major difference in this case is that, upon receiving the position coordinates from the EyeX Engine, the EyeX Interface requests the gaze event notification after performing the bounds check. The EyeX Engine produces this notification when the user stares at the same region for more than half a second. This time proved too short and led to spurious activations. Therefore, the gaze time was extended to more than a second by validating it in the EyeX Interface. After this, the message is passed on to the GUI from the EyeX Interface module.
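The dwell-time validation can be sketched as follows: engine gaze events arrive with a timestamp, and the interface only activates once the same button has been gazed at for more than one second. The class and method names are illustrative.

```cpp
#include <cassert>

// Minimum dwell time enforced by the interface, on top of the engine's
// ~0.5 s gaze event threshold, to suppress spurious activations.
const long DWELL_MS = 1000;

class GazeActivation {
    int candidate_ = -1;  // button currently being dwelt on
    long since_ = 0;      // timestamp of the first gaze event on it
public:
    // Feed a gaze event; returns the activated button id, or -1 if the
    // dwell time has not been reached yet (or the gaze moved).
    int onGazeEvent(int buttonId, long nowMs) {
        if (buttonId != candidate_) {      // gaze moved: restart the clock
            candidate_ = buttonId;
            since_ = nowMs;
            return -1;
        }
        if (nowMs - since_ > DWELL_MS) {   // dwelled long enough: activate
            candidate_ = -1;
            return buttonId;
        }
        return -1;
    }
};
```

Resetting the candidate after an activation prevents the same dwell from firing repeatedly while the user keeps looking at the button.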

Fig. 5. Sequence diagram for the blink activation

4.3 Blink Activation

Figure 5 shows the interaction for the blink activation method. The major difference in this case is that, upon receiving the position coordinates from the EyeX Engine, the EyeX Interface requests the 'user absence' notification after performing the bounds check and temporarily saving the button number. The EyeX Engine produces this notification when the device fails to see the user's eyes for more than a second. Thus, if the user closes his/her eyes for a second, he/she effectively becomes absent to the device, producing the required notification. Upon receiving this notification, the EyeX Interface requests the 'user presence' notification, effectively waiting for the user to open his/her eyes again, completing a blink operation. Once the user opens his/her eyes, the EyeX Engine sends the required notification and the EyeX Interface sends the component number of the saved button to the GUI.
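This gaze/absence/presence sequence amounts to a small state machine, sketched below. Names are illustrative; the real prototype drives these transitions from the engine notifications described above.

```cpp
#include <cassert>

// Sketch of blink activation: save the button under the gaze, then treat
// an 'absence' notification (eyes closed > 1 s) followed by a 'presence'
// notification (eyes reopened) as a deliberate blink that commits it.
class BlinkActivation {
    enum State { WATCHING, EYES_CLOSED };
    State state_ = WATCHING;
    int saved_ = -1;  // button saved when the gaze landed on it
public:
    void onGaze(int buttonId) {
        if (state_ == WATCHING) saved_ = buttonId;
    }
    void onAbsence() {                 // eyes not seen for > 1 s
        if (saved_ >= 0) state_ = EYES_CLOSED;
    }
    // Returns the committed button on a completed blink, else -1.
    int onPresence() {                 // eyes seen again
        if (state_ != EYES_CLOSED) return -1;
        state_ = WATCHING;
        int b = saved_;
        saved_ = -1;
        return b;
    }
};
```

Requiring a saved button before honoring the absence notification means that closing the eyes while looking away from the keypad commits nothing.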

Fig. 6. Usability test with the user in front of the prototype system

5 Evaluation

In the initial user tests, we examined technological aspects such as the impact of calibration errors and of glasses and other eyewear, practical usability aspects such as error rates and user satisfaction, as well as the users' perception of the safety and security of the system.

The test was performed with 9 users, each of whom was given a 12-digit command PIN to be entered using the eye tracker (see Fig. 6). The test was performed for all three activation methods, four times per method. The PIN entered by the user was stored and then compared with the command PIN to determine how many errors occurred. The time taken by the users was also recorded. After the tests were completed, the users were given a questionnaire to collect feedback on how usable and useful they felt the system was. Their feedback on its safety was also recorded.
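The digit-wise comparison of the entered sequence against the command sequence can be sketched as a simple scoring function; the function name and the handling of missing digits are assumptions, not taken from the evaluation software.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Fraction of positions where the entered digit differs from the command
// sequence; a missing digit (sequences of unequal length) counts as an
// error at that position.
double errorRate(const std::string& command, const std::string& entered) {
    std::size_t n = std::max(command.size(), entered.size());
    if (n == 0) return 0.0;
    std::size_t errors = 0;
    for (std::size_t i = 0; i < n; ++i) {
        char c = i < command.size() ? command[i] : '?';
        char e = i < entered.size() ? entered[i] : '?';
        if (c != e) ++errors;
    }
    return static_cast<double>(errors) / n;
}
```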

The results of these tests have been very encouraging for all three activation methods. The average error rate was around 5% and the average time taken was around 1.6 s per digit entry, or around 6.7 s for a typical 4-digit PIN entry. Most errors occurred when users did not remember the correct PIN and therefore entered a wrong one, rather than from unintentional activations. Users felt that the system was easy to use and that it was a safer way to enter their PIN compared to the traditional entry method. To draw statistically significant conclusions, we need to perform further tests with a larger sample.

6 Conclusions

To protect users from shoulder-surfing while entering their PIN at ATMs, new methods of PIN entry are being evaluated. With eye tracking technology becoming cheaper, eye interaction for PIN entry is emerging as a practical solution. In this paper, we have discussed SafetyPIN, which proposes retrofitting ATMs with an eye tracking device so that users can enter their PIN without using the keypad. We have implemented and evaluated a prototype of the system on a PC. In addition to the 'look and shoot' and gaze activation methods, we have introduced a new activation method called blink activation. Initial user evaluations have yielded encouraging results, prompting further work.