
The comparison of accuracy and precision of eye tracking: GazeFlow vs. SMI RED 250.

Document version 1.1, August, 2013

SIMPLY USER, User Experience Lab

Kraków, 6 August, 2013


Abstract

The report describes the results of a study comparing the Accuracy and Precision of GazeFlow, eye tracking software based on webcam images, with those of the SMI RED 250, a standard eye tracker that uses infrared light to track the position of the eye. Accuracy and Precision were measured using the method suggested by Tobii Technology.

The results show sufficient accuracy and very high precision for the GazeFlow software. This opens a prospect for commercializing the software for marketing research purposes, as well as for controlling a computer with eyesight, i.e. as a non-contact computer interface.


Table of contents

1.Introduction

1.1.Eye tracking devices

1.2.Comparative tests

2.Methodology

2.1.The persons examined

2.2.The time and place of the research

2.3.Testing equipment

2.4.Experimental procedure

3.Results

3.1.The results of the Accuracy measurements for procedures with a freely kept head.

3.2.The results of the Accuracy measurements for procedures with a head fixed on chinrest.

3.3.The results of the Precision measurements for procedures with a freely kept head.

3.4.The results of the Precision measurements for procedures with a head fixed on chinrest.

4.Conclusions

5.References

6.The list of appendices


1.Introduction

1.1. Eye tracking devices

Eye tracking devices (eye trackers) have become extremely popular research tools in recent years. Rapid progress and the increasing accessibility of the devices (falling prices) have made eye tracking research common for commercial purposes. Above all, eye tracking has made its way into marketing research and website usability research.

The most widespread eye tracking devices are currently those using infrared light to track pupils. Among the stationary eye trackers available on the market and used in commercial research, two solutions prevail: those from Tobii Technology and SMI Vision.

Moreover, the first commercial solutions that use webcams for eye tracking have appeared during the last two years. This kind of service is most often available online and, as with infrared-based trackers, is used for marketing and website research.

At the moment there are three solutions available: YouEye (www.youeye.com), Gazehawk (www.gazehawk.com) and EyeTrackShop (www.eyetrackshop.com). Among research practitioners, webcam-based solutions are criticized for low accuracy and for significant discrepancies between their results and those of infrared-based solutions (see Aga Bojko, footnote 1).

It is important to note that eye tracking devices, especially those based on widely available components, have huge potential as tools for controlling machines. Eye tracking may become one of the next means of human-computer interaction, making it possible to control and interact with a computer solely with eyesight. Such solutions may be used in entertainment, for instance in games to control a character and in general gameplay. The most promising area of application, however, is neurorehabilitation, i.e. using eyesight-controlled computers to interact with people who have lost the ability to communicate or to use devices in the standard way.

The goal of this report is to compare the results and data obtained for two solutions: an eye tracker based on infrared light and software using a webcam.

The infrared eye tracker is a standard solution from SMI: the RED 250 is a high-resolution device equipped with software enabling efficient commercial research.

1 Blog: http://rosenfeldmedia.com/books/eyetracking/blog/the_truth_about_webcam_eye_tra/


The second solution is GazeFlow, original software by Szymon Deja based on the analysis of the optical flow of the webcam image. The software, which can currently be used as a standalone solution, is intended for further commercialization.

1.2. Comparative tests

The comparative tests employed the method suggested by Tobii Technology and described in the report Accuracy and precision test method for remote eye trackers (Tobii, 2011). The method aims to make the assessment objective and to allow devices from different manufacturers to be compared. The accuracy and precision of all devices manufactured by Tobii Technology are assessed according to this method.

This method assesses devices in two areas: Accuracy and Precision.

Accuracy is the average difference between the position of a stimulus and the measured position of the eye. Precision is the ability of the device to reliably reproduce the same measurement.
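The two definitions above can be sketched in code. The report does not reproduce the formulae used by its analysis software, so the following is only an illustration under stated assumptions: gaze samples and the stimulus position are already expressed in degrees of visual angle, Accuracy is taken as the mean offset from the stimulus, and Precision is taken as the root mean square (RMS) of the distances between successive samples. The function names are hypothetical.

```python
import math


def accuracy_deg(samples, target):
    """Mean offset (in degrees) between gaze samples and the stimulus position.

    samples: list of (x, y) gaze positions in degrees of visual angle.
    target:  (x, y) stimulus position in the same units.
    """
    if not samples:
        raise ValueError("at least one gaze sample is required")
    offsets = [math.hypot(x - target[0], y - target[1]) for x, y in samples]
    return sum(offsets) / len(offsets)


def precision_rms_deg(samples):
    """RMS (in degrees) of the distances between successive gaze samples.

    Assumption: precision is measured as sample-to-sample RMS, which
    captures repeatability independently of any constant offset.
    """
    if len(samples) < 2:
        raise ValueError("at least two gaze samples are required")
    steps = [
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(samples, samples[1:])
    ]
    return math.sqrt(sum(s * s for s in steps) / len(steps))


if __name__ == "__main__":
    # A constant 1 degree offset: poor accuracy, perfect precision.
    steady_but_shifted = [(1.0, 0.0)] * 5
    print(accuracy_deg(steady_but_shifted, (0.0, 0.0)))
    print(precision_rms_deg(steady_but_shifted))
```

The usage example illustrates why the two measures are reported separately: a tracker can repeat the same reading very consistently while still being offset from the true gaze position.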

The matrix shows all possibilities and relations of accuracy and precision of the measurements of eye positions made by the eye tracking device.

Only the systems with high accuracy and precision deliver reliable and adequate measurements of the position of an eye on the screen. This means that on the basis of the systems’ readings we get the information on the actual position of an eye and the measurement is repeatable. A good measurement of accuracy is considered to be an average smaller than 0.8˚ under ideal conditions (measurement taken with a fixed head, in lighting conditions of circa 300 lux).

A good measurement of precision is considered to be an average precision smaller than 0.5˚ under ideal conditions.
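As a minimal sketch, the two reference values quoted above (Accuracy below 0.8˚ and Precision below 0.5˚ under ideal conditions) can be applied to a pair of measurements as follows. The function and constant names are hypothetical and are not part of the Tobii method or of the analysis software used in this study.

```python
# Reference values quoted in the text for ideal conditions (degrees).
ACCURACY_LIMIT_DEG = 0.8
PRECISION_LIMIT_DEG = 0.5


def assess(accuracy_deg, precision_deg):
    """Return which of the two reference values a measurement meets."""
    return {
        "accuracy_ok": accuracy_deg < ACCURACY_LIMIT_DEG,
        "precision_ok": precision_deg < PRECISION_LIMIT_DEG,
    }


if __name__ == "__main__":
    # Illustrative input values only, not results from this study.
    print(assess(0.7, 0.3))
    print(assess(1.2, 0.3))
```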


The level of accuracy required of eye tracking equipment depends largely on the type of research and the kind of stimuli analyzed. The smaller the analyzed stimuli, or if the process of reading is also considered, the higher the requirements for accuracy and precision. For the marketing materials commonly used in eye tracking analyses and for website research, especially remote research in which the subjects follow the stimuli freely, the requirements are more liberal.

The Accuracy and Precision Test suggested by Tobii (2011) compares results across different experimental conditions, including:

-ideal conditions

-different eye angle

-different lighting conditions

-different head positions

In this study special attention was paid to different head positions, because head position significantly affects the measurements taken by webcam-based software.

The literature discusses whether measurements should be taken on the dominant eye alone or averaged over both eyes (see Tobii, 2011).

This study employs binocular measurement.
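One way to form a binocular estimate is to average the left-eye and right-eye gaze positions for each sample. The report does not state how the binocular value was obtained in either system, so the sketch below is only an assumption; in particular, falling back to a single eye when the other is missing is a hypothetical design choice.

```python
def binocular_gaze(left, right):
    """Combine left- and right-eye gaze positions into one estimate.

    left, right: (x, y) tuples, or None when that eye was not tracked.
    Returns the midpoint of both eyes, the single available eye,
    or None when neither eye was tracked.
    """
    if left is None and right is None:
        return None
    if left is None:
        return right
    if right is None:
        return left
    return ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)


if __name__ == "__main__":
    print(binocular_gaze((10.0, 4.0), (12.0, 6.0)))
    print(binocular_gaze(None, (12.0, 6.0)))
```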


2.Methodology

2.1. The persons examined

The experiment involved 30 people: 13 men and 17 women. The average age of the subjects was 27 (the youngest person was 21 and the oldest 39). The subjects' education was as follows: 14 people held a high school diploma, 12 a master’s degree and 4 a bachelor’s degree. The subjects’ demographic data are summarized in Table 1. The subjects were a random sample of the population between 20 and 40 years of age. During the selection process, people with visual impairments or wearing glasses were excluded because of the requirements of the eye tracking equipment that uses infrared light.

Table 1. The summary of the subjects’ demographic data.


2.2. The time and place of the research

The research took place in a laboratory of the Simply User company on 13-17 June 2013. The room was arranged to provide stable and controlled conditions for the experiment. The subjects sat at a desk in front of a screen with the SMI RED 250 remote eye tracking device and 4 webcams. The remote eye tracker was placed under the screen, and the webcams were located both above and under the screen. Image 1 shows the monitor on which the experimental procedure was presented to the subjects. The monitor was located around 65-70 cm in front of the subjects’ eyes (the standard distance recommended in remote eye tracking research).

Image 1. The arrangement of webcams.


In two variants of the procedure a chinrest located 70 cm from the monitor was used to fix the subject’s head in one place. Image 2 shows the position of a subject’s head on the chinrest.

Image 2. Fixing the head in one place using chinrest.

All experiments were conducted in controlled lighting conditions. The lighting consisted of two softbox lamps. Illuminance in the room was circa 350 lux. During the experiment the lighting conditions were not manipulated. The research was conducted under the supervision of a qualified technician and a researcher.


2.3. Testing equipment

The research used the remote eye tracker SMI RED 250 Hz with the iView X 2.8 system (installed on a dedicated laptop that is an integral part of the device). Experimental stimuli were displayed on a 22” Dell LCD screen with a resolution of 1680x1050.

In addition, eyes were also tracked by the GazeFlow software using webcams. The research employed 4 differently arranged webcams:

1.Microsoft Lifecam 5000 (CAM_0) – placed above the screen

2.Logitech pro 9000 (CAM_1) – placed under the screen

3.Logitech HD Pro Webcam C920 (CAM_2) – placed above the screen

4.Microsoft Playstation PsEye (CAM_3) – placed under the screen

The webcam parameters are: resolution of 640 x 480 and 30 fps frame rate.

The experimental procedure responsible for stimuli presentation was controlled by dedicated software using the iViewX SDK on a Samsung RF711 laptop with a 17” screen. The computer parameters:

-Intel Core i7-2630QM 2 GHz processor

-6 GB RAM

-Windows 7 Home Premium (64-bit version) operating system

-17.3” (16:9, LCD) screen size

-1600 x 900px resolution

-32 bit color depth

The software was connected with the SMI RED 250 device through Wi-Fi. Image and data from the webcams, as well as data from the eye tracker, were recorded simultaneously.

The results were also analyzed with dedicated software, which calculated Accuracy and Precision using the formulae suggested in the article describing the Tobii method (Tobii, 2011).


2.4. Experimental procedure

2.4.1. Instructions

The experimental procedure began by informing the subjects about the goal of the research, which was to watch the displayed images and website screenshots. The subjects were also told that they had to follow the center of the moving point closely with their eyes while both eye trackers (the SMI device and the GazeFlow software) were being calibrated. The instructions included detailed information on how to move and reposition the head during the calibration procedure and on how to use the feedback displayed on the screen. During the procedure the subjects were given no additional information apart from a reminder to follow the point closely during calibration.

2.4.2. Calibration

The experiment consisted of nine experimental procedures. Each of the procedures included a combination of calibration, validation and presenting stimuli. In seven procedures the subjects had freely positioned heads, while in two they were fixed on chinrests (see Image 2).

The research employed a double calibration because two solutions were used. The calibration for each solution is described below:

1.SMI RED 250 calibration – 9-point calibration using a moving white and red point against a grey background; 4-point validation using the same point;

2.GazeFlow calibration – 10- to 30-point calibration (depending on the efficiency of the procedure) using a displayed red, pulsing point against a grey background; validation using a green point.

Additionally, in procedures with a freely kept head, calibration either took head movement into account (calibration with a moving head) or did not (calibration with a fixed head).

Calibration with a moving head relied on visual feedback. The subjects were shown a yellow point with an arrow, in different positions, indicating the direction in which to shift the head. Once the desired head position was reached, the point changed its color to red and the next calibration point was displayed. During the introductory procedure the subjects had a chance to try this style of calibration.

During the experimental procedure the subjects were shown feedback if their head position was incorrect (in the form of a red image of a head); their task was to correct the head position until they received positive feedback.

2.4.3. Experimental stimuli

The research employed images with balanced salience from commonly accessible free databases, as well as screenshots of the most popular websites in Poland (according to the Gemius ranking).

The display of each stimulus was preceded by a fixation point against a background whose color was the average of all colors in that image, in order to balance the illumination of the presented stimuli.
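The background color described above can be computed as the per-channel mean of an image's pixels. The sketch below is only an illustration: the report does not say which color space or rounding was used, so plain RGB averaging with rounding to the nearest integer is an assumption, and the function name is hypothetical.

```python
def mean_color(pixels):
    """Return the average (r, g, b) color of an image.

    pixels: iterable of (r, g, b) tuples with integer channels 0-255.
    Assumption: channels are averaged independently in RGB space.
    """
    totals = [0, 0, 0]
    count = 0
    for r, g, b in pixels:
        totals[0] += r
        totals[1] += g
        totals[2] += b
        count += 1
    if count == 0:
        raise ValueError("image has no pixels")
    return tuple(round(total / count) for total in totals)


if __name__ == "__main__":
    # A tiny two-pixel "image" used only for illustration.
    image = [(10, 20, 30), (30, 40, 50)]
    print(mean_color(image))
```

A fixation screen filled with this color has roughly the same overall brightness as the stimulus that follows it, which is the balancing effect the procedure aims for.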

2.4.4. Variants of the experimental procedure

All experimental procedures consisted of the following stages:

1.Instructions

2.Head initialization

3.SMI calibration

4.SMI validation

5.GazeFlow calibration

6.GazeFlow validation

7.Stimuli presentation

8.GazeFlow validation


The table below shows all the variants of experimental procedures used in the research.

Table 2. Conditions for all experimental procedures employed in the research.


3.Results

The results were analyzed according to the Accuracy and Precision measurement methods published by Tobii (2011). In line with the methodological guidelines, only the results of tests with proper calibration were used in the analysis. The best calibration level for the GazeFlow software was reached in procedure no. 4, HeatMapWWW1, and for the SMI RED 250 device (henceforth referred to as SMI) in procedure no. 8, Glass_2_HeatMap_Set4. The table with the percentage data on calibration accuracy is presented in Appendix 1. Note the lower calibration level of the SMI eye tracker, which may have been caused by difficulties in calibrating the device due to errors and delays in displaying the calibration points.

Below is the data for all four webcams used in the research. For each camera, Accuracy and Precision were calculated for GazeFlow and SMI. The tables show a detailed comparison of the results. Results of the procedures with a free head and with a fixed head are presented separately.

All results are expressed in degrees of deviation between the point on the screen and the measured eye position.
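Expressing a deviation in degrees requires converting an on-screen distance in pixels to a visual angle. The sketch below uses the screen described in section 2.3 (22” diagonal, 1680x1050) and the 70 cm viewing distance; it assumes square pixels and a line of sight perpendicular to the screen at the stimulus, and it is not taken from the analysis software used in this study. The function names are hypothetical.

```python
import math

CM_PER_INCH = 2.54


def pixel_pitch_cm(diagonal_in, res_x, res_y):
    """Physical size of one pixel in cm, assuming square pixels."""
    diagonal_px = math.hypot(res_x, res_y)
    return diagonal_in * CM_PER_INCH / diagonal_px


def pixels_to_degrees(offset_px, pitch_cm, distance_cm):
    """Visual angle (degrees) subtended by an on-screen offset.

    Assumption: the stimulus lies on the line of sight, which is
    perpendicular to the screen, so the angle is atan(offset / distance).
    """
    return math.degrees(math.atan2(offset_px * pitch_cm, distance_cm))


if __name__ == "__main__":
    pitch = pixel_pitch_cm(22, 1680, 1050)
    # How many degrees a 43-pixel deviation corresponds to at 70 cm.
    print(pixels_to_degrees(43, pitch, 70))
```

Under these assumptions a deviation of about 43 pixels on this screen corresponds to roughly one degree, which gives a sense of scale for the values reported in the tables.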


3.1.The results of the Accuracy measurements for procedures with a freely kept head

The tables show the results of the Accuracy measurements for GazeFlow and SMI in the procedures where the subjects kept their heads freely, which is the standard approach in research with a remote eye tracker.

Table 3. The comparison of average values of Accuracy on the X and Y axis for cam 1 (CAM_0) in procedures with a freely kept head in GazeFlow (WebCam) and SMI

Table 4. The comparison of average values of Accuracy on the X and Y axis for cam 2 (CAM_1) in procedures with a freely kept head in GazeFlow (WebCam) and SMI

Table 5. The comparison of average values of Accuracy on the X and Y axis for cam 3 (CAM_2) in procedures with a freely kept head in GazeFlow (WebCam) and SMI

Table 6. The comparison of average values of Accuracy on the X and Y axis for cam 4 (CAM_3) in procedures with a freely kept head in GazeFlow (WebCam) and SMI

The graph shows the summary of the average Accuracies for all cams for both solutions.

Graph 1. The comparison of the average measurements of Accuracy for conditions where the head was kept freely for GazeFlow (WebCam) vs SMI

In summary, the SMI device's Accuracy results are slightly better on the X axis (horizontal) and significantly better on the Y axis.

For GazeFlow the best Accuracy results were obtained for cam 1 (CAM_0), i.e. the one placed in the central position above the screen. In all cases better Accuracy results (a lower deviation expressed in degrees) were obtained from the cameras placed above the screen.

3.2.The results of the Accuracy measurements for procedures with a head fixed on a chinrest.

The tables gather the results for all cameras and procedures, where the subjects had their heads fixed on a chinrest.

Statyw_HeatMapPodstawkaSet3 0.690852889 0.95462437 1.020735333 0.826349074

Table 7. The comparison of average values of Accuracy on the X and Y axis for cam 1 (CAM_0) in procedures with a fixed head in GazeFlow (WebCam) and SMI

Statyw_HeatMapPodstawkaSet3 0.830127963 1.019171148 1.020735333 0.826349074

Table 8. The comparison of average values of Accuracy on the X and Y axis for cam 2 (CAM_1) in procedures with a fixed head in GazeFlow (WebCam) and SMI

Statyw_HeatMapPodstawkaSet3 0.662999654 1.007571538 1.022203615 0.817623269

Table 9. The comparison of average values of Accuracy on the X and Y axis for cam 3 (CAM_2) in procedures with a fixed head in GazeFlow (WebCam) and SMI

Statyw_HeatMapPodstawkaSet3 0.840606077 1.003641923 0.947704231 0.750927231

Table 10. The comparison of average values of Accuracy on the X and Y axis for cam 4 (CAM_3) in procedures with a fixed head in GazeFlow (WebCam) and SMI


Graph 2. The comparison of the average measurements of Accuracy for all conditions, where the head was fixed for GazeFlow (WebCam) vs SMI

The results in procedures with a fixed head are lower in comparison with procedures with a freely kept head, which indicates a higher accuracy of the solutions.

In procedures with a fixed head the Accuracy results on the X axis for GazeFlow are comparable to those obtained for the SMI device. This is surprising, because under the fixed head condition, which is close to ideal, one would expect a device that uses infrared light to track the eyes to perform better.

This may indicate that GazeFlow's accuracy is high and comparable to SMI under near-ideal conditions. Alternatively, it may be an artifact of a non-optimal head position during calibration and testing for subjects whose heads rested on the chinrest, combined with a significant increase in the angle on the X axis between the eye position and stimuli presented at the periphery.

The Accuracy results, i.e. the measurements of the eye position, are higher than the value indicated as desirable under ideal conditions (< 0.8˚) in procedures with a free head, especially for GazeFlow. In procedures with a fixed head, however, the Accuracy measurement meets the desired value in both cases.


3.3.The results of the Precision measurements for procedures with a freely kept head.

The tables below contain the results of the Precision measurements for the GazeFlow software and SMI, in cases where the subjects’ heads were freely kept. The results are presented separately for each webcam.

Table 11. The comparison of average values of Precision on the X and Y axis for cam 1 (CAM_0) in procedures with a free head in GazeFlow (WebCam) and SMI


Table 12. The comparison of average values of Precision on the X and Y axis for cam 2 (CAM_1) in procedures with a free head in GazeFlow (WebCam) and SMI

Table 13. The comparison of average values of Precision on the X and Y axis for cam 3 (CAM_2) in procedures with a free head in GazeFlow (WebCam) and SMI


Table 14. The comparison of average values of Precision on the X and Y axis for cam 4 (CAM_3) in procedures with a free head in GazeFlow (WebCam) and SMI

The graph shows the summary of the Precision measurement results of all cameras for both solutions.

Graph 3. The comparison of the average measurements of Precision for all conditions where the head was free for GazeFlow (WebCam) vs SMI


For the Precision measurements, i.e. the repeatability of measurements, comparable results were obtained for both solutions. In the precision measurement on the X axis, excluding cam 4 (CAM_3), GazeFlow obtains even slightly higher results than SMI. Note that both solutions achieve results within the range indicated as the proper level of precision, even with a freely kept head. With respect to repeatability of measurements, both solutions meet the criterion of reliability.

3.4.The results of the Precision measurements for procedures with a head fixed on a chinrest.

As before, the tables present the Precision measurement results for both solutions under conditions where the subject's head was placed on a chinrest.

Statyw_HeatMapPodstawkaSet3 0.199626015 0.241200333 0.290280556 0.291698778

Table 15. The comparison of average values of Precision on the X and Y axis for cam 1 (CAM_0) in procedures with a fixed head in GazeFlow (WebCam) and SMI

Statyw_HeatMapPodstawkaSet3 0.190582326 0.229813593 0.290280556 0.291698778

Table 16. The comparison of average values of Precision on the X and Y axis for cam 2 (CAM_1) in procedures with a fixed head in GazeFlow (WebCam) and SMI

Statyw_HeatMapPodstawkaSet3 0.173691462 0.244160308 0.289726423 0.288761385

Table 17. The comparison of average values of Precision on the X and Y axis for cam 3 (CAM_2) in procedures with a fixed head in GazeFlow (WebCam) and SMI

Statyw_HeatMapPodstawkaSet3 0.286195923 0.298856308 0.305059077 0.285907923

Table 18. The comparison of average values of Precision on the X and Y axis for cam 4 (CAM_3) in procedures with a fixed head in GazeFlow (WebCam) and SMI


The graph shows the total comparison of the average measurements of Precision for the X and Y axis in both devices.

Graph 4. The comparison of the average measurements of Precision for all conditions where the head was fixed for GazeFlow (WebCam) vs SMI

The results for conditions close to ideal, i.e. with a subject’s head fixed, indicate a high reliability, especially for the GazeFlow software. This means that under these conditions the software returns repeatable measurements even to a greater degree than the SMI eye tracker.
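The comparison above can be reproduced from the rows of Tables 15-18 with a few lines of code. The column headers of these tables are not preserved in this text, so the sketch assumes that each row lists GazeFlow X, GazeFlow Y, SMI X and SMI Y in that order. This reading is suggested by the last two values being identical for cam 1 and cam 2, but it remains an assumption, as do all names in the code.

```python
from statistics import mean

# Fixed-head Precision rows from Tables 15-18 (degrees).
# ASSUMED column order: GazeFlow X, GazeFlow Y, SMI X, SMI Y.
FIXED_HEAD_PRECISION = {
    "CAM_0": (0.199626015, 0.241200333, 0.290280556, 0.291698778),
    "CAM_1": (0.190582326, 0.229813593, 0.290280556, 0.291698778),
    "CAM_2": (0.173691462, 0.244160308, 0.289726423, 0.288761385),
    "CAM_3": (0.286195923, 0.298856308, 0.305059077, 0.285907923),
}

COLUMNS = ("gazeflow_x", "gazeflow_y", "smi_x", "smi_y")


def column_means(rows):
    """Average each column over all cameras."""
    per_column = zip(*rows.values())
    return {name: mean(values) for name, values in zip(COLUMNS, per_column)}


if __name__ == "__main__":
    for name, value in column_means(FIXED_HEAD_PRECISION).items():
        print(f"{name}: {value:.3f}")
```

Lower values mean more repeatable measurements, so under the assumed column order the averages can be read directly against the 0.5˚ reference value for Precision.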


4.Conclusions

The results obtained in the Accuracy and Precision tests show that the GazeFlow software, which tracks eyesight on the basis of webcam images, is highly capable of making reliable and accurate measurements. The software’s Accuracy measurements (< 0.9˚-1.0˚) are, especially under conditions where the head is kept freely, higher than the level indicated in the literature as desirable; however, they are comparable to those of the device using infrared light. In procedures with a fixed head the Accuracy measurement meets the desired value.

On the other hand, the repeatability of results (Precision measurement) meets all the criteria necessary to consider the software reliable when compared with devices employing other solutions, in this case infrared eye trackers.

Note that different results were obtained for different types of cameras and for different camera arrangements. The recommended camera position, for which more accurate and precise results were obtained, is above the screen.

The GazeFlow software can be successfully used in marketing and website research, provided that the Accuracy level obtained by the eye tracker is reported.

The capabilities of the software in areas of accuracy and precision allow for its successful use as a tool for controlling computers through eyesight and application in entertainment or rehabilitation. To this extent the software is ready for commercialization.


5.References

Tobii Technology, Accuracy and precision test method for remote eye trackers. Test Specification Version: 2.1.1. February, 2011

6.The list of appendices

Table 1. GazeFlow's validation of calibration

Table 2. SMI's validation of calibration

3.Heatmaps

4.Scanpath


Appendices