Research Article

Frequent Pattern Mining of Eye-Tracking Records Partitioned into Cognitive Chunks

Noriyuki Matsuda and Haruhiko Takeuchi

Department of Social Systems & Management, University of Tsukuba, Tsukuba 305-8573, Japan

National Institute of Advanced Industrial Science & Technology (AIST), Tsukuba 305-8566, Japan

Correspondence should be addressed to Haruhiko Takeuchi; [email protected].jp

Received  July ; Accepted  October ; Published  November 

Academic Editor: Yongqing Yang

License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Assuming that scenes would be visually scanned by chunking information, we partitioned fixation sequences of web page viewers into chunks using isolate gaze point(s) as the delimiter. Fixations were coded in terms of the segments in a 5 × 5 mesh imposed on the screen. The identified chunks were mostly short, consisting of one or two fixations. These were analyzed with respect to the within- and between-chunk distances in the overall records and the patterns (i.e., subsequences) frequently shared among the records. Although the two types of distances were both dominated by zero- and one-block shifts, the primacy of the modal shifts was less prominent between chunks than within them. The lower primacy was compensated by the longer shifts. The patterns frequently extracted at three threshold levels were mostly simple, consisting of one or two chunks. The patterns revealed interesting properties as to segment differentiation and the directionality of the attentional shifts.

1. Introduction

Eyes seldom stay completely still. They continually move even when one tries to fixate one’s gaze on an object because of the tremors, drifts, and microsaccades that occur on a small scale []. Hence, researchers need to infer a fixation from consecutive gaze points clustered in space []. We may regard such a cluster of gaze points as a perceptual chunk, a familiar term in psychology after Miller [] in referring to a practically meaningful unit of information processing.

During fixation, people closely scan a limited part of the scene they are interested in. They then quickly move their eyes to the next fixation area by saccade, which momentarily disrupts vision. However, it normally goes unnoticed thanks to our vision system that produces continuous transsaccadic perception [–]. It means that successive fixations constitute a higher order chunking over and above the primary chunking of gaze points. Put metaphorically, the gaze, fixation, fixation-chunking relationship is analogous to the letter, word, phrase relationship. For the sake of brevity, a chunk of fixations will be referred to as a chunk.

In viewing natural scenes or displays, a chunk continues to grow until interrupted by one or more isolate gaze points resulting from drifting attention or by accident. These do not participate in any fixation. Whatever causes the interruption, we believe that such isolate points serve as chunk delimiters, like the pauses in speech. As a pause can be either short or long, interruptions by isolate points can vary in length. Figure 1 illustrates two levels of chunking: (a) chunking of gaze points into fixations and (b) chunking of consecutive fixations with and without interruption.

Granting our conjecture, one may still wonder what particular merits will accrue from the analysis of chunks in lieu of ordinary plain fixation sequences. The expected merits are twofold: separation of between- and within-chunk patterns and extraction of common patterns across records. Neither of these is attainable when dealing with multiple records by heat maps of fixations accumulated with no regard to sequential connections [], by network analysis of the adjacent transitions accumulated within and between records [–], or by scan paths that would be too complicated [] unless reduced to frequently shared subpaths. The key to understanding this point lies in the structure of fixation sequences as explained below.

Hindawi Publishing Corporation
Applied Computational Intelligence and Soft Computing
Volume 2014, Article ID 101642, 8 pages
http://dx.doi.org/10.1155/2014/101642

Figure 1: Two fixations in one chunk (a) and in separate chunks (b).

1.1. Structure of Fixation Sequences. Equation (1) presents two types of fixation sequences, one plain and the other partitioned, both arranged in time sequence. The former serves as a basis for heat maps, scan paths, and network analysis. The latter incorporates chunks delimited by isolate gazes or any other appropriate criterion. The essential nature of the sequences remains the same when fixations are coded in areas of interest (AOI) or grid-like segments.

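Plain and Partitioned Fixation Sequences. Consider

\begin{equation}
\begin{aligned}
\text{Plain: } & F_1 F_2 \cdots F_i F_{i+1} F_{i+2} \cdots F_n \\
\text{Partitioned: } & [F_1 F_2], [F_3], \ldots, [F_i F_{i+1} F_{i+2} \cdots], \ldots, [F_n]
\end{aligned}
\tag{1}
\end{equation}

where $F_i$ denotes the $i$th fixation.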

Although it was not explicitly stated, McCarthy et al. [] in effect extracted chunks from partitioned sequences in their work on the importance of web page objects and their locations. They grouped consecutive fixations within each AOI into a chunk called a glance to obtain plain sequences of glances coded in AOI. Their interest was to see how often areas of web pages would attract glances by varying the area locations and the types of tasks.

By focusing on the frequency of glances as an indication of importance, they disregarded the length of the chunks, that is, the number of fixations within glances. Also disregarded was the shift of glances, that is, between-chunk sequences. To us, both within- and between-chunk patterns seem to contain rich information worthy of investigation. The information can be extracted from partitioned sequences but not from plain ones. In addition, partitioned sequences will be of great value when some AOIs are nested into broader AOIs (see []), given the appropriate coding. The present study is extensible to such a hierarchical structure.

For the sake of simplicity, we will focus on the eye movements of web page viewers, and we will assume that the pages are divided into grid-like AOIs, that the fixations are coded in terms of the areas in which they fall, and that chunks are delimited by isolate gaze points.

1.2. Shifts of Interest within and between Chunks. The distance between two successive fixations indicates how far the interest shifted or did not shift in a looped transition that represents sustained interest in a given area. In our view, a chunk of fixations reflects continuous interest, and a new one begins after a momentary drift of the gaze. It seems natural to expect the distance distribution of the within-chunk shifts to differ, to some extent, from that of the between-chunk shifts.

Table 1: Pattern extraction by prefix “a” at ms = 3.

Record          Initial sequences                 Patterns prefixed by a
1               [a][abc][ac][d][cf]               [abc][ac][d][cf]
2               [ad][c][bc][ae]                   [_d][c][bc][ae]
3               [ef][ab][df][c][b]                [_b][df][c][b]
4               [e][g][af][c][b][c]               [_f][c][b][c]
Frequent code   a/4, b/4, c/4, d/3, e/3, f/3      b/4, c/4

Note. The underscore means that the prefix was present in the chunk; for example, _b implies ab.

The distance analysis explained above exploits information from the cumulative records across all viewers. Hence, it is possible that the results are influenced by some dominant patterns in particular records. If one is interested in sequential regularities often shared among records, frequent sequential pattern mining is useful, as explained below.

1.3. Frequent Sequential Pattern Mining. Among others, we will employ PrefixSpan, developed by Pei et al. [, ], because of its conceptual compatibility with the partitioned sequences of eye-tracking data. Their approach is briefly explained below using their example, seen in Table 1. (See the Appendix for a more formal explanation.) One can view the data as the eye-tracking records of four viewers in which fixations are alphabetically coded according to the areas of interest (AOI) they fall into: a, b, c, d, e, f, and g.

Codes a, b, c, d, e, and f are all frequent, shared by the majority, whereas code g is infrequent, appearing only once. For further scanning, any infrequent or rare code is to be removed from the records, since it will never appear in frequent patterns according to the a priori principle []. Let us set the level of being frequent at three for illustrative purposes. This level is called the minimum support threshold (abbreviated as ms).


Table 2: All of the frequent patterns extracted from Table 1 at ms = 3.

[a]*   [a][b]*   [a][c]*   [a][c][b]   [a][c][c]   [b]*   [b][c]
[c]*   [c][b]    [c][c]    [d]         [d][c]      [e]    [f]

Note. Underscored patterns, marked here with an asterisk, were extracted at ms = 4.

For every frequent code, one scans the reduced records, devoid of infrequent codes, for patterns prefixed by the given code. Those found for prefix “a” are listed in the second column of Table 1. These are subject to further scanning with respect to the frequent codes at this step, that is, b and c. This process recursively continues until no code is frequent or no patterns remain in the records. Note that prefixes grow in each step like “[a][b]”, “[a][c]” in the above example (see the Appendix for a more formal explanation).
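The recursive prefix growth described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the PrefixSpan implementation of Pei et al.: it grows prefixes one code at a time in the same spirit, but it recounts support by rescanning the records instead of building projected databases. The function names (`contains`, `mine`, `fmt`) are hypothetical, and the four records are taken to be those of the worked example.

```python
from typing import Dict, FrozenSet, List, Tuple

Element = FrozenSet[str]          # one chunk: a set of codes
Pattern = Tuple[Element, ...]     # an ordered list of chunks

# The four example records; each string is one chunk of single-letter codes.
DB: List[List[Element]] = [
    [frozenset(chunk) for chunk in record]
    for record in (
        ["a", "abc", "ac", "d", "cf"],
        ["ad", "c", "bc", "ae"],
        ["ef", "ab", "df", "c", "b"],
        ["e", "g", "af", "c", "b", "c"],
    )
]


def contains(record: List[Element], pattern: Pattern) -> bool:
    """True if every pattern chunk is a subset of some record chunk, in order."""
    pos = 0
    for element in pattern:
        # Greedy earliest match is sufficient for subsequence containment.
        while pos < len(record) and not element <= record[pos]:
            pos += 1
        if pos == len(record):
            return False
        pos += 1
    return True


def mine(db: List[List[Element]], ms: int) -> Dict[Pattern, int]:
    """Return all patterns shared by at least `ms` records, with their support."""

    def support(pattern: Pattern) -> int:
        return sum(contains(record, pattern) for record in db)

    codes = sorted({c for record in db for chunk in record for c in chunk})
    # Infrequent codes can never occur in a frequent pattern, so drop them first.
    frequent = [c for c in codes if support((frozenset([c]),)) >= ms]
    found: Dict[Pattern, int] = {}

    def grow(prefix: Pattern) -> None:
        for code in frequent:
            # Extension 1: append the code as a new chunk.
            candidates = [prefix + (frozenset([code]),)]
            # Extension 2: add the code to the last chunk (alphabetical order only).
            if prefix and code > max(prefix[-1]):
                candidates.append(prefix[:-1] + (prefix[-1] | {code},))
            for candidate in candidates:
                count = support(candidate)
                if count >= ms:
                    found[candidate] = count
                    grow(candidate)

    grow(())
    return found


def fmt(pattern: Pattern) -> str:
    """Render a pattern in the bracket notation used in the text."""
    return "".join("[" + "".join(sorted(e)) + "]" for e in pattern)


patterns_ms3 = {fmt(p) for p in mine(DB, 3)}
patterns_ms4 = {fmt(p) for p in mine(DB, 4)}
```

On these records, `patterns_ms4` is a subset of `patterns_ms3`, which illustrates the inclusive relation between threshold levels. Rescanning the whole database for each candidate is wasteful on large data; projected databases avoid that cost.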

Table 2 lists 14 frequent patterns extracted at ms = 3 from the initial records, including those found at ms = 4 as an embedded part. For instance, [ ][ ], found at ms = 4, is embedded in patterns [ ][ ] and [ ][ ][ ] at ms = 3; that is,

[ ][ ] ⊆ {[ ][ ], [ ][ ][ ]}    (2)

Similarly, those found at ms = 3 are included in the patterns at ms = 2 reported by Pei et al. [, ]. Inclusive relations generally hold between different ms levels.

Ordinarily, one finds too few patterns at a high ms level and too many at a low level to make an interesting analysis. However, once one recognizes the inclusive relations, making use of multiple levels becomes a plausible solution for identifying strongly frequent patterns as opposed to mildly and weakly frequent ones. (See the Appendix for the relation networks among the patterns identified at ms = 2, ms = 3, and ms = 4.)

The present approach is expected to advance eye-tracking research along with conventional heat maps, scan paths, and network analysis recently developed by Matsuda and Takeuchi [–].

2. Method

2.1. Subjects (Ss). Twenty residents (seven males and 13 females) living near the AIST Research Institute in Japan were recruited for the experiments. They had normal or corrected vision, and their ages ranged from  to  years (average ). Ten of the Ss were university students, five were housewives, and the rest were part-time workers. Eleven Ss were heavy Internet users, while the rest were light users, as judged from their reports about the number of hours they spent browsing online in a week.

2.2. Stimuli. The front (or top) pages of ten commercial web sites were selected from various business areas: airline companies, commerce and shopping, and banking. These were classifiable into three groups according to the layout types [–]. Due to space limits, we chose four pages with the same layout, the top and the principal layers. The principal layers were divided into the main area in the middle and subareas on both sides. The layers and the areas differed in size among pages.

A1  A2  A3  A4  A5
B1  B2  B3  B4  B5
C1  C2  C3  C4  C5
D1  D2  D3  D4  D5
E1  E2  E3  E4  E5

Figure 2: Segment coding.

2.3. Apparatus and Procedure. The stimuli were presented with  ×  pixel resolution on a TFT ″ display in a Tobii  eye-tracking system at a rate of  Hz. The web pages were randomly displayed to the Ss one at a time for  sec. The Ss were asked to browse each page at their own pace. The translated instructions are “Various web pages will be shown on the computer display in turn. Please look at each page as you usually do until the screen darkens and then, click the mouse button when you are ready to proceed.” The Ss were informed that the experiment would last for approximately five minutes.

2.4. Segment Coding. A 5 × 5 mesh was superposed on the effective part of each page, after the page was stripped of white margins that had no text or graphics. A uniform mesh was employed for ease of comparison among pages that varied in design beyond the basic layout. The distance of a shift between two segments was measured by the Euclidean distance, computed as the square root of $x^2 + y^2$, where $x$ and $y$ are the number of blocks (i.e., segments) moved along the horizontal and vertical axes.
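The distance measure can be illustrated with a short Python sketch. It assumes segment codes of the form row letter plus column digit (for example, `B3`) on the 5 × 5 mesh; the helper names `to_grid` and `shift_distance` are hypothetical and not part of the original analysis.

```python
import math
from typing import Tuple

ROWS = "ABCDE"  # rows labeled from top to bottom; columns are numbered 1 to 5


def to_grid(code: str) -> Tuple[int, int]:
    """Convert a segment code such as 'B3' to zero-based (row, column) indices."""
    return ROWS.index(code[0]), int(code[1]) - 1


def shift_distance(src: str, dst: str) -> float:
    """Euclidean distance, in blocks, of a shift between two segments."""
    r1, c1 = to_grid(src)
    r2, c2 = to_grid(dst)
    return math.hypot(c2 - c1, r2 - r1)


loop = shift_distance("A1", "A1")        # same segment: distance 0
lateral = shift_distance("A1", "A2")     # one block sideways: distance 1
diagonal = shift_distance("A1", "B2")    # one block diagonally: sqrt(2)
knight = shift_distance("B1", "D2")      # two rows and one column: sqrt(5)
```

Under this measure a loop has distance 0, a lateral or vertical move to an adjacent segment has distance 1, and a diagonal move to an adjacent segment has distance √2.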

The rows (and columns) of the mesh were alphabetically (and numerically) labeled in descending order: A through E (and 1 through 5). The segments were coded by combining these labels as seen in Figure 2: A1, A2, ..., A5 for the first row; B1, ..., B5 for the second; and so on through E1, ..., E5 for the fifth row.

2.5. Fixation Sequences. The raw tracking data for each subject consisted of time-stamped gaze points measured in xy-coordinates. The gaze points were grouped into a fixation point if they stayed within a radius of  pixels for  msec. Otherwise, they remained isolate.

Each fixation was then translated into code sequences according to the segments in which the fixation fell. Finally, each fixation sequence was partitioned into chunks using the isolate gaze points as delimiters.
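The partitioning step can be sketched as follows. The input representation is an assumption made for illustration: a time-ordered list in which each fixation is given by its segment code and each isolate gaze point by `None`. The function names are hypothetical, and taking a between-chunk shift to run from the last fixation of one chunk to the first fixation of the next is likewise an assumption.

```python
from itertools import groupby
from typing import List, Optional, Tuple

Pair = Tuple[str, str]


def partition_into_chunks(events: List[Optional[str]]) -> List[List[str]]:
    """Split a fixation sequence into chunks, using isolate points (None) as delimiters."""
    chunks = []
    # groupby collapses each run of consecutive fixations (or isolates) into one group.
    for is_fixation, run in groupby(events, key=lambda e: e is not None):
        if is_fixation:
            chunks.append(list(run))
    return chunks


def transitions(chunks: List[List[str]]) -> Tuple[List[Pair], List[Pair]]:
    """Return (within-chunk shifts, between-chunk shifts) as pairs of segment codes."""
    within = [(a, b) for chunk in chunks for a, b in zip(chunk, chunk[1:])]
    between = [(prev[-1], nxt[0]) for prev, nxt in zip(chunks, chunks[1:])]
    return within, between


events = ["A1", "A1", None, "B2", None, None, "B2", "C3"]
chunks = partition_into_chunks(events)
within, between = transitions(chunks)
```

A run of several isolate points acts as a single delimiter here, so the length of the interruption does not affect the resulting chunks.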


2.6. Preprocessing the Codes for PrefixSpan. In accord with the algorithm, 25 segments were first recoded using letters a through y; then the codes in each chunk were alphabetically ordered with no duplication. In this process, we represented within-chunk loops by extra recoding. Consecutively repeated codes within a chunk were replaced by the corresponding capital letter, for example, [caaababaa] to [cAbabA]. After eliminating duplicates, we sorted the codes within each chunk, for example, [Aabc] from the original sequence. Consequently, we maintained the sequential order among chunks, but the within-chunk sequences could have been distorted. Due to this possibility, we were unable to identify between-chunk loops.
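The two recoding steps can be sketched in Python as follows, using the example given above. The function names `recode_loops` and `to_element` are hypothetical; the sketch assumes that each chunk is a string of single lowercase letters and that capital letters sort before lowercase ones, as they do in ASCII.

```python
from itertools import groupby


def recode_loops(chunk: str) -> str:
    """Replace each run of a consecutively repeated code by its capital letter."""
    recoded = []
    for code, run in groupby(chunk):
        run_length = len(list(run))
        recoded.append(code.upper() if run_length > 1 else code)
    return "".join(recoded)


def to_element(chunk: str) -> str:
    """Recode loops, drop duplicate codes, and sort the remaining codes."""
    return "".join(sorted(set(recode_loops(chunk))))


recoded = recode_loops("caaababaa")   # "cAbabA"
element = to_element("caaababaa")     # "Aabc"
```

Because the codes are deduplicated and sorted, the order of fixations inside a chunk is no longer recoverable from the element, which is the distortion noted in the text.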

Frequent patterns were extracted at three levels of minimum support (denoted as ms12, ms14, and ms16) corresponding to 60, 70, and 80% of the subjects.

3. Results

The four pages used as stimuli will be referred to as P1, P2, P3, and P4.

3.1. Examination of the Chunks. The total number of chunks did not greatly differ among pages, ranging from  (P) to  (P). The pages agreed well on the lengths and proportions of primary, secondary, and tertiary chunks that contained one, two, and three fixations, respectively. Primary chunks accounted for . (P) to .% (P) of the total chunks, and secondary chunks accounted for . (P) and .% (P). Putting the primary and secondary chunks together, the vast majority of the chunks (≥.%) were very short. The proportions of the tertiary chunks were much smaller, ranging from . (P) to .% (P). The longer chunks accounted for . (P) to .% (P).

The primary shifts of transitions within double-fixation chunks were loops (distance = 0) across pages. These accounted for . (P) to .% (P). The pages agreed also on the secondary (√1) and tertiary (√2) distances, which involved adjacent segments connected laterally (or vertically) and diagonally, respectively. The proportion of the former ranged from . (P) to .% (P). In contrast, that of the latter was much smaller (≤.%). Put together, the overwhelming majority of the double-fixation chunks (≥.%) were homogenous, that is, loops, or minimally heterogeneous (√1).

Loops and one-block shifts were also dominant among the chunks of length three or more. Loops accounted for . (P) to .% (P) of the shifts, and one-block shifts accounted for . (P) to .% (P) of them. Putting these together, the overwhelming majority (≥.%) of the shifts within longer chunks were extremely short in distance.

Similarly, extremely short shifts (≤√1) were modal among between-chunk transitions in reverse order and less prominent than within-chunk transitions. Primary one-block shifts accounted for . (P) to .% (P) of the total between-chunk shifts, and loops accounted for . (P) to .% (P). Their combined proportions ranged from . (P) to .% (P).

Table 3: Number of patterns () by length (len) by ms.

len

ms ms ms

 Loops  Loops  Loops

P

 B/  

  

P

  A/, B/, D/  B/  B/

  B/  B/ 



P

  A/  A/  A/

  A/  A/  A/

 A/

P

  A/  A/ 

 A/  

Note. The length of a pattern (len) is the number of constituent chunks. Also listed are the identified within-chunk loops with the number of patterns in which they appeared.

The low prominence of the first two modal shifts was compensated by the relatively large proportions of the longer ones. Each of the two-block shifts (√2 and √4) exceeded % levels on all pages with the exception of .% (√2) on P. Compared to the paucity of shifts of three blocks (√5) within chunks (≤.%), the corresponding distance between chunks, which ranged from . (P) to .% (P), was noteworthy. Similarly noticeable was the size of the long-distance shifts (≥√ ), which ranged from . (P) to .% (P), while such shifts were nonexistent or negligible (.% on P) within chunks.

3.2. Examination of the Frequent Patterns. The frequent patterns extracted at three different ms levels (ms12, ms14, and ms16) are inclusive within each page in the sense that (a) subpatterns of a frequent pattern are also frequent at a given level and (b) the patterns extracted at a higher level are included in those at a lower level. For the sake of simplicity, the term “frequent” will be omitted below when obvious. Prior to mining, special coding was applied to the within-chunk loops as explained in Section 2.6.

As seen in Table 3, the patterns were generally short, consisting of one or two chunks across pages at all ms levels. The longer ones (one on P and five on P), all of length three, were found only at ms. The constituent chunks were simple in composition, being a single fixation or a single loop. The loops were limited to (A1A1), (B1B1), and (D1D1), all located in the first column of the mesh. (These will be denoted as (A1..), (B1..), and (D1..).) The (D1..) loop appeared only on P at ms by itself, unaccompanied by any other chunk. (B1..) appeared alone on P at ms and on P at all ms levels. Also, it was paired with B on P both as a prefix (ms) and as a postfix (ms and ms). (A1..) appeared by itself on P (ms), P (ms, ms, and ms), and P (ms and ms) and also as a prefix to other chunk(s) on P (ms, ms, and ms) and P (ms). None of the corresponding segments were in the first column. The postfixes on P were A (ms, ms, and ms); A and B (ms and ms); and B, B, B, B, C, and D (ms) in addition to AA and BB (ms). Those on P were B, C, C, D, and D (ms).

In the six patterns of length three found on P and P at ms, the constituent codes were partially or totally homogenous. Five of them contained two repeated codes, either A or B, including those prefixed by (A1..) as reported above. The remaining one, found solely on P, contained A. In the following examination of the double-chunk patterns, loops will be treated as single codes to reduce complexity.

The double-chunk patterns are listed in Table 4 by the direction of the sequences: upward, homogenous, horizontal, and downward. Superscripts L and R denote leftward and rightward sequences. Underscored patterns were extracted at ms and above. Those found only at ms are further emphasized in italicized bold face. The total number of patterns varied from  (on P and P) to  (P).

At ms, the patterns were homogenous (BB on P; BB on P), horizontal (AA on P; CC on P), or downward (AB on P) sequences with the exception of down-rightward pattern AB on P. There was no leftward heterogeneous pattern.

The new patterns found at ms included an upward sequence (BA on P) and five downward sequences (BC on P; AB, AB, and AC on P; and AB on P) in addition to four homogenous sequences (BB on P; AA, BB on P; CC on P) and six horizontal sequences (AA and BB on P; BB on P; and AA, AA, and BB on P). Among the  heterogeneous patterns, only two (BB and BC on P) were leftward.

The patterns extracted at ms and above had no segments in rows D and E and no segments in the fifth column. None of the seven upward and downward sequences were strictly vertical, involving adjacent or nonadjacent columns in the ratio of  to . These vertical patterns mostly involved adjacent rows ( out of ).

Some of the constituent segments of the sequences at ms and above appeared solely as prefixes (A on P and P; A on P) or as postfixes (B on P; B and C on P; A, B, B, and C on P; B and C on P).

The new double-chunk patterns found at ms had (a) segments in row D and in column , (b) notable positions of the new segments, (c) increased heterogeneous patterns, (d) increased sequences between nonadjacent rows, (e) strictly vertical sequences, and (f) bilateral sequence pairs. The segments in row D appeared only as postfixes in the downward sequences (D and D on P and P; D, D, and D on P; and D on P). Similarly, the new segments found in row C were postfixes (C and C on P; C on P; C on P; and C on P) with a single exception (C on P). The new segments in row B were mostly postfixes: B, B, and B on P, B on P, and B and B on P. B and B on P were prefixes. An interesting case was B on P which was special, being a prefix to itself (BB). Dual roles were more notable than unary ones among the new segments in row A (A and A on P, A on P, and A on P).

A total of seven new upward sequences were found, three on P and two on both P and P, but still none on P. These were prefixed by B (on P and P), B (P), or C (P) and postfixed by the segments in row A: A, A, A, or A. Only CA involved nonadjacent rows. A strictly vertical sequence was present on each of P, P, and P: BA, BA, and BA. The rest were rightward (BA and BA on P) or leftward on P and P (BA on P; CA on P).

Table 4: Double-chunk patterns by direction.

Page  Direction  Pattern
P1    ↑          BA, BA, BA
P1    ==         B2B2, BB
P1    ↔          AA, BB, AA, AA, BB, BB, BB, BB
P1    ↓          AD, BC, BC, BD, BD, BD
P2    ↑          BA, BA
P2    ==         B3B3, AA, BB, BB
P2    ↔          BB, BB, BB
P2    ↓          BC, BC, BD, BD
P3    ↑          BA, BA, CA
P3    ==         AA, BB, BB, CC
P3    ↔          A1A2, AA, AA, BB, AA, BB, BB, BB, CC
P3    ↓          AB, A2B3, AB, AC, AB, AB, AB, AC, AD, AB, AB, AD, AD, AD, BC, BD, BD, CD
P4    ↑          (none)
P4    ==         CC
P4    ↔          C3C4, AA, BB, BB, CC
P4    ↓          AB, AC, AB, BC, BC, BC, CD

Note. The sequence directions are upward (↑), homogenous (==), horizontal (↔), and downward (↓). Underscored patterns were extracted at ms. Those extracted at  are also emphasized in italicized bold face. Leftward and rightward sequences are marked by superscripts L and R, respectively.

A total of five new homogenous sequences were found on P and P, one in row A (AA on P), three in row B (BB on P and BB on P and P), and one in row C (CC on P). Like those at ms and above, none of the constituents were in columns  or .

A total of  new horizontal sequences were found on P (two in row A and four in row B), P (two in B), P (one in A, three in B, and one in C), and P (one in A, two in B, and one in C). A and A appeared as a prefix or as a postfix, while A appeared only as a postfix. The same held for B, B, and B, while B and B appeared only as postfixes. C assumed dual positions in CC on P and CC on P, both of which were leftward. The ratio of leftward to rightward sequences was  : ,  : ,  : , and  :  in the order of P1, P2, P3, and P4.

A total of  new downward sequences were found, six on P, three on P,  on P, and six on P. The prefixes concentrated in rows A and B with two exceptions (CD on P and CD on P). In contrast, the postfixes concentrated in rows C and D with exceptions of five patterns on P and one on P. Half or more of the downward patterns on P, P, and P involved nonadjacent rows (A-D/ and B-D/ on P; B-D/ on P; and A-C/, A-D/, and B-D/ on P, where  denotes the number of cases), whereas only AC out of six patterns did so on P. The strictly vertical patterns were limited to columns  and  (B-D/ on P; B-C/ and B-D/ on P; A-B/, A-D/, and B-D/ on P; and B-C/ and C-D/ on P). The rest were rightward on P and P, leftward on P, or mixed on P.

Table 5: Isolate primitives by ms level.

Page  ms12                 ms14                              ms16
P1    A, C, C, E           A2, A, B, B, C, C, D              A, A2, A, B
P2    A, B, C, D           A1, A3, B, C, D                   A1, A3
P3    (none)               B, C, C, D                        A, B, C
P4    A, B5, C1, C5, D     B, B, B3, B5, C1, C, C5, D, D     A, B3, B, C5

Note. Primitives in bold face were persistent at two or three ms levels.

Among all of the patterns in Table 4, the heterogeneous sequences were mostly unilateral in that the symmetric pairs were limited in number (BB-BB on P; BB-BB on P; AA-AA, AB-BA, AC-CA, and BB-BB on P; and none on P). Four of these were horizontal sequences. The constituents were limited to a subset consisting of the first three rows and columns, that is, {A2, B1, B2, B3, and C3}.

The individual constituents of the multichunk patterns were frequent by themselves as primitive patterns at a given ms level, but not vice versa. Table 5 lists the isolate primitive patterns not participating in any multichunk pattern at a given ms level. While the number of total primitive patterns monotonically decreased from ms12 to ms16, the ratio of the isolate primitive patterns to the total primitive patterns monotonically increased on all pages almost perfectly. The ratios at ms12, ms14, and ms16 were 4/17, 7/11, and 4/5; 4/13, 5/8, and 2/4; 0/13, 4/11, and 3/6; and 5/17, 9/14, and 4/6, in the order of P1, P2, P3, and P4. The sole exception was the second and the third ratios on P2. There were no isolates on P3 at ms12.

Generally, an isolate primitive at a given ms level would become a member of sequence(s) at a lower level and would not be present at a higher level. Exceptionally, C5, located in the rightmost column, persisted on P4 as an isolate at all ms levels. Partial persistence was observed between ms14 and ms16 on P1 (A2), P2 (A1, A3), and P4 (B3) as well as between ms12 and ms14 on P4 (B5, C1). No persistence was observed on P3. The persistent ones on P1 and P2 were limited to the first three columns of the top row, {A1, A2, A3}, whereas those on P4 spread over rows B and C in columns 1, 3, and 5, that is, {B3, B5, C1, C5}.

Finally, E on P1 at ms12 was the sole frequent segment in the bottom row E where segments were generally infrequent across pages at all ms levels.

4. Discussion

Eye-tracking researchers have inferred a fixation from gaze points closely clustered in space and time, treating it as a meaningful unit of information processing, that is, a chunk, a familiar concept in psychology. Chunking of lower-level chunks into a higher one is not uncommon as seen in the relationships letter, word, phrase, sentence, paragraph, .... The present paper examined the patterns of second-order chunks, that is, chunks of fixations, using isolate gaze point(s) not participating in any fixation as the delimiter. The delimiter was assumed to play an auxiliary role in chunking, like a pause in speech.

Most of the identified chunks were short, consisting of one or two fixations. Also, the transitions within multifixation chunks and between chunks were mostly short in distance, either loops or one-block shifts to adjacent segments. These seem to be attributable to the minimum criterion of the delimiter we employed: at least one isolate gaze point. Hence, even an accidental dislocation of one’s gaze resulted in chunking. It would be ideal if we could separate cognitively meaningful chunking from accidental chunking. Until an effective method is established, the best we can do is to be cautious in interpreting the results.

Actually, setting an appropriate criterion is a difficult task due to the possible individual and situational variations. Perhaps individuated criteria will be appropriate instead of a uniform criterion. Further investigation of the distributions of gaze points participating in fixations and those that are isolated is necessary.

As reported earlier, within- and between-chunk transitions were similar in that the first two modal distances were zero (i.e., loops) and one block. However, these differed in order and in magnitude. Loops were primary among within-chunk transitions but secondary among between-chunk transitions. The opposite was true for the one-block shifts. Next, the proportions of the primary and secondary distances of the within-chunk transitions exceeded the respective proportions pertaining to the between-chunk transitions. Similarly, there were more long-distance shifts between chunks than within them.

These results seem to suggest that the attention of our subjects was most likely shifted, after a pause, to an adjacent segment one block away or within the same segment. The medium or long-distance shifts were also separated by pauses, though their proportions were smaller than the short ones. Shifts without a pause, that is, within-chunk shifts, were short, chiefly occurring in the same segment or between adjacent segments one block away.

Now we turn to a discussion of the frequent patterns (i.e., subsequences) extracted by PrefixSpan. The patterns were simple in structure, mostly consisting of single or double chunks. Furthermore, the chunks themselves contained single fixations or single loops as expected from the chunk properties discussed above. More complex structures might have resulted if we had employed less stringent criteria for the delimiter. Even so, beneath the structural simplicity, interesting properties emerged as to the segment differentiation and the directional unevenness in attentional shifts.

First, the within-chunk loops were limited to (A1..), (B1..), and (D1..), all of which were in the leftmost column. While the presence of (D1..) was quite limited, the leading roles of (A1..) and (B1..) as prefixes in the multichunk sequences are noteworthy. These roles might be attributable to menu items placed in the segments. Second, the multichunk sequences chiefly consisted of the segments in rows A, B, and C. In particular, the leading role of A on P and P was noteworthy, like the loop (A1..), though its dual role as pre- and postfix was observed on P. In contrast, A, B, and C were consistently positioned as postfixes. The same held for the segments in row D, which appeared only at the lowest ms level. The segments in row E were totally absent in multichunk sequences.

Third, the sequences at ms and ms were more likely to be horizontal, including homogenous codes, than downward and, to much less extent, than the upward sequence, which remained least likely among the additional patterns found at ms. The order between horizontal and downward sequences varied across pages at ms.

By chunking eye-tracking records into smaller units, we discovered interesting properties of the eye movement of web page viewers. However, further studies seem necessary to enhance the present approach, for example, by setting up nested AOIs to reflect the hierarchical structure of the web objects [] and by adjusting the chunk delimiters to accommodate individual and task variations. Besides these refinements, we are planning an application of mined frequent patterns to simultaneous clustering [] of subjects and the properties of their eye movement and other relevant indices.

Appendix

We briefly explain frequent sequential pattern mining by PrefixSpan (prefix-projected sequential pattern mining) developed by Pei et al. [, ]. Interested readers should consult the original articles for formal descriptions and evaluations in comparison with other competing algorithms.

Let us use Table  as the DB (database) to be scanned. It consists of four sequences whose elements are nonempty subsets of items {,,,,,,}. An element is composed of a set of items: , , , , , , and . PrefixSpan assumes that items in an element are alphabetically ordered with no duplication, for example, [], [], and [].

The goal of PrefixSpan is to find subsequences frequently shared among the records in DB. A subsequence is defined as the list of nonempty subsets of the elements of a given sequence, where the sequential order of elements is preserved. For example, [][][][] is a subsequence of [][][][][]. The threshold of frequent occurrence is called the minimum support (abbreviated as ms in this paper). Its value is to be specified by the user.
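As a minimal illustration of these definitions, the following Python sketch tests whether one sequence is a subsequence of another and counts its support in a database. The representation (each element as a frozenset of single-character items), the helper names `seq`, `is_subsequence`, and `support`, and the four-sequence database (the well-known example of Pei et al., which may or may not coincide with the table used here) are assumptions made for illustration only.

```python
from typing import FrozenSet, List

Sequence = List[FrozenSet[str]]


def seq(*elements: str) -> Sequence:
    """Build a sequence from strings, one string per element, e.g. seq("a", "abc")."""
    return [frozenset(e) for e in elements]


def is_subsequence(pattern: Sequence, sequence: Sequence) -> bool:
    """True if every pattern element is a subset of a distinct element of
    `sequence`, with the order of elements preserved."""
    pos = 0
    for p in pattern:
        # Greedily advance to the first remaining element that contains p.
        while pos < len(sequence) and not p <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern element must match a strictly later element
    return True


def support(pattern: Sequence, db: List[Sequence]) -> int:
    """Number of sequences in db that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in db)


# Assumed example database (after Pei et al.), not necessarily the paper's table.
DB = [
    seq("a", "abc", "ac", "d", "cf"),
    seq("ad", "c", "bc", "ae"),
    seq("ef", "ab", "df", "c", "b"),
    seq("e", "g", "af", "c", "b", "c"),
]
```

In this assumed database, `support(seq("a", "b"), DB)` is 4 and `support(seq("ab"), DB)` is 2, so both patterns would count as frequent with ms = 2, but only the former with ms = 3.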

Subsequences of special importance are a prefix and the associated suffix. For instance, a frequent item , with ms = 3, can serve as a prefix of the ensuing pattern (i.e., the suffix) to be scanned next. The patterns listed in the second column of Table  are the suffix sequences constituting the -projected database. Similar databases are to be constructed for every frequent item. With ms,  and  together will be considered frequent, where the underline implies . Hence,  will serve as a prefix, yielding only the two suffixes [][][][] and [][][].

Figure : Network of the frequent patterns extracted at ms (small letters in dark blue), ms (large letters in dark red), and ms (underscored). Note: [] is omitted for the single-code chunks and is replaced by () for the multiple-code chunks for the sake of simplicity. See the first column of Table  for the initial sequences.

The network of the frequent patterns extracted at ms = 2, 3, and  is illustrated in Figure  to help grasp the inclusive relations among them in two senses: (a) an element of a frequent pattern is also frequent; and (b) a frequent pattern at a given ms level is also frequent at a lower level.

More formally, a sequence β of length m is a prefix of another sequence α of length n (m ≤ n) consisting of frequent elements in the database if and only if the first m − 1 elements are identical and the last element of β is a subset of the mth element of α.

The suffix of α with regard to β is a sequence, the first element of which is the difference between the mth elements of α and β. The remaining elements of the suffix are identical with the (m + 1)th to the last element of α; that is,

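$$\mathrm{element}_{1\mid\mathrm{suffix}} = \mathrm{element}_{m\mid\alpha} - \mathrm{element}_{m\mid\beta}; \tag{A.1}$$

if $m < n$,

$$\mathrm{element}_{j\mid\mathrm{suffix}} = \mathrm{element}_{m+j-1\mid\alpha}, \quad j = 2, \ldots, n - m + 1. \tag{A.2}$$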

Scanning with respect to the prefix stops when the suffix becomes nil ( = ) or no frequent item exists in the projected database. This process is executed in a depth-first manner for every code initially identified as frequent.
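The depth-first growth described above can be sketched as follows. This is not the authors' implementation: instead of physically copying suffixes, the sketch keeps index pointers into the original sequences (the pseudo-projection variant), and it assumes single-character items so that an element can be written as a sorted string. The function name `prefixspan` and the example database (after Pei et al.) are likewise assumptions.

```python
from typing import Dict, List, Set, Tuple

Sequence = List[Set[str]]
Pattern = Tuple[str, ...]          # ("a", "bc") stands for [a][bc]
Pointers = List[Tuple[int, int]]   # (sequence id, index of last matched element)


def prefixspan(db: List[Sequence], min_support: int) -> Dict[Pattern, int]:
    """Map every frequent pattern to the number of sequences containing it."""
    found: Dict[Pattern, int] = {}

    def grow(pattern: Pattern, pointers: Pointers) -> None:
        last = set(pattern[-1]) if pattern else set()
        seq_ext: Dict[str, Pointers] = {}  # item opens a new element
        set_ext: Dict[str, Pointers] = {}  # item joins the last element
        for sid, pos in pointers:
            s = db[sid]
            seen: Set[str] = set()
            for j in range(pos + 1, len(s)):          # strictly later elements
                for item in s[j] - seen:
                    seen.add(item)
                    seq_ext.setdefault(item, []).append((sid, j))
            seen = set()
            for j in range(max(pos, 0), len(s)):      # elements holding `last`
                if last and last <= s[j]:
                    for item in s[j] - seen:
                        if item > max(last):          # keep items sorted
                            seen.add(item)
                            set_ext.setdefault(item, []).append((sid, j))
        for item in sorted(seq_ext):
            if len(seq_ext[item]) >= min_support:
                new = pattern + (item,)
                found[new] = len(seq_ext[item])
                grow(new, seq_ext[item])              # depth-first recursion
        for item in sorted(set_ext):
            if len(set_ext[item]) >= min_support:
                new = pattern[:-1] + (pattern[-1] + item,)
                found[new] = len(set_ext[item])
                grow(new, set_ext[item])

    grow((), [(sid, -1) for sid in range(len(db))])
    return found


# Assumed example database (after Pei et al.), not necessarily the paper's table.
DB = [[set(e) for e in row] for row in (
    ("a", "abc", "ac", "d", "cf"),
    ("ad", "c", "bc", "ae"),
    ("ef", "ab", "df", "c", "b"),
    ("e", "g", "af", "c", "b", "c"),
)]
patterns = prefixspan(DB, 2)
```

Each recursive call stops on its own when no extension reaches the minimum support, which mirrors the stopping rule stated above. Raising `min_support` can only shrink the result, in line with the inclusive relation between ms levels.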

It must be noted that some of the extracted patterns may be hard to identify in the original sequences, due to the intermittent removal of infrequent items from the projected database during the process, for example, the extracted pattern [][][] in Table  and the sequence [][][][][] in Table . This point should be clear to those who are familiar with masking (or wildcard) characters, such as an asterisk “∗” in string matching. One can find original patterns by attaching a masking character to the extracted patterns.
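The masking idea can be illustrated with regular expressions, where `.*` plays the role of the asterisk between elements. The sketch below assumes single-character items and a bracketed text encoding of sequences such as `[a][abc][ac]`; the helper names `encode` and `masked_regex` and the example sequence are hypothetical.

```python
import re
from typing import List


def encode(sequence: List[str]) -> str:
    """Write a sequence as text, one bracketed element per chunk."""
    return "".join("[" + "".join(sorted(element)) + "]" for element in sequence)


def masked_regex(pattern: List[str]) -> re.Pattern:
    """Turn an extracted pattern into a regular expression with masks.

    Inside an element, any other items may surround the wanted ones;
    between elements, any number of whole elements may intervene.
    """
    any_items = r"[^\]]*"  # mask that stays within one bracketed element
    parts = []
    for element in pattern:
        items = any_items.join(re.escape(item) for item in sorted(element))
        parts.append(r"\[" + any_items + items + any_items + r"\]")
    return re.compile(".*".join(parts))


original = encode(["a", "abc", "ac", "d", "cf"])
match = masked_regex(["a", "bc", "a"]).search(original)
```

With these inputs `match` is not `None`: the pattern [a][bc][a] is located in the longer sequence once the intervening items are masked. Multi-character codes would need an explicit separator between items, which this sketch does not provide.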

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[] S. Martinez-Conde, S. L. Macknik, and D. H. Hubel, “The role of fixational eye movements in visual perception,” Nature Reviews Neuroscience, vol. , no. , pp. –, .

[] D. D. Salvucci and J. H. Goldberg, “Identifying fixations and saccades in eye-tracking protocols,” in Proceedings of the Eye Tracking Research and Applications Symposium, pp. –, November .

[] G. A. Miller, “The magical number seven, plus or minus two: some limits on our capacity for processing information,” Psychological Review, vol. , no. , pp. –, .

[] D. Melcher, “Dynamic, object-based remapping of visual features in trans-saccadic perception,” Journal of Vision, vol. , no. , article , .

[] D. Melcher, “Selective attention and the active remapping of object features in trans-saccadic perception,” Vision Research, vol. , no. , pp. –, .

[] J. Ross, M. C. Morrone, M. E. Goldberg, and D. C. Burr, “Changes in visual perception at the time of saccades,” Trends in Neurosciences, vol. , no. , pp. –, .

[] E. Cutrell and Z. Guan, “What are you looking for?: an eye-tracking study of information usage in Web search,” in Proceedings of the 25th SIGCHI Conference on Human Factors in Computing Systems (CHI ’07), pp. –, May .

[] N. Matsuda and H. Takeuchi, “Networks emerging from shifts of interest in eye-tracking records,” eMinds, vol. , no. , pp. –, .

[] N. Matsuda and H. Takeuchi, “Joint analysis of static and dynamic importance in the eye-tracking records of web page readers,” Journal of Eye Movement Research, vol. , no. , article ,  pages, .

[] N. Matsuda and H. Takeuchi, “Do heavy and light users differ in the Web-page viewing patterns? Analysis of their eye-tracking records by heat maps and networks of transitions,” International Journal of Computer Information Systems and Industrial Management Applications, vol. , pp. –, .

[] J. H. Goldberg and X. P. Kotval, “Computer interface evaluation using eye movements: methods and constructs,” International Journal of Industrial Ergonomics, vol. , no. , pp. –, .

[] J. D. McCarthy, M. A. Sasse, and J. Riegelsberger, “The geometry of web search,” in People and Computers XVIII—Design for Life, pp. –, Springer, London, UK, .

[] J. Pei, J. Han, B. Mortazavi-Asl et al., “PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth,” in Proceedings of the 17th International Conference on Data Engineering, pp. –, April .

[] J. Pei, J. Han, B. Mortazavi-Asl et al., “Mining sequential patterns by pattern-growth: The PrefixSpan approach,” IEEE Transactions on Knowledge and Data Engineering, vol. , no. , pp. –, .

[] R. Agrawal and R. Srikant, “Fast algorithms for mining association rules,” in Proceedings of the International Conference on Very Large Data Bases (VLDB '94), pp. –, .

[] D. I. Brooks, I. P. Rasmussen, and A. Hollingworth, “The nesting of search contexts within natural scenes: evidence from contextual cuing,” Journal of Experimental Psychology: Human Perception and Performance, vol. , no. , pp. –, .

[] A. Prelić, S. Bleuler, P. Zimmermann et al., “A systematic comparison and evaluation of biclustering methods for gene expression data,” Bioinformatics, vol. , no. , pp. –, .
