Research Article
Frequent Pattern Mining of Eye-Tracking Records
Partitioned into Cognitive Chunks
Noriyuki Matsuda (1) and Haruhiko Takeuchi (2)
(1) Department of Social Systems & Management, University of Tsukuba, Tsukuba 305-8573, Japan
(2) National Institute of Advanced Industrial Science & Technology (AIST), Tsukuba 305-8566, Japan
Correspondence should be addressed to Haruhiko Takeuchi; [email protected].jp
Received  July ; Accepted  October ; Published  November 
Academic Editor: Yongqing Yang
Copyright ©  N. Matsuda and H. Takeuchi. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Assuming that scenes would be visually scanned by chunking information, we partitioned fixation sequences of web page viewers into chunks using isolate gaze point(s) as the delimiter. Fixations were coded in terms of the segments in a 5×5 mesh imposed on the screen. The identified chunks were mostly short, consisting of one or two fixations. These were analyzed with respect to the within- and between-chunk distances in the overall records and the patterns (i.e., subsequences) frequently shared among the records. Although the two types of distances were both dominated by zero- and one-block shifts, the primacy of the modal shifts was less prominent between chunks than within them. The lower primacy was compensated by the longer shifts. The patterns frequently extracted at three threshold levels were mostly simple, consisting of one or two chunks. The patterns revealed interesting properties as to segment differentiation and the directionality of the attentional shifts.
1. Introduction
Eyes seldom stay completely still. They continually move even when one tries to fixate one's gaze on an object because of the tremors, drifts, and microsaccades that occur on a small scale []. Hence, researchers need to infer a fixation from consecutive gaze points clustered in space []. We may regard such a cluster of gaze points as a perceptual chunk, a familiar term in psychology after Miller [] in referring to a practically meaningful unit of information processing.
During fixation, people closely scan a limited part of the scene they are interested in. They then quickly move their eyes to the next fixation area by saccade, which momentarily disrupts vision. However, the disruption normally goes unnoticed thanks to our vision system, which produces continuous transsaccadic perception []. It means that successive fixations constitute a higher order chunking over and above the primary chunking of gaze points. Put metaphorically, the gaze, fixation, fixation-chunking relationship is analogous to the letter, word, phrase relationship. For the sake of brevity, a chunk of fixations will be referred to as a chunk.
In viewing natural scenes or displays, a chunk continues to grow until interrupted by one or more isolate gaze points resulting from drifting attention or by accident. These do not participate in any fixation. Whatever causes the interruption, we believe that such isolate points serve as chunk delimiters, like the pauses in speech. As a pause can be either short or long, interruptions by isolate points can vary in length. Figure 1 illustrates two levels of chunking: (a) chunking of gaze points into fixations and (b) chunking of consecutive fixations with and without interruption.
Granting our conjecture, one may still wonder what particular merits will accrue from the analysis of chunks in lieu of ordinary plain fixation sequences. The expected merits are twofold: separation of between- and within-chunk patterns and extraction of common patterns across records. Neither of these is attainable when dealing with multiple records by heat maps of fixations accumulated with no regard to sequential connections [], by network analysis of the adjacent transitions accumulated within and between records [], or by scan paths that would be too complicated [] unless reduced
to frequently shared subpaths. The key to understanding this point lies in the structure of fixation sequences as explained below.
Hindawi Publishing Corporation
Applied Computational Intelligence and Soft Computing
Volume 2014, Article ID 101642, 8 pages
http://dx.doi.org/10.1155/2014/101642
Figure 1: Two fixations in one chunk (a) and in separate chunks (b).
1.1. Structure of Fixation Sequences. Equation (1) presents two types of fixation sequences, one plain and the other partitioned, both arranged in time sequence. The former serves as a basis for heat maps, scan paths, and network analysis. The latter incorporates chunks delimited by isolate gazes or any other appropriate criterion. The essential nature of the sequences remains the same when fixations are coded in areas of interest (AOI) or grid-like segments.
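Plain and Partitioned Fixation Sequences. Consider
\[
\begin{aligned}
\text{Plain:}\quad & F_1\, F_2\, F_3 \cdots F_i\, F_{i+1}\, F_{i+2} \cdots \\
\text{Partitioned:}\quad & \langle F_1\, F_2 \rangle,\ \langle F_3 \rangle,\ \ldots,\ \langle F_i\, F_{i+1}\, F_{i+2} \cdots \rangle,\ \ldots
\end{aligned}
\tag{1}
\]
where $F_i$ denotes the $i$th fixation.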
Although it was not explicitly stated, McCarthy et al. [] in effect extracted chunks from partitioned sequences in their work on the importance of web page objects and their locations. They grouped consecutive fixations within each AOI into a chunk called a glance to obtain plain sequences of glances coded in AOI. Their interest was to see how often areas of web pages would attract glances by varying the area locations and the types of tasks.
By focusing on the frequency of glances as an indication of importance, they disregarded the length of the chunks, that is, the number of fixations within glances. Also disregarded was the shift of glances, that is, between-chunk sequences. To us, both within- and between-chunk patterns seem to contain rich information worthy of investigation. The information can be extracted from partitioned sequences but not from plain ones. In addition, partitioned sequences will be of great value when some AOIs are nested into broader AOIs (see []), given the appropriate coding. The present study is extensible to such a hierarchical structure.
For the sake of simplicity, we will focus on the eye movements of web page viewers, and we will assume that the pages are divided into grid-like AOIs, that the fixations are coded in terms of the areas in which they fall, and that chunks are delimited by isolate gaze points.
Table 1: Pattern extraction by prefix ⟨a⟩ at ms3.
Record | Initial sequences | Patterns prefixed by ⟨a⟩
1 | [a][abc][ac][d][cf] | [abc][ac][d][cf]
2 | [ad][c][bc][ae] | [_d][c][bc][ae]
3 | [ef][ab][df][c][b] | [_b][df][c][b]
4 | [e][g][af][c][b][c] | [_f][c][b][c]
Frequent codes | a/4, b/4, c/4, d/3, e/3, f/3 | b/4, c/4
Note. The underscore means that the prefix was present in the chunk; for example, _b implies ab.
1.2. Shifts of Interest within and between Chunks. The distance between two successive fixations indicates how far the interest shifted; a zero distance, that is, a looped transition, represents sustained interest in a given area. In our view, a chunk of fixations reflects continuous interest, and a new one begins after a momentary drift of the gaze. It seems natural to expect the distance distribution of the within-chunk shifts to differ, to some extent, from that of the between-chunk shifts.
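The comparison of within- and between-chunk shifts can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: fixations are assumed to be coded as grid segments such as "A1" (row letter plus column digit), a record is assumed to be a list of non-empty chunks, a between-chunk shift is assumed to run from the last fixation of one chunk to the first fixation of the next, and the names `squared_distance` and `shift_distributions` are hypothetical. Shifts are tallied by squared distance, so that 0 is a loop, 1 a lateral or vertical one-block shift, and 2 a diagonal shift.

```python
from collections import Counter

def squared_distance(src, dst):
    """Squared Euclidean distance, in blocks, between two segment codes."""
    d_row = ord(dst[0]) - ord(src[0])
    d_col = int(dst[1]) - int(src[1])
    return d_row ** 2 + d_col ** 2

def shift_distributions(chunks):
    """Tally squared shift distances within chunks and between chunks."""
    within, between = Counter(), Counter()
    for i, chunk in enumerate(chunks):
        # Shifts between successive fixations of the same chunk.
        for a, b in zip(chunk, chunk[1:]):
            within[squared_distance(a, b)] += 1
        # Shift from the end of this chunk to the start of the next one
        # (an assumed definition of a between-chunk shift).
        if i + 1 < len(chunks):
            between[squared_distance(chunk[-1], chunks[i + 1][0])] += 1
    return within, between

# Hypothetical record with three chunks.
record = [["A1", "A1", "A2"], ["B3"], ["B3", "C3"]]
within, between = shift_distributions(record)
print(dict(within))
print(dict(between))
```

Keying the tallies by squared distance avoids floating-point keys while still distinguishing loops, one-block shifts, and diagonal shifts.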
The distance analysis explained above exploits information from the cumulative records across all viewers. Hence, it is possible that the results are influenced by some dominant patterns in particular records. If one is interested in sequential regularities often shared among records, frequent sequential pattern mining is useful, as explained below.
1.3. Frequent Sequential Pattern Mining. Among others, we will employ PrefixSpan, developed by Pei et al. [, ], because of its conceptual compatibility with the partitioned sequences of eye-tracking data. Their approach is briefly explained below using their example, seen in Table 1. (See the Appendix for a more formal explanation.) One can view the data as the eye-tracking records of four viewers in which fixations are alphabetically coded according to the areas of interest (AOI) they fall into: a, b, c, d, e, f, and g.
Codes a, b, c, d, e, and f are all frequent, shared by the majority, whereas code g is infrequent, appearing only once. For further scanning, any infrequent or rare code is to be removed from the records, since it will never appear in frequent patterns according to the a priori principle []. Let us set the level of being frequent at three for illustrative purposes. This level is called the minimum support threshold (abbreviated as ms).
Table 2: All of the frequent patterns extracted from Table 1 at ms.
Note. Underscored patterns were extracted at ms.
For every frequent code, one scans the reduced records, devoid of infrequent codes, for patterns prefixed by the given code. Those found for prefix ⟨a⟩ are listed in the second column of Table 1. These are subject to further scanning with respect to the frequent codes at this step, that is, b and c. This process recursively continues until no code is frequent or no patterns remain in the records. Note that prefixes grow in each step like [a][b] and [a][c] in the above example (see the Appendix for a more formal explanation).
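The notion of support that drives this scanning can be illustrated with a small sketch. It is not PrefixSpan itself but a brute-force support count, and it rests on assumptions: the four records are the running example of Pei et al. as recalled here (to be checked against the original articles), each chunk is written as a string of single-letter codes, and the names `contains` and `support` are hypothetical.

```python
def contains(sequence, pattern):
    """True if pattern is a subsequence of sequence (chunks compared as sets)."""
    pos = 0
    for chunk in pattern:
        # Greedily take the earliest remaining chunk that includes all codes.
        while pos < len(sequence) and not set(chunk) <= set(sequence[pos]):
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True

def support(db, pattern):
    """Number of records in db that contain pattern."""
    return sum(contains(seq, pattern) for seq in db)

# Assumed example database: four records, each a list of chunks.
DB = [
    ["a", "abc", "ac", "d", "cf"],
    ["ad", "c", "bc", "ae"],
    ["ef", "ab", "df", "c", "b"],
    ["e", "g", "af", "c", "b", "c"],
]

codes = sorted({code for seq in DB for chunk in seq for code in chunk})
frequent = [c for c in codes if support(DB, [c]) >= 3]
print(frequent)                      # codes that reach ms = 3
print(support(DB, ["a", "b"]))       # [a][b]: b in a later chunk than a
print(support(DB, ["ab"]))           # [ab]: a and b in the same chunk
```

Under these assumptions, the two-chunk pattern [a][b] is shared by more records than the single-chunk pattern [ab], which is why the two must be kept distinct when prefixes grow.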
Table 2 lists  frequent patterns extracted at ms from the initial record, including those found at ms as an embedded part. For instance, [][], found at ms, is embedded in patterns [][] and [][][], at ms; that is,
[][] ⊂ {[][], [][][]}. (2)
Similarly, those found at ms are included in the patterns at ms reported by Pei et al. [, ]. Inclusive relations generally hold between different ms levels.
Ordinarily, one finds too few patterns at a high ms level and too many at a low level to make an interesting analysis. However, once one recognizes the inclusive relations, making use of multiple levels becomes a plausible solution for identifying strongly frequent patterns as opposed to mildly and weakly frequent ones. (See the Appendix for the relation networks among the patterns identified at ms2, ms3, and ms4.)
The present approach is expected to advance eye-tracking research along with conventional heat maps, scan paths, and network analysis recently developed by Matsuda and Takeuchi [].
2. Method
2.1. Subjects (Ss). Twenty residents (seven males and 13 females) living near the AIST Research Institute in Japan were recruited for the experiments. They had normal or corrected vision, and their ages ranged from  to  years (average ). Ten of the Ss were university students, five were housewives, and the rest were part-time workers. Eleven Ss were heavy Internet users, while the rest were light users, as judged from their reports about the number of hours they spent browsing online in a week.
2.2. Stimuli. The front (or top) pages of ten commercial web sites were selected from various business areas: airline companies, commerce and shopping, and banking. These were classifiable into three groups according to the layout types []. Due to space limits, we chose four pages with the same layout, the top and the principal layers. The principal layers were divided into the main area in the middle and subareas on both sides. The layers and the areas differed in size among pages.
A1 A2 A3 A4 A5
B1 B2 B3 B4 B5
C1 C2 C3 C4 C5
D1 D2 D3 D4 D5
E1 E2 E3 E4 E5
Figure 2: Segment coding.
2.3. Apparatus and Procedure. The stimuli were presented with  ×  pixel resolution on a TFT  ″ display in a Tobii  eye-tracking system at a rate of  Hz. The web pages were randomly displayed to the Ss one at a time for  sec. The Ss were asked to browse each page at their own pace. The translated instructions are “Various web pages will be shown on the computer display in turn. Please look at each page as you usually do until the screen darkens and then, click the mouse button when you are ready to proceed.” The Ss were informed that the experiment would last for approximately five minutes.
2.4. Segment Coding. A 5×5 mesh was superposed on the effective part of each page, after the page was stripped of white margins that had no text or graphics. A uniform mesh was employed for ease of comparison among pages that varied in design beyond the basic layout. The distance of a shift between two segments was measured by the Euclidean distance, computed as $\sqrt{x^2 + y^2}$, where $x$ and $y$ are the number of blocks (i.e., segments) moved along the horizontal and vertical axes.
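As a worked illustration of this distance, the following sketch converts two segment codes into grid positions and applies the Euclidean formula. The function names `segment_to_rc` and `shift_distance` are hypothetical, and the code assumes the letter-plus-digit labels (A1 through E5) of the mesh.

```python
import math

def segment_to_rc(code):
    """Convert a segment label such as 'B3' into zero-based (row, col)."""
    row = ord(code[0]) - ord("A")
    col = int(code[1]) - 1
    return row, col

def shift_distance(src, dst):
    """Euclidean distance, in blocks, of a shift from src to dst."""
    r1, c1 = segment_to_rc(src)
    r2, c2 = segment_to_rc(dst)
    return math.hypot(c2 - c1, r2 - r1)

print(shift_distance("A1", "A1"))  # loop within the same segment
print(shift_distance("C2", "C3"))  # lateral shift to an adjacent segment
print(shift_distance("A1", "B2"))  # diagonal shift to an adjacent segment
```

A loop therefore has distance 0, a lateral or vertical move to an adjacent segment has distance 1, and a diagonal move to an adjacent segment has distance $\sqrt{2}$.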
The rows (and columns) of the mesh were alphabetically (and numerically) labeled in descending order: A through E (and 1 through 5). The segments were coded by combining these labels as seen in Figure 2: A1, A2, ..., A5 for the first row; B1, ..., B5 for the second; and so on through E1, ..., E5 for the fifth row.
2.5. Fixation Sequences. The raw tracking data for each subject consisted of time-stamped gaze points measured in xy-coordinates. The gaze points were grouped into a fixation point if they stayed within a radius of  pixels for  msec. Otherwise, they remained isolate.
Each fixation was then translated into code sequences according to the segments in which the fixation fell. Finally, each fixation sequence was partitioned into chunks using the isolate gaze points as delimiters.
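The partitioning step can be sketched as follows. The sketch assumes that fixation detection has already been done, so that a record is a time-ordered list in which each fixation appears as its segment code and each isolate gaze point appears as `None`; this input format and the name `partition_into_chunks` are illustrative assumptions rather than the format used in the study.

```python
def partition_into_chunks(events):
    """Split a time-ordered event list into chunks of fixation codes.

    Each event is a segment code (e.g. "B2") for a fixation, or None for an
    isolate gaze point. A run of one or more isolates acts as one delimiter.
    """
    chunks, current = [], []
    for code in events:
        if code is None:
            # An isolate gaze point closes the chunk in progress, if any.
            if current:
                chunks.append(current)
                current = []
        else:
            current.append(code)
    if current:
        chunks.append(current)
    return chunks

# Hypothetical record: fixations interrupted by isolate gaze points.
events = ["A1", "A1", None, "A2", None, None, "B2", "B3", "B3"]
chunks = partition_into_chunks(events)
print(chunks)
```

Treating a run of consecutive isolates as a single delimiter mirrors the idea that an interruption, like a pause in speech, may be short or long.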
2.6. Preprocessing the Codes for PrefixSpan. In accord with the algorithm, the 25 segments were first recoded using letters a through y; then the codes in each chunk were alphabetically ordered with no duplication. In this process, we represented within-chunk loops by extra recoding. Consecutively repeated codes within a chunk were replaced by the corresponding capital letter, for example, [caaababaa] to [cAbabA]. After eliminating duplicates, we sorted the codes within each chunk, for example, [Aabc] from the original sequence. Consequently, we maintained the sequential order among chunks, but the within-chunk sequences could have been distorted. Due to this possibility, we were unable to identify between-chunk loops.
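The two preprocessing steps can be reproduced on the example above with a short sketch. The function names `recode_loops` and `to_itemset` are hypothetical, and a chunk is assumed to be given as a string of lowercase single-letter codes.

```python
from itertools import groupby

def recode_loops(chunk):
    """Replace each run of a consecutively repeated code by its capital letter."""
    out = []
    for code, run in groupby(chunk):
        out.append(code.upper() if len(list(run)) > 1 else code)
    return "".join(out)

def to_itemset(chunk):
    """Recode loops, drop duplicates, and sort the codes within a chunk."""
    return "".join(sorted(set(recode_loops(chunk))))

print(recode_loops("caaababaa"))  # cAbabA
print(to_itemset("caaababaa"))    # Aabc
```

Because capital letters sort before lowercase ones in the default string ordering, the loop code A precedes the plain codes a, b, and c, as in the example [Aabc].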
Frequent patterns were extracted at three levels of minimum support (denoted as ms12, ms14, and ms16) corresponding to 60, 70, and 80% of the subjects.
3. Results
The four pages used as stimuli will be referred to as P1, P2, P3, and P4.
3.1. Examination of the Chunks. The total number of chunks did not greatly differ among pages, ranging from  (P) to  (P). The pages agreed well on the lengths and proportions of primary, secondary, and tertiary chunks that contained one, two, and three fixations, respectively. Primary chunks accounted for . (P) to .% (P) of the total chunks, and secondary chunks accounted for . (P) and .% (P). Putting the primary and secondary chunks together, the vast majority of the chunks (.%) were very short. The proportions of the tertiary chunks were much smaller, ranging from . (P) to .% (P). The longer chunks accounted for . (P) to .% (P).
The primary shifts of transitions within double-fixation chunks were loops (distance = 0) across pages. These accounted for . (P) to .% (P). The pages agreed also on the secondary ($\sqrt{1}$) and tertiary ($\sqrt{2}$) distances, which involved adjacent segments connected laterally (or vertically) and diagonally, respectively. The proportion of the former ranged from . (P) to .% (P). In contrast, that of the latter was much smaller (.%). Put together, the overwhelming majority of the double-fixation chunks (.%) were homogenous, that is, loops, or minimally heterogeneous ($\sqrt{1}$).
Loops and one-block shifts were also dominant among the chunks of length three or more. Loops accounted for . (P) to .% (P) of the shifts, and one-block shifts accounted for . (P) to .% (P) of them. Putting these together, the overwhelming majority (.%) of the shifts within longer chunks were extremely short in distance.
Similarly, extremely short shifts ( ) were modal among between-chunk transitions, although in reverse order and with less prominence than among within-chunk transitions. Primary one-block shifts accounted for . (P) to .% (P) of the total between-chunk shifts, and loops accounted for . (P) to .% (P). Their combined proportions ranged from . (P) to .% (P).
Table 3: Number of patterns (n) by length (len) by ms.
Note. The length of a pattern (len) is the number of constituent chunks. Also listed are the identified within-chunk loops with the number of patterns in which they appeared.
The low prominence of the first two modal shifts was compensated by the relatively large proportions of the longer ones. Each of the two-block shifts ($\sqrt{2}$ and $\sqrt{4}$) exceeded % levels on all pages with the exception of .% ($\sqrt{2}$) on P. Compared to the paucity of shifts of three blocks ($\sqrt{5}$) within chunks (.%), the corresponding distance between chunks, which ranged from . (P) to .% (P), was noteworthy. Similarly noticeable was the size of the long-distance shifts ( ), which ranged from . (P) to .% (P), while such shifts were nonexistent or negligible (.% on P) within chunks.
3.2. Examination of the Frequent Patterns. The frequent patterns extracted at three different ms levels (ms12, ms14, and ms16) are inclusive within each page in the sense that (a) subpatterns of a frequent pattern are also frequent at a given level and (b) the patterns extracted at a higher level are included in those at a lower level. For the sake of simplicity, the term frequent will be omitted below when obvious. Prior to mining, special coding was applied to the within-chunk loops as explained in Section 2.6.
As seen in Table 3, the patterns were generally short, consisting of one or two chunks across pages at all ms levels. The longer ones (one on P and five on P), all of length three, were found only at ms. The constituent chunks were simple in composition, being a single fixation or a single loop. The loops were limited to (AA), (BB), and (DD), all located in the first column of the mesh. (These will be denoted as (A..), (B..), and (D..).) The (D..) loop appeared only on P at ms by itself, unaccompanied by any other chunk. (B..) appeared alone on P at ms and on P at all ms levels. Also, it was paired with B on P both as a prefix (ms) and as a postfix (ms and ms). (A..) appeared by itself on P (ms), P (ms, ms, and ms), and P (ms and ms) and also as a prefix to other chunk(s) on P (ms, ms, and ms) and P (ms). None of the corresponding segments were in the first column. The postfixes on P were A (ms, ms, and ms); A and B (ms and ms); and B, B, B, B, C, and D (ms) in addition to AA and BB (ms). Those on P were B, C, C, D, and D (ms).
In the six patterns of length three found on P and P at ms, the constituent codes were partially or totally homogenous. Five of them contained two repeated codes, either A or B, including those prefixed by (A..) as reported above. The remaining one, found solely on P, contained A. In the following examination of the double-chunk patterns, loops will be treated as single codes to reduce complexity.
The double-chunk patterns are listed in Table 4 by the direction of the sequences: upward, homogenous, horizontal, and downward. Superscripts L and R denote leftward and rightward sequences. Underscored patterns were extracted at ms and above. Those found only at ms are further emphasized in italicized bold face. The total number of patterns varied from  (on P and P) to  (P).
At ms, the patterns were homogenous (BB on P; BB on P), horizontal (AA on P; CC on P), or downward (AB on P) sequences with the exception of down-rightward pattern AB on P. There was no leftward heterogeneous pattern.
The new patterns found at ms included an upward sequence (BA on P) and five downward sequences (BC on P; AB, AB, and AC on P; and AB on P) in addition to four homogenous sequences (BB on P; AA, BB on P; CC on P) and six horizontal sequences (AA and BB on P; BB on P; and AA, AA, and BB on P). Among the heterogeneous patterns, only two (BB and BC on P) were leftward.
The patterns extracted at ms and above had no segments in rows D and E and no segments in the fifth column. None of the seven upward and downward sequences were strictly vertical, involving adjacent or nonadjacent columns in the ratio of  to . These vertical patterns mostly involved adjacent rows ( out of ).
Some of the constituent segments of the sequences at ms and above appeared solely as prefixes (A on P and P; A on P) or as postfixes (B on P; B and C on P; A, B, B, and C on P; B and C on P).
The new double-chunk patterns found at ms had (a) segments in row D and in column , (b) notable positions of the new segments, (c) increased heterogeneous patterns, (d) increased sequences between nonadjacent rows, (e) strictly vertical sequences, and (f) bilateral sequence pairs. The segments in row D appeared only as postfixes in the downward sequences (D and D on P and P; D, D, and D on P; and D on P). Similarly, the new segments found in row C were postfixes (C and C on P; C on P; C on P; and C on P) with a single exception (C on P). The new segments in row B were mostly postfixes: B, B, and B on P, B on P, and B and B on P. B and B on P were prefixes. An interesting case was B on P which was special, being a prefix to itself (BB). Dual roles were more notable than unary ones among the new segments in row A (A and A on P, A on P, and A on P).
A total of seven new upward sequences were found, three on P and two on both P and P, but still none on P. These were prefixed by B (on P and P), B (P), or C (P) and postfixed by the segments in row A: A, A, A, or A. Only CA involved nonadjacent rows. A strictly vertical sequence was present on each of P, P, and P: BA, BA, and BA. The rest were rightward (BA and BA on P) or leftward on P and P (BA on P; CA on P).
Table 4: Double-chunk patterns by direction.
Page | Direction | Patterns
P1 | upward | BA, BA^R, BA^R
P1 | homogenous (==) | B2B2, BB
P1 | horizontal | AA^R, BB^R, AA^R, AA^R, BB^L, BB^R, BB^L, BB^R
P1 | downward | AD^R, BC^R, BC^R, BD, BD^R, BD
P2 | upward | BA^L, BA
P2 | homogenous (==) | B3B3, AA, BB, BB
P2 | horizontal | BB^L, BB^R, BB^L
P2 | downward | BC^L, BC, BD^L, BD
P3 | upward | BA^R, BA, CA^L
P3 | homogenous (==) | AA, BB, BB, CC
P3 | horizontal | A1A2^R, AA^R, AA^R, BB^R, AA^L, BB^R, BB^L, BB^R, CC^L
P3 | downward | AB^R, A2B3^R, AB^R, AC^R, AB^R, AB^R, AB^R, AC^R, AD^R, AB, AB^R, AD, AD^R, AD^R, BC^R, BD^L, BD, CD^L
P4 | upward | (none)
P4 | homogenous (==) | CC
P4 | horizontal | C3C4^R, AA^R, BB^R, BB^R, CC^L
P4 | downward | AB, AC^R, AB^R, BC^R, BC, BC^R, CD
Note. The sequence directions are upward, homogenous (==), horizontal, and downward. Underscored patterns were extracted at ms. Those extracted at  are also emphasized in italicized bold face. Leftward and rightward sequences are marked by superscripts L and R (written here as ^L and ^R), respectively.
A total of five new homogenous sequences were found on P and P, one in row A (AA on P), three in row B (BB on P and BB on P and P), and one in row C (CC on P). Like those at ms and above, none of the constituents were in columns  or .
A total of 17 new horizontal sequences were found on P (two in row A and four in row B), P (two in B), P (one in A, three in B, and one in C), and P (one in A, two in B, and one in C). A and A appeared as a prefix or as a postfix, while A appeared only as a postfix. The same held for B, B, and B, while B and B appeared only as postfixes. C assumed dual positions in CC on P and CC on P, both of which were leftward. The ratio of leftward to rightward sequences was : , : , : , and : in the order of P1, P2, P3, and P4.
A total of  new downward sequences were found, six on P, three on P,  on P, and six on P. The prefixes concentrated in rows A and B with two exceptions (CD on P and CD on P). In contrast, the postfixes concentrated in rows C and D with exceptions of five patterns on P and one on P. Half or more of the downward patterns on P, P, and P involved nonadjacent rows (A-D/ and B-D/ on P; B-D/ on P; and A-C/, A-D/, and B-D/ on P, where  denotes the number of cases), whereas only AC out of six patterns did so on P. The strictly vertical patterns were limited to columns  and  (B-D/ on P; B-C/ and B-D/ on P; A-B/, A-D/, and B-D/ on P; and B-C/ and C-D/ on P). The rest were rightward on P and P, leftward on P, or mixed on P.
Table 5: Isolate primitives by ms level.
Page | ms12 | ms14 | ms16
P1 | A, C, C, E | A2, A, B, B, C, C, D | A, A2, A, B
P2 | A, B, C, D | A1, A3, B, C, D | A1, A3
P3 | (none) | B, C, C, D | A, B, C
P4 | A, B5, C1, C5, D | B, B, B3, B5, C1, C, C5, D, D | A, B3, B, C5
Note. Primitives in bold face were persistent at two or three ms levels.
Among all of the patterns in Table 4, the heterogeneous sequences were mostly unilateral in that the symmetric pairs were limited in number (BB-BB on P; BB-BB on P; AA-AA, AB-BA, AC-CA, and BB-BB on P; and none on P). Four of these were horizontal sequences. The constituents were limited to a subset consisting of the first three rows and columns, that is, {A2, B1, B2, B3, and C3}.
The individual constituents of the multichunk patterns were frequent by themselves as primitive patterns at a given ms level, but not vice versa. Table 5 lists the isolate primitive patterns not participating in any multichunk pattern at a given ms level. While the number of total primitive patterns monotonically decreased from ms12 to ms16, the ratio of the isolate primitive patterns to the total primitive patterns monotonically increased on all pages almost perfectly. The ratios at ms12, ms14, and ms16 were 4/17, 7/11, and 4/5; 4/13, 5/8, and 2/4; 0/13, 4/11, and 3/6; and 5/17, 9/14, and 4/6, in the order of P1, P2, P3, and P4. The sole exception was the second and the third ratios on P2. There were no isolates on P3 at ms12.
Generally, an isolate primitive at a given ms level would become a member of sequence(s) at a lower level and would not be present at a higher level. Exceptionally, C, located in the rightmost column, persisted on P as an isolate at all ms levels. Partial persistence was observed between ms and ms on P (A), P (A, A), and P (B) as well as between ms and ms on P (B, C). No persistence was observed on P. The persistent ones on P and P were limited to the first three columns of the top row, {A1, A2, A3}, whereas those on P spread over rows B and C in columns 1, 3, and 5, that is, {B3, B5, C1, C5}.
Finally, E on P at ms was the sole frequent segment in the bottom row E where segments were generally infrequent across pages at all ms levels.
4. Discussion
Eye-tracking researchers have inferred a fixation from gaze points closely clustered in space and time, treating it as a meaningful unit of information processing, that is, a chunk, a familiar concept in psychology. Chunking of lower-level chunks into a higher one is not uncommon as seen in the relationships letter, word, phrase, sentence, paragraph, .... The present paper examined the patterns of second-order chunks, that is, chunks of fixations, using isolate gaze point(s) not participating in any fixation as the delimiter. The delimiter was assumed to play an auxiliary role in chunking, like a pause in speech.
Most of the identified chunks were short, consisting of one or two fixations. Also, the transitions within multifixation chunks and between chunks were mostly short in distance, either loops or one-block shifts to adjacent segments. These seem to be attributable to the minimum criterion of the delimiter we employed: at least one isolate gaze point. Hence, even an accidental dislocation of one's gaze resulted in chunking. It would be ideal if we could separate cognitively meaningful chunking from accidental chunking. Until an effective method is established, the best we can do is to be cautious in interpreting the results.
Actually, setting an appropriate criterion is a difficult task due to the possible individual and situational variations. Perhaps individuated criteria will be appropriate instead of a uniform criterion. Further investigation of the distributions of gaze points participating in fixations and those that are isolated is necessary.
As reported earlier, within- and between-chunk transitions were similar in that the first two modal distances were zero (i.e., loops) and one block. However, these differed in order and in magnitude. Loops were primary among within-chunk transitions but secondary among between-chunk transitions. The opposite was true for the one-block shifts. Next, the proportions of the primary and secondary distances of the within-chunk transitions exceeded the respective proportions pertaining to the between-chunk transitions. Similarly, there were more long-distance shifts between chunks than within them.
These results seem to suggest that the attention of our subjects was most likely shifted, after a pause, to an adjacent segment one block away or within the same segment. The medium or long-distance shifts were also separated by pauses, though their proportions were smaller than the short ones. Shifts without a pause, that is, within-chunk shifts, were short, chiefly occurring in the same segment or between adjacent segments one block away.
Now we turn to a discussion of the frequent patterns (i.e., subsequences) extracted by PrefixSpan. The patterns were simple in structure, mostly consisting of single or double chunks. Furthermore, the chunks themselves contained single fixations or single loops as expected from the chunk properties discussed above. More complex structures might have resulted if we had employed less stringent criteria for the delimiter. Even so, beneath the structural simplicity, interesting properties emerged as to the segment differentiation and the directional unevenness in attentional shifts.
First, the within-chunk loops were limited to (A..), (B..), and (D..), all of which were in the leftmost column. While the presence of (D..) was quite limited, the leading roles of (A..) and (B..) as prefixes in the multichunk sequences are noteworthy. These roles might be attributable to menu items placed in the segments. Second, the multichunk sequences chiefly consisted of the segments in rows A, B, and C. In particular, the leading role of A on P and P was noteworthy, like the loop (A..), though its dual role as pre- and postfix was observed on P. In contrast, A, B, and C were consistently positioned as postfixes. The same held for the segments in row D, which appeared only at the lowest ms level. The segments in row E were totally absent in multichunk sequences.
Third, the sequences at ms and ms were more likely to be horizontal, including homogenous codes, than downward and, to a much lesser extent, than the upward sequence, which remained least likely among the additional patterns found at ms. The order between horizontal and downward sequences varied across pages at ms.
By chunking eye-tracking records into smaller units, we discovered interesting properties of the eye movement of web page viewers. However, further studies seem necessary to enhance the present approach, for example, by setting up nested AOIs to reflect the hierarchical structure of the web objects [] and by adjusting the chunk delimiters to accommodate individual and task variations. Besides these refinements, we are planning an application of mined frequent patterns to simultaneous clustering [] of subjects and the properties of their eye movement and other relevant indices.
Appendix
We briefly explain frequent sequential pattern mining by PrefixSpan (prefix-projected sequential pattern mining) developed by Pei et al. [, ]. Interested readers should consult the original articles for formal descriptions and evaluations in comparison with other competing algorithms.
Let us use Table 1 as the DB (database) to be scanned. It consists of four sequences whose elements are nonempty subsets of items {a, b, c, d, e, f, g}. An element is composed of a set of items: a, b, c, d, e, f, and g. PrefixSpan assumes that items in an element are alphabetically ordered with no duplication, for example, [], [], and [].
The goal of PrefixSpan is to find subsequences frequently
shared among the records in DB. A subsequence is
defined as the list of nonempty subsets of the elements of
a given sequence, where the sequential order of elements is
preserved. For example, [][][][] is a subsequence of
[][][][][]. The threshold of frequent occurrence
is called the minimum support (abbreviated as ms in this
paper). Its value is to be specified by the user.
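The definitions of a subsequence and of support can be sketched in Python as follows. This is only an illustration under stated assumptions: the four-record database `db`, the helper `seq`, and the function names are hypothetical and are not the contents of the Table referred to above. Each element is modeled as a set of single-letter codes.

```python
from typing import FrozenSet, List, Sequence

Element = FrozenSet[str]
Seq = List[Element]


def seq(*elements: str) -> Seq:
    """Build a sequence from strings, one string per element ('ab' -> {a, b})."""
    return [frozenset(e) for e in elements]


def is_subsequence(pattern: Seq, record: Seq) -> bool:
    """True if every pattern element is a subset of a distinct record element,
    with the sequential order preserved (greedy left-to-right matching)."""
    pos = 0
    for p in pattern:
        # Advance to the first remaining record element that contains p.
        while pos < len(record) and not p <= record[pos]:
            pos += 1
        if pos == len(record):
            return False
        pos += 1  # the next pattern element must match a later record element
    return True


def support(pattern: Seq, db: Sequence[Seq]) -> int:
    """Number of records in db that contain the pattern as a subsequence."""
    return sum(is_subsequence(pattern, record) for record in db)


# Hypothetical toy database (an assumption made for illustration only).
db = [
    seq("a", "abc", "ac", "d"),
    seq("ad", "c", "bc"),
    seq("ef", "ab", "c", "b"),
    seq("e", "a", "c", "b"),
]

print(support(seq("a", "c"), db))   # 4: frequent at any ms up to 4
print(support(seq("ab", "c"), db))  # 2: frequent only if ms <= 2
```

A pattern is reported as frequent when its support is at least ms, so the same pattern can pass at a low threshold and fail at a higher one.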
Subsequences of special importance are a prefix and the
associated suffix. For instance, a frequent item , with ms = 3,
can serve as a prefix of the ensuing pattern (i.e., the suffix) to
be scanned next. The patterns listed in the second column
of Table are the suffix sequences constituting the -projected
database. Similar databases are to be constructed
for every frequent item. With ms, and together will
be considered frequent, where the underline implies .
Hence, will serve as a prefix, yielding only the two suffixes
[][][][] and [][][].
[Figure: network diagram of the frequent patterns; node labels and edges are not reproduced here.]
Figure : Network of the frequent patterns extracted at ms (small
letters in dark blue), ms (large letters in dark red), and ms
(underscored). Note: [] is omitted for the single-code chunks and is
replaced by () for the multiple-code chunks for the sake of simplicity.
See the first column of Table for the initial sequences.
The network of the frequent patterns extracted at ms =
2, 3, and is illustrated in Figure to help grasp the inclusive
relations among them in two senses: (a) an element of a
frequent pattern is also frequent; and (b) a frequent pattern
at a given ms level is also frequent at a lower level.
More formally, a sequence $\beta$ of length $m$ is a prefix of
another sequence $\alpha$ of length $n$ ($m \le n$) consisting of frequent
elements in the database if and only if the first $m-1$ elements
are identical and the last element of $\beta$ is a subset of the $m$th
element of $\alpha$.
The suffix of $\alpha$ with regard to $\beta$ is a sequence, the first
element of which is the difference between the $m$th elements
of $\alpha$ and $\beta$. The remaining elements of the suffix are identical
with the $(m+1)$th to the last element of $\alpha$; that is,
\[
\mathrm{element}_{1 \mid \mathrm{suffix}} = \mathrm{element}_{m \mid \alpha} - \mathrm{element}_{m \mid \beta}; \tag{A.1}
\]
if $m < n$,
\[
\mathrm{element}_{j \mid \mathrm{suffix}} = \mathrm{element}_{m+j-1 \mid \alpha}, \quad j = 2, \ldots, n-m+1. \tag{A.2}
\]
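The prefix test and the two suffix equations can be read as the following minimal sketch, writing the prefix as `beta` (of length m) and the full sequence as `alpha`. The function name `suffix` and the example sequences are hypothetical. One assumption is made explicit: an empty difference is kept as an empty first element, which follows the first equation literally; a practical implementation might drop it instead.

```python
from typing import FrozenSet, List, Optional

Element = FrozenSet[str]
Seq = List[Element]


def suffix(alpha: Seq, beta: Seq) -> Optional[Seq]:
    """Suffix of alpha with regard to beta, or None if beta is not a prefix."""
    m = len(beta)
    if m == 0 or m > len(alpha):
        return None
    # Prefix condition: the first m-1 elements are identical and the last
    # element of beta is a subset of the m-th element of alpha.
    if alpha[:m - 1] != beta[:m - 1] or not beta[-1] <= alpha[m - 1]:
        return None
    first = alpha[m - 1] - beta[-1]  # difference of the m-th elements
    rest = alpha[m:]                 # (m+1)-th to the last element of alpha
    return [first] + rest


# Hypothetical example sequences (assumptions for illustration).
alpha = [frozenset("a"), frozenset("abc"), frozenset("ac"), frozenset("d")]
beta = [frozenset("a"), frozenset("ab")]

print(suffix(alpha, beta))  # first element is {c}, followed by {a, c} and {d}
```

Collecting such suffixes over all records that contain the prefix gives the projected database that is scanned next.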
Scanning with respect to the prefix stops when the suffix
becomes nil ( = ) or no frequent item exists in the
projected database. This process is executed in a depth-first
manner for every code initially identified as frequent.
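The depth-first procedure can be illustrated with a deliberately simplified sketch. It is not the implementation of Pei et al.: it assumes that every element holds a single code, so multiple-code chunks are not handled, and the function name `prefixspan` and the toy records in `db` are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[str, ...]


def prefixspan(db: Sequence[Sequence[str]], ms: int) -> Dict[Pattern, int]:
    """Return every pattern whose support is at least ms, with its support.

    Simplification (assumption): each element of a record is one code.
    """
    found: Dict[Pattern, int] = {}

    def grow(prefix: Pattern, projected: List[Sequence[str]]) -> None:
        # Count each code once per suffix: support counts records, not hits.
        counts = Counter(code for s in projected for code in set(s))
        for code, count in sorted(counts.items()):
            if count < ms:
                continue  # infrequent codes cannot extend the prefix
            pattern = prefix + (code,)
            found[pattern] = count
            # Projected database: what follows the first occurrence of code.
            next_db = [s[s.index(code) + 1:] for s in projected if code in s]
            grow(pattern, next_db)  # depth-first growth of the prefix

    grow((), list(db))
    return found


# Hypothetical toy records (an assumption made for illustration only).
db = [["a", "c", "b"], ["a", "b"], ["c", "a", "b"], ["b", "c"]]

print(prefixspan(db, ms=3))
print(prefixspan(db, ms=2))
```

Recursion ends by itself when a projected database yields no code with support of at least ms. In this toy run, every pattern found at ms = 3 is also found at ms = 2, in line with the inclusive relation across ms levels noted above.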
It must be noted that some of the extracted patterns may
be hard to identify in the original sequences, due to the
intermittent removal of infrequent items from the projected
database during the process, for example, the extracted pattern
[][][] in Table and the sequence [][][][][]
in Table . This point should be clear to those who are familiar
with masking (or wildcard) characters, such as an asterisk
in string matching. One can find original patterns by
attaching a masking character to the extracted patterns.
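One way to picture the masking idea is with a regular expression, where a lazy wildcard between consecutive codes stands for the items that were skipped. The sketch below assumes single-character codes so that a record can be written as a plain string; the function names `pattern_to_regex` and `locate` are hypothetical, and multi-character codes would need a separator between them.

```python
import re
from typing import Optional, Sequence


def pattern_to_regex(pattern: Sequence[str]):
    """Join the codes with a lazy wildcard so skipped codes may lie between."""
    return re.compile(".*?".join(re.escape(code) for code in pattern))


def locate(pattern: Sequence[str], record: Sequence[str]) -> Optional[str]:
    """Return the stretch of the record covered by the pattern, or None."""
    match = pattern_to_regex(pattern).search("".join(record))
    return match.group(0) if match else None


# The extracted pattern (a, b) is found in the record c-a-c-b as "acb",
# where the middle "c" is the item masked by the wildcard.
print(locate(("a", "b"), ["c", "a", "c", "b"]))  # acb
print(locate(("b", "a"), ["a", "b"]))            # None
```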
Conflict of Interests
The authors declare that there is no conflict of interests
regarding the publication of this paper.
References
[] S. Martinez-Conde, S. L. Macknik, and D. H. Hubel, “The role of fixational eye movements in visual perception,” Nature Reviews Neuroscience, vol. , no. , pp. , .
[] D. D. Salvucci and J. H. Goldberg, “Identifying fixations and saccades in eye-tracking protocols,” in Proceedings of the Eye Tracking Research and Applications Symposium, pp. , November .
[] G. A. Miller, “The magical number seven, plus or minus two: some limits on our capacity for processing information,” Psychological Review, vol. , no. , pp. , .
[] D. Melcher, “Dynamic, object-based remapping of visual features in trans-saccadic perception,” Journal of Vision, vol. , no. , article , .
[] D. Melcher, “Selective attention and the active remapping of object features in trans-saccadic perception,” Vision Research, vol. , no. , pp. , .
[] J. Ross, M. C. Morrone, M. E. Goldberg, and D. C. Burr, “Changes in visual perception at the time of saccades,” Trends in Neurosciences, vol. , no. , pp. –, .
[] E. Cutrell and Z. Guan, “What are you looking for?: an eye-tracking study of information usage in Web search,” in Proceedings of the 25th SIGCHI Conference on Human Factors in Computing Systems (CHI ’07), pp. , May .
[] N. Matsuda and H. Takeuchi, “Networks emerging from shifts of interest in eye-tracking records,” eMinds, vol. , no. , pp. , .
[] N. Matsuda and H. Takeuchi, “Joint analysis of static and dynamic importance in the eye-tracking records of web page readers,” Journal of Eye Movement Research, vol. , no. , article , pages, .
[] N. Matsuda and H. Takeuchi, “Do heavy and light users differ in the Web-page viewing patterns? Analysis of their eye-tracking records by heat maps and networks of transitions,” International Journal of Computer Information Systems and Industrial Management Applications, vol. , pp. , .
[] J. H. Goldberg and X. P. Kotval, “Computer interface evaluation using eye movements: methods and constructs,” International Journal of Industrial Ergonomics, vol. , no. , pp. , .
[] J. D. McCarthy, M. A. Sasse, and J. Riegelsberger, “The geometry
of web search, in People and Computers XVIII—Design for Life,
pp.,Springer,London,UK,.
[] J. Pei, J. Han, B. Mortazavi-Asl et al., “PrexSpan: min-
ing sequential patterns eciently by prex-projected pattern
growth, in Proceedings of the 17th International Conference on
Data Engineering,pp.,April.
[] J. Pei, J. Han, B. Mortazavi-Asl et al., “Mining sequential patterns
by pattern-growth: e PrexSpan approach, IEEE Transac-
tionsonKnowledgeandDataEngineering,vol.,no.,pp.
–, .
[] R. Agrawal and R. Srikant, “Fast algorithms for mining asso-
ciation rules, in Proceedings of the International Conference on
Very Large Data Bases (VLDB '94),pp.,.
[] D.I.Brooks,I.P.Rasmussen,andA.Hollingworth,“enesting
of search contexts within natural scenes: evidence from con-
textual cuing,Journal of Experimental Psychology: Human
Perception and Performance, vol. , no. , pp. –, .
[] A. Preli
´
c, S. Bleuler, P. Zimmermann et al., A systematic com-
parison and evaluation of biclustering methods for gene expres-
sion data, Bioinformatics,vol.,no.,pp.,.