Authors
Abstract
The number of users of handheld computers has increased rapidly
in recent years. The most common use of a handheld is entering
new addresses, to-do items, or other short notes. Input entry
on a handheld device has therefore become an important issue. The two major
input entry methods are handwriting with the Graffiti alphabet, which is a
stroke-based handwriting recognition system, and tapping on a soft
keyboard. In this project, we aimed at comparing Graffiti and keyboard
tapping while doing a common task, which involves entering alphanumeric
characters and special symbols and switching between keyboards during keyboard
tapping. The purpose of this experiment was to see which input entry method
is faster and more accurate than the other and to observe the pattern of
learning for both methods. Experiments conducted with 15 subjects produced
statistically significant results. The analysis of the data showed that
using keyboard tapping yielded faster and more accurate results in both
the initial and later use. Despite its poor performance in the initial
use and high number of errors, using Graffiti became much faster as the
usage time increased. The learning curve suggests that experienced users
may perform faster with Graffiti than keyboard tapping.
Introduction
With advances in technology over recent years, users have become accustomed
to having all of their data readily available at all times.
To facilitate this, handheld computers were introduced in the
early 1990s, and they have become very popular in recent years. They are very
convenient for applications that only require pointing and selecting,
which can be performed with a stylus (a computer pen). However,
the most common use of a handheld is entering new addresses, to-do items,
or other short notes. In applications requiring data input, such as
entering address information, the small size of the handheld makes data
input a problem. Although full-size keyboards can be
connected to handhelds, doing so is impractical, and such keyboards are not
available to everybody.
To address this issue, the manufacturers of handheld computers have
created two main data entry methods:
-
Soft keyboard, which is displayed on a small portion of the screen of the
device and works by tapping the stylus on the characters
-
Graffiti, which is a handwriting recognition method and accepts handwritten
characters that are written on a designated location on the handheld device
and converts them into ASCII characters.
The figures below (taken from the Palm Handbook) present screenshots from
a Palm Pilot IIIxe, showing the keyboard and Graffiti features.

Both input methods have their own unique advantages. Unfortunately both
have their drawbacks as well. The main advantage of soft keyboard tapping
is its similarity to an external keyboard: Users with a strong familiarity
with keyboards usually feel comfortable when they use keyboard tapping.
On the other hand, it has three major disadvantages:
-
Users can easily tap an incorrect character because of their close proximity
to each other.
-
The keyboard covers nearly 40% of the screen space, which is already small,
and requires extra scrolling.
-
It lacks kinesthetic feedback and provides no tactile reference
point [13]. Hence, visual contact with
the on-screen keypad must be maintained during entry.
To overcome the difficulties posed by soft keyboard tapping, handwriting
recognition methods have become popular. Blickenstorfer [2]
analyzes 18 handwriting recognition systems and Gibbs [3]
compares 13 handwriting recognition methods extensively. One of the most
popular handwriting recognition systems is Graffiti [1].
It is also similar to a keyboard in the sense that input is entered character
by character and requires special modes for uppercase letters and special
characters. As stated in [6], the major advantage
of Graffiti is that it mimics the Roman alphabet as closely as possible
while preserving the single-stroke handwriting philosophy. This allows
users to write characters quickly once they are familiar with the alphabet.
Figure 1 presents the alphabetic characters in Graffiti handwriting, where
the black dot represents the beginning point. Another advantage is that
there is no need to look at the screen while writing, which also helps
fast writing. However, it is still an alphabet that must be learned, although it
resembles ordinary handwriting. Other problems are the accuracy of
recognition and how long users retain what they have learned, especially for special characters.
Figure 1: The Graffiti alphabet [MacKenzie97]
Overview of Previous Experiments
There has been a great deal of research conducted in the area of comparison
of different input entry methods on handheld devices.
In [4], three methods of character entry
on pen based computers, namely hand printing, ABC keyboard tapping and
QWERTY keyboard tapping, were compared for speed, accuracy, and user satisfaction.
The ABC tapping had the lowest error rate of the group, at 0.6%, and also
the slowest entry rate at 12.9 wpm. The QWERTY tapping had the fastest
input rate, at 22.9 wpm, and a low error rate, at 1.1%. The hand printing
method had an input rate of 16.3 wpm, and an error rate. The user satisfaction
surveys showed that QWERTY tapping is the most preferred one while the
least preferred was the ABC tapping method.
In [8], six different soft keyboard entry
methods were tested to find the speed of those input entry methods. The
average number of words per minute was 20.2 for QWERTY, 10.7 for ABC, 8.5
for DVORAK, 8.0 for Fitaly, 8.0 for telephone and 7.0 for JustType.
In [6], the learning speed of Graffiti
was measured. The subjects were tested after one minute of studying the
Graffiti reference chart, five minutes of practicing with Graffiti, and
after one week without practicing the Graffiti input method. The accuracy
rates were 86%, 97%, and 97%, respectively, which showed that the learning
speed is quite fast for Graffiti.
In [9], two methods of numeric entry on
pen based computers were tested, namely handwriting and pie pad. The study
attempted to measure the learning speed and accuracy of the two methods. Error
rates did not significantly change, but the entry speed did: Speed by handwriting
increased by 11%, while the speed by the pie pad increased by 52%. Initially,
handwriting was the faster entry method; however at the end of experiments,
the pie pad method was 24% faster than handwriting. The subjective surveys
showed that the pie pad method was preferred to the handwriting.
In [7], two different handwriting recognizers,
Microsoft character recognizer and CIC's Handwriter 3.3, were tested. The
two methods were tested based upon an input text of only lower case letters
and also an input text of both upper and lower case letters. The results
of the study showed that certain characters were misinterpreted significantly
more often than others, and also that the observed accuracy was lower than
that claimed by the developers of the products.
In [5], several methods for entering alphanumeric
data to pen based computers were examined. For numeric entry, the methods
were hand printing, tapping on a soft keypad, stroking a moving pie menu, and
stroking a pie pad; for text entry, they were hand printing, tapping on
a soft QWERTY keyboard, and tapping on a soft ABC keyboard.
For numeric data, the soft keypad yielded the fastest and most accurate results
(30 wpm, 1.2% errors) while the moving pie menu gave the slowest and most
error-prone results (12.4 wpm, 16.4% errors). For the
text input, soft QWERTY keyboard tapping was the quickest, at 23 wpm and
most accurate with 1.1% errors. Hand printing was slower at 16 wpm and
had a higher error rate, at 8.1% errors. Finally, tapping on the soft ABC
keyboard was the most accurate, at 0.6% errors, but also had the slowest
input rate, at 13 wpm.
In [11], a theoretical model was presented
to predict the upper and lower bounds for text entry rates using a soft
QWERTY keyboard on a pen based computer. The model was based on the Hick-Hyman
law for choice reaction time, Fitts' law for rapid aimed movements, and
linguistic tables for the relative frequencies of letter pairs, or digrams,
in common English. The model predicted that a typing rate of 8.9 wpm can
be achieved for novice users and 30.1 wpm for expert users of the soft
QWERTY keyboard.
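The two laws that the model in [11] builds on can be sketched in a few lines. This is only an illustration of the formulas: the coefficients `a` and `b`, the key geometry, and the function names are assumptions chosen for readability, not values taken from [11], and the digram-frequency weighting of the real model is left out.

```python
import math

def fitts_movement_time(distance, width, a=0.0, b=0.2):
    """Fitts' law (Shannon form): MT = a + b * log2(distance / width + 1), in seconds."""
    return a + b * math.log2(distance / width + 1)

def hick_hyman_reaction_time(n_choices, a=0.0, b=0.2):
    """Hick-Hyman law: RT = a + b * log2(n_choices), in seconds."""
    return a + b * math.log2(n_choices)

def words_per_minute(seconds_per_char, chars_per_word=5):
    """Convert a per-character time into words per minute (5 characters per word)."""
    return 60.0 / (seconds_per_char * chars_per_word)

# Assumed layout: keys one unit wide, with an average hop of three key widths.
movement = fitts_movement_time(distance=3.0, width=1.0)
# Assumed novice behavior: a visual search among 27 keys before every tap.
search = hick_hyman_reaction_time(n_choices=27)

expert_wpm = words_per_minute(movement)           # expert: movement time only
novice_wpm = words_per_minute(movement + search)  # novice: search plus movement
```

Under these assumptions the expert bound comes from movement time alone, while the novice bound adds the search time for each character, which is why the novice rate is always the lower of the two.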
Experiment
Overview and Variables
There have been many studies comparing Graffiti handwriting and keyboard
tapping in terms of speed and accuracy. However, most of these studies
focused on speed and accuracy for only alphabetic
characters or only numeric values. Moreover, they assumed that
users enter all of this information on one screen. In most
applications, however, the input is a combination of letters and digits along with punctuation
symbols, and users also switch between screens or scroll to locate
other information. In this project, we aim at comparing Graffiti and keyboard
tapping while doing a common task, which involves entering alphanumeric
characters and special symbols and switching between keyboards during keyboard
tapping. We will concentrate on entering the address information of a person
(name, phone number and home address each in different fields).
The purpose of this experiment is to see which input entry method is
faster and more accurate than the other and to observe the pattern of learning
for both methods. We are mostly interested in how the time spent entering
text enhances or reduces the speed and accuracy of writing.
There are two independent variables in our experiment. The first
one is the input entry type which has two treatments: Soft keyboard tapping
(QWERTY keyboard) and Graffiti handwriting. The second independent variable
is the number of trial blocks. It will have 4 treatments showing the degree
of learning. In each trial block, the subjects are asked to write a specific
number of addresses.
There will be three dependent variables: Time for correct completion
of the task, the percentage of errors encountered and a subjective satisfaction
survey.
Our hypothesis is that for novice users, keyboard tapping is
faster and more accurate than the Graffiti handwriting. However, as the
experience of the users increases, using Graffiti will lead to a faster
entry of text while better accuracy is still achieved by keyboard tapping.
Pilot Test Results
After we conducted pilot tests on 4 subjects, we decided to make the following
changes in our experiment.
-
Reduce the number of addresses in each trial block from 5 to 3:
The first two experiments showed clearly that entering
5 addresses in each trial block took too long for the subjects. Writing
with Graffiti took about 1.5 to 4.5 minutes and soft keyboard
tapping took 1.5 to 3 minutes.
-
Reduce the size of each address: We decided to remove some of the
fields in the address to shorten the experiment. The original
addresses included e-mail addresses and two phone numbers; we decided
to drop the e-mail address and keep only one phone number.
-
Change the addresses to have a uniform distribution of characters in
each trial block: We decided to keep the number of characters, number
of digits and number of punctuation symbols the same in each trial block for
a better and more accurate comparison.
-
Give one address at a time to avoid confusion instead of all addresses
in the same trial block at once: Initially, we planned to give all addresses
in each trial block on the same sheet of paper. The pilot tests clearly indicated
that this may confuse some of the subjects. Instead of concentrating on
entering data on the Palm Pilot, they spent some time locating what
to write on the paper, which in turn distorts the comparison of
data entry methods.
-
Change the format of the addresses: The pilot tests showed that
the format of the addresses we proposed previously was inappropriate for
this experiment. The order of the entries for each address (first name
- last name, address, city - state - zip code, e-mail and phone number
in each line) distracted the users more than we expected. For example,
they often entered the first name in the field for the last name. To reduce
the effect on total data entry time of locating which information to put in each
field, we decided to give the addresses in
the exact format in which they are entered on a Palm Pilot (i.e., in the
same order as on the Palm Pilot and with the field name specified for each
item).
-
Count the errors and measure the time for correct completion:
The number of errors should be counted so that fast entry with
many errors is not mistaken for good performance. For a better analysis, we decided to categorize the
errors into capitalization errors, alphabetic characters instead of digits
or vice versa, replacement, insertion, deletion, and transposition errors.
We will also try to count the number of errors for each character (what
to write vs. what is written). The time to enter each address will be measured
after all errors in the entry are corrected, i.e. correct completion is a
requirement. We also decided to alert the subjects whenever they make
errors but leave the decision of when to correct the errors to them.
-
Reduce the effects of tiredness and boredom: After the second
block was completed, we observed that the subjects may get tired and bored of the task.
So, after completing each trial block, each subject will be allowed to
rest for some time (probably 1 minute).
Subjects
We conducted our experiments on 15 subjects. All but one of them are students
at the University of Maryland. The main criterion in selecting subjects
was that they had not used a handheld before and did not know Graffiti at
all.
The distribution of subjects with respect to demographic properties
is as follows: Of the fifteen subjects, 9 are male and 6 are female.
11 are between ages 20 and 30, 3 are below 20, and 1
is between ages 40 and 50. 10 are students or professionals in
computer science while 5 are not. 9 use glasses or
contact lenses while 6 do not. The average rating for level of
keyboard usage is 6.2 out of 9 and the average rating for Graffiti knowledge
is 1.2 out of 9 (1 represents no knowledge and 9 represents a strong familiarity).
Conducting Experiments
We asked the subjects to enter 4 sets of addresses into the address book
of a Palm Pilot. Each set consisted of 3 addresses. They entered all addresses
first using Graffiti and then using keyboard tapping, or vice versa, to avoid
a bias toward either method. One example address is as follows:
| Last Name | Maxfield |
| First Name | Paul |
| Home | (240) 698-3571 |
| Address | 594 Lovers Ln. Apt 16 |
| City | Bethesda |
| State | MD |
| Zip Code | 20378 |
While preparing the address set for each trial block, we ensured a
uniform distribution of characters across trial blocks. All 4 trial blocks
contained the same total number of characters (190), the same number of capital
letters (25), the same number of digits (60) and the same number of punctuation
symbols and spaces (28). The full set of addresses and the distribution
of characters in each trial block can be found in the Appendices.
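The per-block character counts described above can be checked with a short script. The sketch below assumes that only the characters inside each field are counted (not the act of moving between fields) and that spaces are grouped with punctuation, as in the totals reported here; the function name `character_profile` is hypothetical.

```python
import string

def character_profile(fields):
    """Count character classes across the fields of one address."""
    text = "".join(fields)
    capitals = sum(ch.isupper() for ch in text)
    digits = sum(ch.isdigit() for ch in text)
    # Spaces are grouped with punctuation, as in the per-block totals.
    punctuation_and_spaces = sum(
        ch in string.punctuation or ch == " " for ch in text
    )
    return {
        "total": len(text),
        "capitals": capitals,
        "digits": digits,
        "punctuation_and_spaces": punctuation_and_spaces,
    }

# The example address from the text, one string per field.
example_address = [
    "Maxfield", "Paul", "(240) 698-3571",
    "594 Lovers Ln. Apt 16", "Bethesda", "MD", "20378",
]
profile = character_profile(example_address)
```

For the example address this gives 62 characters in total, of which 8 are capitals, 20 are digits and 9 are punctuation symbols or spaces, roughly one third of each per-block total, as expected for a block of three addresses.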
The subjects were chosen on the basis that they had not used a Palm Pilot
before and had no knowledge of Graffiti or keyboard tapping. Therefore,
we trained the subjects on
-
how to enter an address in the address book
-
how to write using Graffiti
-
how to write using soft keyboard tapping
This training session was meant to be an introductory session. The subjects
were allowed to enter all the characters in the addresses once. Because the purpose
of the experiment is to observe the learning curve for both input entry
methods, we kept the training session as short as possible.
The subjects were asked to sign an informed consent form and fill out
a background survey for statistical purposes only, which can be found in
the Appendices. Then, they were asked to enter 12 addresses correctly and completely
as they appeared on the address sheet given to them. In other words, all
items had to be entered into the correct fields of the address book and
all spelling errors had to be corrected to finish an address. Once they finished
writing all addresses using one input entry method, they repeated the same
process for the other method. While using one input entry method, they
were not allowed to use the other one.
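The order of the two methods was varied between subjects, but the text does not say how each subject's order was chosen. One simple way to do it, shown here purely as an assumption, is to alternate the starting method from one subject to the next so that both orders are used about equally often.

```python
def assign_method_order(subject_ids):
    """Alternate the starting method so each order is used about equally often."""
    orders = {}
    for index, subject in enumerate(subject_ids):
        if index % 2 == 0:
            orders[subject] = ("Graffiti", "Keyboard")
        else:
            orders[subject] = ("Keyboard", "Graffiti")
    return orders

# 15 subjects, numbered 1 to 15 (the numbering is hypothetical).
orders = assign_method_order(range(1, 16))
graffiti_first = sum(order[0] == "Graffiti" for order in orders.values())
```

With an odd number of subjects the split cannot be exactly even, so one order is used once more than the other.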
In the first two trial blocks, they had considerable difficulty remembering
how to write each character using Graffiti. Thus, they were allowed to
use a quick reference guide to Graffiti whenever they needed it.
We measured the time for completing each address correctly and the number
and type of errors the subjects made. To finish the task, they were expected
to write the addresses exactly as on the address sheet given to them. We
classified the errors as replacement errors, capitalization errors,
character vs. digit errors, missing letters, insertions and transpositions.
The subjects were alerted to any mistakes they had not noticed while
writing so that they could correct them.
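The error categories above can also be assigned automatically by aligning the entered text with the target text. The following is a minimal sketch using an edit-distance alignment that allows swaps of adjacent characters; it is an illustration and not the counting procedure used in the experiment, and the function and category names are hypothetical.

```python
from collections import Counter

def classify_errors(target, typed):
    """Align typed text to the target and count each kind of error."""
    n, m = len(target), len(typed)
    # cost[i][j] = minimum number of edits turning target[:i] into typed[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i
    for j in range(m + 1):
        cost[0][j] = j

    def swapped(i, j):
        # True when the last two characters appear in exchanged order.
        return (i > 1 and j > 1 and target[i - 1] == typed[j - 2]
                and target[i - 2] == typed[j - 1])

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = target[i - 1] == typed[j - 1]
            best = min(
                cost[i - 1][j] + 1,                       # character missing
                cost[i][j - 1] + 1,                       # extra character
                cost[i - 1][j - 1] + (0 if same else 1),  # match or replacement
            )
            if swapped(i, j):
                best = min(best, cost[i - 2][j - 2] + 1)  # adjacent pair swapped
            cost[i][j] = best

    # Walk back from the end of both strings, naming each edit on the way.
    errors = Counter()
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and target[i - 1] == typed[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            i, j = i - 1, j - 1
        elif swapped(i, j) and cost[i][j] == cost[i - 2][j - 2] + 1:
            errors["transposition"] += 1
            i, j = i - 2, j - 2
        elif i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + 1:
            expected, actual = target[i - 1], typed[j - 1]
            if expected.lower() == actual.lower():
                errors["capitalization"] += 1
            elif expected.isdigit() != actual.isdigit():
                errors["character_vs_digit"] += 1
            else:
                errors["replacement"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            errors["missing"] += 1
            i -= 1
        else:
            errors["insertion"] += 1
            j -= 1
    return errors
```

For instance, comparing the target "Paul" with the entry "paul" is counted as one capitalization error, and "Bethesda" entered as "Behtesda" is counted as one transposition. When several alignments have the same cost, the sketch simply takes the first one it finds, so a human observer might label an ambiguous case differently.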
After writing all addresses, the subjects were asked to complete a subjective
satisfaction survey, which can also be found in the Appendices.
Since we observed the learning curve of two data entry methods, the
experiments took a long time compared with other experiments. Completing
all tasks took each subject about 1 to 1.5 hours. That is
the reason we could not conduct the experiments on more subjects. One other
major problem we encountered was the difficulty of counting the errors
and determining their types.
Results
The raw data, which can be found in the Appendices, lists the time for correct
completion and the number of errors for both input entry methods for all
subjects. We, therefore, have four sets of numbers: Graffiti speed, keyboard
speed, Graffiti errors and keyboard errors.
The mean and standard deviation for the speed and number of errors in
each trial block for both methods are given below:

The mean speed and mean number of errors present the learning curve of both
methods. This can be seen clearly on the following graphs, which show the
distribution of speed and number of errors in each trial block for both
data entry methods.

The Entry Speed graph displays the average entry time for a single address
within a given trial block. This average was obtained by taking the
subjects' mean entry times, within a trial block, and then averaging them.
The black bars at each point on the graph display the standard error.
This error is very small on the Keyboard graphs, due to low variance, making
these bars difficult to see.
Similarly, the Entry Error graph displays the average number of errors
occurring within a single address entry, within a given trial block.
Once again, the average was obtained by taking the subjects' mean number
of errors, within a trial block, and then averaging them. Black error
bars display the standard error.
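The averaging procedure described for both graphs (a mean per subject within a trial block, then a mean and standard error across subjects) can be sketched as follows. The entry times below are hypothetical values used only to make the example runnable; they are not the experimental data.

```python
import math
import statistics

def block_summary(values_by_subject):
    """Return (mean, standard error) of per-subject means for one trial block."""
    # First average each subject's addresses, then average across subjects.
    subject_means = [statistics.mean(values) for values in values_by_subject]
    mean = statistics.mean(subject_means)
    # Standard error = sample standard deviation / sqrt(number of subjects).
    std_error = statistics.stdev(subject_means) / math.sqrt(len(subject_means))
    return mean, std_error

# Hypothetical entry times in seconds: three addresses for each of four subjects.
graffiti_block_1 = [
    [200, 180, 160],
    [150, 140, 130],
    [220, 200, 180],
    [120, 110, 100],
]
mean, std_error = block_summary(graffiti_block_1)
```

The same function applies to error counts: pass the number of errors per address instead of the entry times.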
While the Graffiti entry times and error counts drop dramatically, the
keyboard numbers are fairly constant. Overall, the graphs display
a convergence over time, in both the number of errors made and the number of seconds
taken to enter an address.
The gap in entry times in the first trial block is approximately 135
seconds. This shrinks to a difference of only 31 seconds in trial
block 4. The standard deviation of the Graffiti entry times also decreases
at a higher rate than that of the keyboard times. These same patterns are also apparent in the number of
errors made.
To determine the statistical significance of this data, a two-way analysis
of variance was employed since we have two independent variables having
2 and 4 treatments, respectively. Two two-way ANOVAs were performed
using Microsoft Excel 2000, one on the entry rates and another on the error
rates.
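To make the computation behind such a test concrete, the sketch below derives the F statistics of a two-way ANOVA with one value per cell, where rows are trial blocks and columns are the two entry methods. The text does not state which Excel ANOVA tool or data layout was used, so this layout is an assumption, and the table values are hypothetical. The P-value would then be read from the F distribution, which is not computed here.

```python
def two_way_anova_f(table):
    """F statistics for a rows-by-columns table with one value per cell."""
    n_rows, n_cols = len(table), len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(row[c] for row in table) / n_rows for c in range(n_cols)]

    # Split the total variation into row, column and residual parts.
    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    df_rows, df_cols = n_rows - 1, n_cols - 1
    ms_error = ss_error / (df_rows * df_cols)
    return {
        "F_rows": (ss_rows / df_rows) / ms_error,
        "F_columns": (ss_cols / df_cols) / ms_error,
    }

# Hypothetical values: rows are trial blocks 1 to 4, columns are (Graffiti, Keyboard).
table = [[10, 4], [8, 4], [6, 3], [4, 1]]
f_stats = two_way_anova_f(table)
```

A large `F_columns` relative to the residual variation corresponds to a small P-value for Columns, that is, to a difference between the two entry methods that chance alone is unlikely to produce.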
The important statistic in the ANOVAs is the P-value for Columns.
This value indicates whether the Graffiti and Keyboard values differ
by more than chance variation would explain. A P-value of less than .05 indicates
that the probability of observing such a difference by chance alone is below 5%. As shown in the
following figures, both P-values are less than .05, which indicates statistically
significant results. The results of these ANOVA tests are as follows:

Discussions
Both P-values in the Columns row are far below 5%,
indicating that the difference between the two entry methods is statistically significant.
With the statistical significance established and the meaning
of the statistics explained, we move on to an interpretation of these
numbers.
The previously noted convergence of the graphs is indicative of the
learning curve involved in using Graffiti for data entry. It is our
theory that, though Graffiti takes longer to learn, it is eventually faster
than the keyboard for data entry. Though our data does not definitively
support this, it certainly shows a trend toward this model. The rapid
descent in entry times, and the decrease in the corresponding standard
deviation, supports our theory that the speed of Graffiti usage increases
rapidly over time, for the majority of users. The trend indicates
that, given more trial blocks, tests would show that the entry speed of
Graffiti surpasses that of the keyboard. This hypothesis is further
supported by the relatively stagnant entry times reported for keyboard
usage. The growing overlap in data ranges over time is also of note.
The error rates, as expected, are continuously higher for Graffiti.
There is a substantially higher probability of errors while using Graffiti,
due to natural variance in hand motion and the software's limited ability to interpret
those motions. The keyboard interface eliminates these variables
and provides a much more structured interface, thereby proving much less
error prone. Though the number of errors for Graffiti decreases, it does
not approach the Keyboard numbers as quickly as Graffiti entry times close
the speed gap. For these reasons, as supported by the data, we expect
that Graffiti will always involve more errors than Keyboard entry.
As previously stated, our results do not show Graffiti entry times crossing
below keyboard times on the graph. Our graphs do, however, show a clear
trend toward that crossing, which further testing might confirm.
We were unable to perform further testing due to the length
of time involved in performing these tests. Further experimentation
is suggested over a longer time period where testing can be performed on
a daily basis, instead of in a contiguous 1-2 hour period.
In addition to these statistical results, the user satisfaction surveys
highlighted a number of notable items. On a 1 to 10 scale, the difficulty
of using Graffiti was rated 4.4 on average, as opposed to a 1.5 for keyboard
usage. All subjects found the keyboard easier in the first trial
block, but 43% of subjects rated Graffiti as easier by the last trial block.
Subjects stated that they achieved comfort with Graffiti and Keyboard usage
at approximately the same trial block. 50% achieved comfort with
either interface by the 2nd trial block, with the majority of the rest
attaining comfort level in the 3rd trial block. Ease of error correction
was evenly split. In the end, entry method preference was evenly
divided between the two methods.
Subject satisfaction relies on the features and usage experience of
an interface. Subjects enjoyed the simplicity of the keyboard interface
and its low error rates. The "computer like" keyboard appeared to
be very intuitive to subjects, offering a small or no learning curve.
This accessibility is a great advantage to new users of the Palm Pilot.
Additionally, the keyboard interface limits user input to a strongly defined
set. This reduces the error rate and associated frustrations.
There were also several complaints about the keyboard interface.
Primarily, these complaints centered on the small size of the keyboard,
and the need to constantly switch back and forth between displays.
The keyboard method involves a lot of switching between entry areas and
different keyboards. Subjects found this switching to be distracting
and noted that the keyboard interface seemed slower because of this. Two
problem spots were commonly noted: the 'P' key and the scroll arrow keys
had to be pressed in a precise manner to register correctly.
The features that most attracted subjects to Graffiti were its continuity
and similarity to handwriting. Subjects found that being able to "write"
quickly using Graffiti caused them to perceive the time taken to enter
data as faster. Also, because all characters were constantly accessible,
on the same screen as the data entry area, subjects weren't constantly
context switching. Another commonly stated advantage of Graffiti is
that there is no need to keep the eyes fixed on the screen while writing, which
makes writing with Graffiti easier.
There were a number of complaints about the Graffiti interface. Most
complaints dealt with the high error rate caused by characters being entered
incorrectly. A major issue is the counter-intuitive shape of some
of the characters. Some, like the 'T', do not resemble their standard hand-written
counterparts. Variations in handwriting style can also make character entry
difficult. Many subjects noted difficulties with capitalization and
punctuation. All these are sources for the high error rates witnessed
in Graffiti and the level of irritation present in its learning curve.
Common error characters are: V, T, 4, L, E, Q, K, N, Y, 9, P, G,
X, the parentheses, and capitalized letters. We observed that during
the initial use of Graffiti, the characters that are ordinarily handwritten with more than one stroke
(such as B, D, P) frustrated the users because writing them that way caused two
letters to appear on the screen. This is because the Graffiti handwriting
recognition system treats each stroke as one character: whenever
users lift the stylus off the screen, the stroke just completed is recognized as a character.
A few subjects took a long time to discover this and made many mistakes
until they got used to it.
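The behavior described above can be illustrated with a small sketch of stroke segmentation, in which every pen-down to pen-up sequence becomes one stroke and therefore one recognition attempt. This is not Graffiti's actual implementation; the event format and the function name are assumptions made for the example.

```python
def segment_strokes(pen_events):
    """Split a stream of pen events into strokes, one per pen-down/pen-up pair."""
    strokes, current = [], None
    for kind, point in pen_events:
        if kind == "down":
            current = [point]
        elif kind == "move" and current is not None:
            current.append(point)
        elif kind == "up" and current is not None:
            current.append(point)
            strokes.append(current)  # lifting the stylus ends the character
            current = None
    return strokes

# Hypothetical input: a user draws "B" as in ordinary handwriting, with a
# vertical bar followed by the two bowls, lifting the stylus in between.
events = [
    ("down", (0, 0)), ("move", (0, 5)), ("up", (0, 10)),
    ("down", (0, 10)), ("move", (3, 7)), ("move", (0, 5)),
    ("move", (3, 2)), ("up", (0, 0)),
]
strokes = segment_strokes(events)
characters_recognized = len(strokes)  # one recognition attempt per stroke
```

Because the stylus is lifted once in the middle of the letter, the sketch yields two strokes, which matches the observation that two letters appeared on the screen for a single intended character.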
Conclusions
In this project, we conducted an experiment to compare the speed and
accuracy and to observe the learning curve of Graffiti and keyboard tapping
during a common task on handhelds. Using 15 subjects with no knowledge
about Graffiti and keyboard tapping, we measured the time and the number
of errors while using those two data entry methods to enter 4 sets of addresses
into the address book on a Palm Pilot. During the first trial block, using
the keyboard led to faster entry and fewer errors. Although Graffiti
took much more time than keyboard tapping in the initial trial blocks,
the later results showed that the time to complete each trial block using
Graffiti decreased rapidly, nearly matching the times for keyboard
tapping. The number of errors made by the subjects remained higher for Graffiti
than for keyboard tapping. The two-way ANOVA tests showed that
the results are statistically significant.
Impact for Practitioners
Keyboard tapping is easier to learn, less error prone and faster than
Graffiti. It is also easier to correct errors. The main problem with
keyboard tapping is the excessive number of switches between screens during
a common application. To overcome this problem, manufacturers can try to
avoid these switches by employing some other mechanism, such as making
the keyboard accessible with no need to switch to some other screen. As
an example, Silkyboard [10] attempts to use
the same area for both data entry types and recognize which of them is used
at any time. Thus, it avoids switches between screens and allows both
data entry types to be used simultaneously.
Another suggestion for practitioners seeking faster data entry
is to combine the advantages of both Graffiti and keyboard tapping in
a new data entry method. This may include a mechanism for writing some
characters using Graffiti and some others using keyboard tapping, while
minimizing the switches between screens. As an example, TapPad [12]
expects Graffiti on the alpha side (alphabetical character side) of the
Graffiti area, as usual, but on the numeric side the users can use both
data entry methods.
Finally, the Graffiti alphabet may be revised
by replacing some difficult-to-write characters with others. The best
approach would be to develop a recognition system whose alphabet is exactly the
same as the Roman alphabet. This may be a difficult task, but it seems to be the
ultimate solution.
Suggestions for Future Researchers
We believe that using Graffiti for more time (i.e. more trial blocks) will
yield faster results than the keyboard because of the long delays incurred
when switching screens with the keyboard. To explore this, a
similar experiment should be conducted among experienced Palm Pilot users.
We also believe that an experiment that evaluates alternative
characters for the difficult Graffiti characters would be useful.
Our experiment results clearly indicate that some characters are too different
from most users' handwriting, and ways to improve their recognition
are worth exploring.
Refinement of the Theory
Our hypothesis about the learning curve has been supported by the experiment
results, which are statistically significant. However, the time for Graffiti
in the last trial block turned out to be longer than the time for the soft
keyboard, contrary to our initial hypothesis. We believe that this is
because:
-
We did not have enough subjects.
-
The total amount of time spent on Graffiti was not sufficient to become an expert.
If a similar experiment is conducted that takes the considerations above
into account, we believe that our hypothesis will be verified with statistically
significant results.