4. Discussion

4.1 Discussion of Task Completion Times

No statistically significant differences were found for any of the five tasks used in this experiment. We suspect that this is a result of testing only 21 subjects. We were surprised to see the wide variation in task times, especially on the tasks that we considered easy. One thing the experiment demonstrated is that people have different ways of thinking about and approaching problems, and what may appear easy to one user could be very difficult for another.

Task one was a simple search task. Most subjects completed this task in under a minute. However, one Textual TOC user took 218 seconds to complete the task. This had a major effect on the mean time and standard deviation for that interface. Without this subject, the mean time would be 43.2 seconds and the standard deviation would be 30.0 seconds. These results are much closer to the results for WebTOC. The fact that the mean time for Netscape was almost 20 seconds faster than that for WebTOC can be explained by WebTOC load times. WebTOC takes longer to load initially, and then to expand and contract, than a simple textual page takes to load in the Netscape browser.
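The effect of a single slow subject on a small group can be illustrated with a short sketch. The completion times below are hypothetical and are not the data collected in this study; only the 218-second value comes from the text, and the group size of seven is inferred from 21 subjects split across three interfaces. The helper names summarize and without_max are illustrative.

```python
from statistics import mean, stdev


def summarize(times):
    """Return (mean, sample standard deviation) of completion times in seconds."""
    return mean(times), stdev(times)


def without_max(times):
    """Return a copy of the times with the single largest value removed."""
    trimmed = list(times)
    trimmed.remove(max(trimmed))
    return trimmed


# Hypothetical times for seven subjects (seconds). NOT the study's data;
# only the 218-second outlier is taken from the discussion above.
times = [18, 27, 33, 45, 52, 95, 218]

full_mean, full_sd = summarize(times)
trim_mean, trim_sd = summarize(without_max(times))

print(f"all subjects:    mean={full_mean:.1f} s, sd={full_sd:.1f} s")
print(f"without outlier: mean={trim_mean:.1f} s, sd={trim_sd:.1f} s")
```

With only seven subjects per interface, removing one extreme value lowers both the mean and the standard deviation substantially, which is why the adjusted figures are reported alongside the raw ones.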

Task two was a simple sibling comparison task concerning size and content. WebTOC users could complete this task by comparing the size bars. Textual TOC users had to expand each link in the high-level portion of the index to find out how many links are in each portion. Netscape users were forced to physically go to each index page to determine how many index links are on that page. In some cases users simply compared the number of links at the upper level of the index. Whether they assumed each page below had the same number of links or didn't realize that there was a second level to the index is unknown. In general, subjects completed this task in 100 seconds or less. As expected, WebTOC users performed well here. The mean time for this interface was nearly 40 seconds faster than for Netscape. Surprisingly, Textual TOC was slower than Netscape. However, one user of this interface timed out (300 seconds) on the task. If we exclude this data point, the mean time becomes 51.5 seconds and the standard deviation becomes 25.5 seconds. The adjusted mean time is slightly better than the mean time for Netscape.

Task three was a complex search task. As expected, subjects took longer to complete this task than task one. The relative results of this task are similar to those for task one. However, we were surprised to see that users of both types of TOC browsing methods performed much worse than Netscape users. Netscape users completed the task on average 80 seconds faster than WebTOC users. We attribute this to the organization of the website and the loss of context experienced by WebTOC users. The Library of Congress's American Memory Site (like most LOC sites) has a two-level index. The first level has a series of "TO" and "FROM" entries. Each "FROM" entry links to a page that has alphabetically sorted links, starting with the "FROM" name on the first level of the index, and ending with the corresponding "TO" name. The "TO" link from the first level takes a user to the end of that particular page. WebTOC only displays the names of the links, and displays exactly one link to a given page, regardless of the number of bookmarks in that page. Therefore, when WebTOC users first expand the Subject Index, they see only the names of the five "FROM" entries and receive no indication that each of these leads to a page with more index entries. As a result, some TOC users expanded the subject index once, but when they saw that the Potomac River was not listed, they continued to look elsewhere. This problem was especially apparent for users who concentrated solely on the TOC frame of the screen.

Task four was a sibling comparison task. We were surprised to see that the WebTOC users performed worse on this task than the other subjects. Three WebTOC users timed out, while only one user timed out for each of the other two interfaces. However, one WebTOC user completed the task in 74 seconds, almost as quickly as the fastest Netscape user. We believe that loss of context was an even bigger problem for WebTOC users on this task than on task three. This is because it was not clear to users that a photographer would be considered an author. Thus, even if a WebTOC user decided to examine the author index, they would only see two names. Most of these users then continued their search along a different path of the hierarchy. Still, this does not explain why Textual TOC users performed as well as Netscape users. Perhaps testing more subjects would cause these results to converge with those of WebTOC users. In addition to this task being somewhat misleading, some subjects gave a correct answer to the task without correctly completing it. These subjects looked for William Henry Jackson in the subject portion of the index and found a single photograph that had him in it. Eventually they found Carleton E. Watkins in the author index and saw that there were 43 photographs. From this they concluded that Watkins had more photographs in the collection. However, they should also have looked for Jackson in the author index and discovered that he had 31 photographs in the collection before answering. For this reason, this task did not accurately measure what we wanted it to.

Task five was meant to involve searching for links that were embedded in text. To complete this task, Netscape users had to follow the link for 1872-1889 and then scan all of the associated text, looking for mention of "The Extermination of the American Bison". On the other hand, TOC users could simply expand the date link and scan a list of links for the answer. Therefore, we expected users of TOC interfaces to complete this task faster than Netscape users, but found that only the Textual TOC users were faster. This may be partially attributable to the fact that when there are many links, the bars in WebTOC add complexity to the display, increasing the cognitive load on the user. As mentioned in our background section, display density that grows more complex as the number of nodes increases is a general problem for flat 2D hierarchical displays. Another explanation for the result could be that most subjects did not complete this task by the method we intended. Instead they browsed the subject index and found an entry for bison. This gave them a link to the book, and from this link they could answer the question. The date links were at the bottom of the main page, and could only be seen if the user scrolled the screen. Perhaps more users would have followed the date link if they had realized it was there. Regardless, this task did not measure what we expected. For those users who did not follow the date link, it became a second complex search.


4.2 Discussion of Subjective Satisfaction

The results of the subjective satisfaction survey conformed for the most part to our expectations, though the ratings of the Textual TOC group were somewhat unusual.

Shown in Table 3.2 are the means and standard deviations for the various questions divided by subject group. ANOVAs were performed for the raw data from the surveys, but no significant results were found.
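For readers unfamiliar with the test, a one-way ANOVA compares the variation between group means with the variation within groups. The sketch below computes the F statistic from first principles; it is an illustration under stated assumptions, not the analysis script used for the study. The ratings are hypothetical, and the function name one_way_anova_f is illustrative.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical difficulty ratings for three groups of seven subjects.
# These are NOT the survey responses from the study.
webtoc = [1, 2, 2, 1, 3, 2, 1]
textual_toc = [2, 3, 1, 2, 4, 2, 2]
netscape = [1, 1, 2, 1, 2, 1, 2]

f_stat, df_b, df_w = one_way_anova_f([webtoc, textual_toc, netscape])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With three groups of seven subjects the degrees of freedom are (2, 18). At the conventional 0.05 level (an assumption here, since the significance level is not stated in the text), the critical value of F is about 3.55, so group means must differ considerably relative to the within-group spread before a result counts as significant.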

Question 1 asked the users to rate the difficulty of each task. The question was intended to show whether there is a correlation between task completion time and the subjective difficulty of the tasks. For tasks 1 to 3, lower task completion times were associated with lower subjective difficulty ratings. Tasks 1 and 2 were meant to be easy, to get subjects used to doing the tasks. Most of the users gave them a rating of 1 or 2. We expected WebTOC users to complete task 2 faster and to find it easier with the help of the size bars. Both the task time and the subjective rating, though not statistically significant, showed this expected result.
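One way to quantify the relationship between completion time and difficulty rating is the Pearson correlation coefficient. The following is a minimal sketch, assuming per-task mean times and mean ratings are available as two lists; the numbers are hypothetical and the function name pearson_r is illustrative.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-task means for one group; NOT the study's measurements.
mean_times = [40.0, 60.0, 150.0]   # seconds, tasks 1 to 3
mean_ratings = [1.5, 2.0, 3.5]     # subjective difficulty

r = pearson_r(mean_times, mean_ratings)
print(f"r = {r:.3f}")
```

A value near 1 means that tasks with longer mean times also received higher difficulty ratings. With only a few tasks, such a coefficient is descriptive at best and should not be read as a significance test.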

We suspect that Netscape users found task 3 easier than the other groups did because of the loss of context mentioned in the earlier discussion of task completion times.

For Task 4, subjects had difficulty finding the answer in the allotted time and therefore gave it a high difficulty rating.

We had expected Netscape users to find Task 5 difficult because of WebTOC's advantage in finding links quickly. Whereas Netscape users had to scan through multiple paragraphs of text to find the answer to Task 5, WebTOC users could merely look through a list of links without wasting time reading the text on the web page. Netscape users rated the task more difficult than the other groups did, but the task completion times actually showed otherwise. We provided an explanation for why WebTOC users may have taken longer in the discussion of task completion times.

As expected, Netscape users got used to the experimental setup the fastest, and found it easiest (survey question 2). However, it is remarkable that the subjects in the other two groups gave a mean rating of around 2.0, with a standard deviation of approximately 1.0. This means that the users did not have much difficulty learning WebTOC. This could be due to two reasons: the high level of computer experience within the groups, and the simple, intuitive nature of the application.

Although this study did not measure the feeling of disorientation directly, the subjective satisfaction survey did test a subjective degree of organization of the site. We can relate this measure to previous findings about disorientation effects in hypertext navigation. The higher the feeling of organization of a site, the smaller are the chances of disorientation, since if a user thinks that a site is well-organized after browsing through it, he or she would probably feel less disoriented while traversing the site. Our results show that users of WebTOC gave the highest degrees of organization to the web site. This suggests that a table of contents containing size bars increases user perception of the organization of a web site and thus reduces disorientation while browsing it.

In response to Question 4 on the survey ("Do you think an on-screen table of contents is a good idea for a web site?"), all the subjects, irrespective of group, responded affirmatively. This indicates that tools such as WebTOC are in demand by users, and is motivation for further research in the area.

Question 5 asked the subjects whether the table of contents helped them complete the tasks more quickly than they would have using a regular browser. Naturally, subjects in the Netscape group were not asked this question. Six of the seven subjects in each of the WebTOC and Textual TOC groups felt that the table of contents was useful to them. Again, this is a positive sign for WebTOC.

Some general observations about the results of the survey are appropriate. First, as tasks got more complex, WebTOC users seemed to handle them much better than Netscape users, who gave increasingly higher difficulty ratings to tasks in the latter part of the experiment. Secondly, it is fair to state that most users were already familiar with Netscape, so the training they were given was less significant than that given to the WebTOC group. With increased training, WebTOC users would likely find the tasks easier than they did. Thirdly, it is interesting that the mean difficulty ratings given by the subjects to the tasks are commensurate with the mean completion times recorded for the different groups.

Finally, the WebTOC group was most satisfied with the tools they were given for this experiment. This suggests that a hierarchical table of contents enhances a user's satisfaction with a web site, especially when supplemented with graphical indicators. We encourage further research into this area, as the results of our survey seem to demonstrate enthusiasm for tools like WebTOC.

In the comments section of the user satisfaction survey users made remarks about the web site, WebTOC and the experiment. Several subjects found the arrangement of the entries in the subject and author index of the website problematic and difficult to access. These complaints were made even though we specifically showed how the indices could be used during training and included a training task using the indices. This is a strong indication to the LC web site developers that another format should be considered for the indices. Many subjects found the long loading time distracting. One Netscape subject mentioned that having to scan through text to answer task #5 (link embedded in text) was difficult.

A Textual TOC subject mentioned that she would have used the TOC more if it were better organized. One user suggested that it might be easier to find a link through the TOC if the links were alphabetized. Another Textual TOC subject commented that there were too many words in the TOC. A WebTOC user suggested that the WebTOC entries would be more meaningful if they weren't just link titles. All these comments show that users found the organization of the table of contents problematic. Further development and testing should be conducted to identify a display that is easier to read.

One WebTOC subject noted that we had too few tasks that used the size bar, making it difficult to answer the user satisfaction question on its usefulness.


