Is Anthropomorphic Design a Viable Way of Enhancing Interface Usability?
Chapter Four: ‘Computer as Tool’ vs. ‘Computer as Friend’ - an Empirical Study into the Use of a ‘Celebrity’ Interface
After considering the arguments concerning anthropomorphism in interface design, along with its general lack of commercial success, and questioning users on their habits and views on actual and fictional interface models, it was decided that the best way to investigate the issue further would be to conduct user tests. Previous studies into the effects of humanlike design have tended to use computer-based games to test users. In this research a word processor was used, because the focus of interest was the everyday application of anthropomorphism rather than establishing the existence of interaction. A word processor was deemed a more appropriate representation of a ‘real world’ context.
4.1 Method
4.1.1 Design and Manipulation
The applications contained a simple range of functions that would be expected in a normal word processor. There was nothing ‘unusual’ about the graphical design: standard sets of icons were used and layout conventions were observed. Each program contained an integrated help system written in HTML.
In a similar test using humanlike and machinelike conditions (De Laere, Lundgren & Howe, 1998), the authors wrote the message text of the machinelike condition in capitals. This may have placed too much emphasis on the ‘machine’ and affected the results, as the use of capitals in text-based interactions is generally considered to be ‘shouting’ and could be regarded as hostile by users. Steps were taken during the design of Version B to avoid potentially negative influences such as these, which could unintentionally distort the results. The message text was written in a fluent style (Brennan and Ohaeri 1994) and addressed users as ‘you’ (see Fig. 4.1). Version B was essentially designed as a ‘control’ version, to be studied in comparison with Version A.
4.1.1.2 Lush Writer Version A - Computer as Friend
After examining data from the questionnaire and other literature, it was determined that the best way to apply anthropomorphism in this instance would be through text-based manipulation. This was based on the observation that users tend to interact with their computer holistically, as a ‘person.’ When interacting with a character on the screen, such as Clippy, they might feel as though they are dealing with a ‘friend within the friend’ (which could be considered an unnatural form of social interaction), rather than the ‘computer as friend’ (natural interaction). This difference between natural and unnatural interaction may explain why some research (Nass, Moon, Fogg, Reeves & Dryer, 1995; Nass, 1998; Isbister & Nass, 2000) has shown that users apply different social rules to on-screen characters than they do to characters implied by text-based manipulations.
Text was considered sufficient to stimulate the desired level of interaction: as mentioned in Chapter One, it has been well documented that only minimal cues are needed to create a computer ‘personality’ (Nass, Moon, Fogg, Reeves & Dryer, 1995). There was no need for artificial intelligence or natural language capabilities. The design also observed the principle of consistency (Isbister & Nass, 2000).
Results from the web-based questionnaire (Chapter 3) were used to select the ‘celebrity personality’ that would be exhibited by Version A. Although Bob Marley came top in the popularity scoring system, it was decided that his personality would be harder to implement, as the message text would have to be written in a Jamaican ‘accent’, which was considered more likely to irritate or possibly even offend users. Ultimately, Elvis Presley was used instead, because it was considered that he would be easier to characterise and more widely recognised by users.
Elvis’ personality was represented through the style of language used in the message text, the tool text and the help system.
The messages used the first person pronoun (‘I’ and ‘my’) – (defined as anthropomorphic by Brennan & Ohaeri, |
Acknowledging that other humanlike interfaces may have failed because they were too prominent, attempts were made in this prototype to ensure that the methods of interaction were more subtle and potentially less distracting or irritating to users. To balance out concerns that the system might consequently be too subtle and that users might ‘miss the point’, a ‘label’ (Nass, Reeves & Leshner, 1996) was used: the system introduced itself as ‘Elvis’ at the beginning of the session.
The participants were divided into two groups with evenly matched computer abilities. The user test consisted of two sections. The first section was intended to compare the usability of the two systems; the second was designed to gauge user reaction to each version. Participants followed a set of paper-based instructions and were not informed of the intention of the experiment, so that their judgment would not be prejudiced.
The first section of the test involved both groups completing the same two tasks (editing a letter, and typing in and editing a passage of text). One group used Version A and the other used Version B. Both versions recorded the user’s actions in a text file (stats.txt), so that the records could be compared to measure whether one version of Lush Writer was more usable than the other.
The second section of the test required users to switch versions (i.e. those who had been using Version A now used Version B, and vice versa) in order to complete a further two tasks. These tasks were different from the ones used in section one, but were matched in type and complexity. The SMOG readability formula was applied to ensure that both passages of text were of the same difficulty level. The tests were designed to be reasonably difficult. The passages of text contained Latin and Japanese phrases written in italics, in order to exercise the user’s dexterity. Both sections together were intended to take an average user about half an hour to complete.
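The SMOG check mentioned above can be illustrated with a short sketch. It is not the procedure actually used in this study, which is not documented here. It applies the published SMOG formula (grade = 1.0430 × √(polysyllables × 30 / sentences) + 3.1291) with a deliberately crude syllable counter. The function names and the one-grade tolerance are assumptions made purely for illustration.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Rough estimate: count runs of consecutive vowels (an approximation)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def smog_grade(text: str) -> float:
    """Return the SMOG grade of a passage using the standard formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        raise ValueError("text contains no sentences")
    words = re.findall(r"[A-Za-z']+", text)
    # Polysyllables are words estimated to have three or more syllables.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


def passages_matched(a: str, b: str, tolerance: float = 1.0) -> bool:
    """Hypothetical check: are two passages within `tolerance` grades?"""
    return abs(smog_grade(a) - smog_grade(b)) <= tolerance
```

SMOG was designed for samples of about 30 sentences, so scores for very short passages should be treated with caution, and the vowel-run syllable counter will misjudge some words.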
4.2 Results
4.2.1 User Test Section One - Usability and Productivity
When the results were compared, it was found that most participants completed the test more quickly using Version B (the non-anthropomorphic interface) than Version A. The average time taken to complete the tasks using Version A was 1125 seconds (approx. 19 minutes); the average using Version B was 1005 seconds (approx. 17 minutes).
When progressing through Section One, users were required to perform a set number of actions, which required a minimum of 12 elements to be selected in sequence. A user who was having difficulties with the program might click on an item more than once, or select several irrelevant elements whilst looking for the right item. To indicate a user’s proficiency with the system, the number of element selections made by each user in Section One was counted and evaluated against the minimum number of clicks required (the greater the number of clicks, the less proficient the user). The statistics gathered from users of Version A were then compared with those of Version B to see whether users were more proficient with one system than the other.
Fourteen participants used only 12 clicks to complete Section One, which was the minimum number required to finish the task. These users could be considered very proficient at using the system. The greatest number of clicks made was 21. The average number of clicks made by users of Version A was 13; the average for users of Version B was 15. On this measure, users of Version A were more proficient than those of Version B.
The average number of errors made by users of Version A was 6; the average for users of Version B was 4. This indicates that Version B may induce more accurate input than Version A.
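The click-based proficiency measure described above can be sketched as follows. The minimum of 12 selections comes from the text. The function names are assumptions, and the per-user counts are hypothetical values chosen for illustration, not the study’s data. The format of stats.txt is not documented here, so the counts are supplied directly rather than parsed from a log.

```python
from statistics import mean

MIN_CLICKS = 12  # minimum selections needed to complete Section One


def excess_clicks(clicks: int, minimum: int = MIN_CLICKS) -> int:
    """Clicks beyond the minimum; 0 means the user took the shortest path."""
    if clicks < minimum:
        raise ValueError("a completed task cannot use fewer than the minimum")
    return clicks - minimum


def summarise(click_counts: list[int]) -> dict[str, float]:
    """Summarise one group's click counts (lower values = more proficient)."""
    return {
        "mean_clicks": mean(click_counts),
        "mean_excess": mean([excess_clicks(c) for c in click_counts]),
        "at_minimum": sum(1 for c in click_counts if c == MIN_CLICKS),
    }


# Hypothetical per-user click counts, NOT the data collected in this study.
version_a = [12, 12, 13, 15]
version_b = [12, 14, 16, 18]

summary_a = summarise(version_a)
summary_b = summarise(version_b)
```

Comparing `mean_excess` rather than the raw mean makes the distance from the ideal 12-click path explicit, although both lead to the same ranking of the two groups.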
4.2.2 User Test Section Two - User Preference
4.2.2.1 Ease of Use
4.2.2.3 Enjoyment of Program
Figure 4.4 shows a breakdown of user enjoyment of Version A.
Users of Version B were more indifferent in their perception of the system, with 70% stating that they neither enjoyed nor disliked it (see Fig. 4.5).
4.2.2.4 Favourite System
4.3 Discussion
As with the questionnaire, the scale of this research was very small and it was not intended to be a definitive study, merely an exploration of the issues raised in the previous three chapters. The participants were mainly students or young professionals and are not considered representative of a cross section of society. One concern outlined in Chapter Three is that anthropomorphic characters may seem cute at first, then silly and ultimately distracting (Shneiderman & Plaisant, 2004: 487). Although the test was designed to be relatively long and monotonous, a more effective way to study the ultimate effects of humanlike characters would be to investigate how they are received in the long term. Unfortunately, it was not feasible to do so for the purposes of this report.
Version B was found to be faster to use than Version A. This may have been because ‘Elvis’ presented more messages, which would have taken longer to read and respond to and could also have distracted participants from the task. These findings correspond with those of previous studies. To date, little research into the effects of anthropomorphism has been able to demonstrate that it enhances usability. If anything, it has been shown to detract from productivity rather than increase it.
Of the participants who had used Version B first, 83% expressed a preference for Version A. This may suggest that after enduring the monotony of the first section, they were pleased to have a little light relief in the form of ‘Elvis.’ The fact that Version B was a ‘vanilla’ interface with no frills may also have influenced this decision, as Version A was more memorable. It should also be noted that Version A was more maligned, with 25% of users stating that they did not like it. It tended to provoke a stronger response from users than Version B, whose lack of significant characteristics caused 70% of users to assert that they neither liked nor disliked it.
Positive comments made by users included that they would like to be able to talk back to the system, that they would like a range of characters to choose from, and that they would like to be able to turn it off easily if it became annoying. Although the idea of the celebrity interface seemed obvious, it was rather hard to implement. (I am not an Elvis fan and would have preferred to ‘do’ Mr T or Ozzy Osbourne.) Trying to think of relevant ‘catchphrases’ and writing a help system in a southern twang proved quite difficult.
The current trends in customisation or personalisation of items such as computers, cars and mobile phones imply that if the ‘celebrity interface’ were implemented appropriately it could be successful. Formal methods would need to be developed in order to model the personality of a celebrity. If a living celebrity were used, it would probably be better to involve them during the design stage to create an authentic range of responses. Legal rights to the use of their name and so on would also need to be established. A huge assortment of phrases and words would have to be incorporated so that the user did not receive the same responses over and over again. A selection of celebrities (including ‘none’, for the 25% who did not want to hang out with anybody) would be necessary in order to cater for the diversity of users’ personal tastes.
Is it harder for users to dislike a character they already know? Microsoft’s endeavour to create a ‘new’ character that would be successful in its own right (like the California Raisins mentioned in Chapter Two) was probably a poor judgment. If they had used Bambi or Mickey Mouse they would have had to pay some royalties, but they might have been more successful. If users saw someone they ‘knew’ on the screen, it might make them more confident in their interaction.
If the celebrity interface were implemented effectively, with considered methods of interaction and accurate displays of personality, could it really be an effective method for calming users’ nerves and mitigating frustration?
© Alison Flind 2006