Abstract

Purpose - This paper seeks to present a comparative study of the traditional usability-testing process and the re-engineered usability-testing process for live multimedia systems.

Design/methodology/approach - Provides an overview of current usability-testing techniques and usability laboratory configurations, and identifies some gaps in the traditional usability-testing approach.

Findings - Traditional usability-testing procedures are suitable for testing systems in the static environment but prove to be sub-optimal in testing systems for dynamic (real-time) environments.

Originality/value - The traditional set-up is compared with an innovative laboratory configuration, which consists of three computer systems: the test system in the middle, augmented by a system on either side that functions as the scenario presenter and the data collection system, respectively. The re-engineered usability-testing process streamlined usability experiments and reduced task completion times.

Keywords Testing conditions, User studies, User interfaces, Multimedia

Paper type Research paper

Introduction

As computing technology infiltrates modern society, we interact with computers and electronic devices regularly. These devices are used in almost every aspect of our daily lives, including communications, entertainment, education, marketing, research, and health and medicine. As society becomes dependent upon efficient and effective interaction with various computer and electronic equipment, it becomes essential that research and development focus on creating more usable, user-friendly interfaces for interacting with such equipment.

Where there is a demand for research in usability engineering to contribute towards developing usable systems, there is a further need to develop refined procedures and methods for carrying out efficient and effective usability studies. Traditional usability procedures (Pearrow, 2000) and laboratories are mainly configured for performing usability studies in non-real-time software applications, comprising stored applications that require less stringent constraints on delay and other timing factors. These applications include websites, mobile devices, database software systems, etc. Traditional usability laboratories do not cater for conducting usability studies in real-time software applications that require stringent constraints on delay and timing factors. These include multimedia applications such as video on demand, video conferencing, telephony conferencing, etc. (Sharda, 1999).

While using the traditional usability-testing process we identified factors that hinder usability testing of real-time software applications. This led to the development of a new approach to usability testing, particularly suited to real-time software systems. We explain the new process for data collection, scenario synchronisation/task scheduling and task completion time/error count logging, based on our innovative usability laboratory configuration. Section 2 gives an overview of usability testing. Section 3 presents the traditional usability-testing process, and Section 4 explains the re-engineered usability-testing process. Sections 5 and 6 provide a conclusion and suggestions for further work, respectively.

Overview of usability testing

Usability testing involves measuring the quality of user experience in deploying a particular application, such as a software application, website, electronic device or mobile phone. According to the International Organization for Standardization (ISO), usability is the "effectiveness, efficiency and satisfaction with which a set of users can achieve a specified set of tasks in a particular environment" (Usability Net, 2003). Usability testing emerged during World War Two, when intensive research was carried out into the use of new technologies. It was found that good interface design increased the efficiency and performance of automated tasks. As technological advances led to the computer and telecommunication age, the need for research into developing usable equipment emerged, and a usability revolution transpired (The Usability Company, 2004). As a result of this extensive research, usability engineering, testing, studies, methodologies, evaluation techniques and laboratories have surfaced and evolved to a degree where we have international standards for producing usable equipment. In computer science, usability-testing methodologies have generally been developed for testing software applications and websites that operate in non-real-time environments. Current usability-testing procedures prove to be cumbersome and inefficient for usability studies of real-time systems.

Overview of our QoS research

We are currently developing a holistic three-layer QoS (TRAQS) model that defines three perspectives for the negotiation of QoS:

(1) user perspective;

(2) application perspective; and

(3) transmission perspective (Sharda and Georgievski, 2002).

The TRAQS model allows the user to specify the desired QoS by using a QoS parameter taxonomy and an application taxonomy (Georgievski and Sharda, 2003a). For the specification of the desired QoS a variety of graphical and physical interfaces can be used. In this research, combinations of various graphical user interface (GUI) elements and physical user interface (PUI) components are tested for the static[1] and dynamic[2] management of QoS (Georgievski and Sharda, 2003b). This research investigates how the user can interact with a multimedia application in real-time without interrupting the session, and the relationship between the GUI and PUI for such interactions. Similar research in usability testing includes the JUPITER2 project (Milszus, 1999).

In this paper, we use a specific usability experiment, namely "Experiment CTRL-Ia: Fundamental use of diverse combinations of PUI devices and GUI components for system control". This experiment aims to determine the most suitable PUI components for controlling various GUI elements. The participant interacted with a simulated interface (Figure 1) developed for this experiment. The user interface comprises the common GUI elements, such as push buttons, radio buttons and scrollbars. A scenario of ten tasks instructed participants to perform certain tasks using the simulated interface with a PUI device. Upon completion, the participant performed another scenario of tasks of the same nature with another PUI device (Georgievski and Sharda, 2003b). This was repeated for each PUI device. Data were collected via pre-experiment and post-experiment questionnaires. Error counts were recorded by the facilitator, and task completion times were recorded by the observer. We performed these experiments first using the traditional usability-testing process and then using our re-engineered usability-testing process. To obtain a well-founded comparison, the same user (a third-year student with good technical skills, studying for a Computer Science Bachelor's degree at Victoria University) performed the experiment under both the traditional and the re-engineered usability-testing processes.
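To make concrete what is recorded for each task, the following minimal Python sketch (an illustration only, not part of the experimental software) shows one possible structure for a per-task observation; the field names and example values are assumptions.

```python
# Illustrative record of one task observation: the device under test, the task
# number within the ten-task scenario, the completion time recorded by the
# observer and the error count recorded by the facilitator.
from dataclasses import dataclass

@dataclass
class TaskResult:
    participant_id: str       # anonymised participant identifier (assumed naming)
    pui_device: str           # PUI device under test, e.g. "joystick"
    task_number: int          # 1..10 within the scenario
    completion_time_s: float  # seconds, recorded by the observer
    error_count: int          # errors in this task, recorded by the facilitator

# Example: task 3 performed with the joystick in 42.5 s, with one error.
print(TaskResult("P01", "joystick", 3, 42.5, 1))
```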

Traditional usability-testing process

The traditional usability-testing process involved using the traditional laboratory configuration (Figure 2), which includes a participant room adjacent to an observer room. The participant room comprises a series of computer workstations, and the observer room comprises a monitoring system (Nielsen, 1993; Microsoft Corporation, 2002). Our earlier trials using the traditional procedure to conduct this experiment are presented in the following subsections.

Figure 1. Simulated GUI interface for testing PUIs

Figure 2. Traditional usability laboratory: (a) photograph of the participant room, (b) laboratory configuration

Laboratory configuration

We started with a laboratory configured using the traditional approach. In the observation room, we had an analog monitoring system comprising a four-channel video multiplexer giving four different camera views of the participant (Figure 3): side profile, front face, controller view and screen capture view.

The observation room was used to observe the session, record the camera views on a VCR, and manually record task completion times. The participant room comprised three test systems. The user performed the experiment on one of the test systems while reading the scenario from a hard copy, and completed the pre-experiment and post-experiment questionnaires on hard copies. The facilitator manually recorded the error count per task on paper. The laboratory configuration and the manual processes enabled us to complete the experiments successfully. However, we found that the users:

Figure 3. Monitoring system screen capture

* became tired of having to read the scenarios from the hard copy; and

* found it difficult to keep track of numerous tasks, as each scenario comprised multiple tasks.

This adversely affected the task completion times. Recording the questionnaire responses was time-consuming, and users found it tiring to complete the questionnaires after each scenario. The facilitator found it difficult to keep track of each task performed and the error counts per task. The same applied to the observer: it was difficult to keep track of each activity and record task completion times in real time. This usability set-up proved to be inefficient and added unwelcome workload and delays in recording, transcribing and managing data.

Scenario synchronisation/task scheduling

Real-time recording of task completion times and error counts per task hampered the synchronisation between the participant, facilitator and observer, and it was difficult to keep track of each activity. This resulted from the fact that all scenario tasks were read from hard copy by each person and, furthermore, the observer and facilitator were required to record data on hard copy in real time. At times the participants became confused. Synchronising the start of the experiment with the video recording became an issue, as participants were not aware whether recording was taking place. It was also difficult to randomise the scenario tasks to keep the user from remembering each task, which would affect the usability results.

Task completion time/error count logging

The facilitator and the observer were responsible for recording the error count and the task completion times, respectively. Recording results on hard copy and keeping track of the user activities produced many invalid results, which required reviewing the recorded data to validate all data. This also proved to be time-consuming.

Data collection

As all data were recorded on hard copy, it proved very time-consuming to collate and transcribe them later into Microsoft Excel for analysis. For this experiment, manually collating the results for five participants took a minimum of eight hours, and transcribing the data into Microsoft Excel was error-prone. Frequent verification and validation of all data were required to eliminate errors.

Consequently, carrying out collation and transcription of results for more people and for multiple experiments seemed cumbersome. Performing the entire experiment for one participant - which includes usability testing each device - and recording and collating the results required approximately ten hours. To perform this experiment using ten participants would take 100 hours and the results would be prone to errors. From this experience we realised that a more integrated and automated process was required.

Impact on experiment using the traditional usability-testing process

Using the traditional process for carrying out usability experiments proved to be error-prone, cumbersome and time-consuming. This affected the performance of the experiment personnel. It proved difficult to manage and validate the recorded data. Frequent reviewing of the video recording was required to confirm data validity. The overall impact of the traditional process was twofold:

(1) maintaining synchronisation between the task being carried out by the user and recording its data (by the observer and monitor) was very difficult; and

(2) collating the recorded data was time-consuming and error-prone.

The following sections expound our re-engineered usability-testing process, which improves on the traditional testing process.

Re-engineered usability-testing process

Our novel usability-testing process involved reorganising the usability laboratory and re-engineering the testing process. The observer and participant rooms remained the same; however, the hardware and software in the participant room had to be reconfigured.

Laboratory configuration

Our re-engineered testing process, in conjunction with the reorganised usability laboratory, enhanced the efficiency and effectiveness of the usability experiments. As shown in Figure 4, the equipment in the participant room was repositioned. The new placement uses test systems 2 and 3 for presenting the scenario and for online recording of the participants' responses to the questionnaires. This configuration comprises the test system augmented by a scenario presentation system and a data collection system. The scenarios are presented as a Microsoft PowerPoint presentation on the scenario presentation system. This eliminates any confusion in keeping track of completed tasks, as the participant steps through the scenario one task at a time.

Each time the participant completes a task, he/she presses the space bar to display the next task. The data collection system stores the pre-experiment and post-experiment questionnaire responses on its local hard disk, and automated backup software copies these data to the monitoring computer in real time.
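As a minimal sketch of this real-time backup step, the following Python fragment copies new or changed questionnaire files from the data collection system's local folder to a share on the monitoring computer. The folder paths, file pattern and five-second polling interval are assumptions; no particular backup tool is prescribed here.

```python
# Poll the local responses folder and mirror new or modified questionnaire
# files to the monitoring computer's backup share.
import shutil
import time
from pathlib import Path

SOURCE = Path("C:/usability/responses")            # local hard disk (assumed path)
TARGET = Path("//monitoring-pc/backup/responses")  # monitoring computer share (assumed)

def backup_changed_files() -> None:
    TARGET.mkdir(parents=True, exist_ok=True)
    for src in SOURCE.glob("*.xls*"):
        dst = TARGET / src.name
        # Copy a file only if it is new or has changed since the last pass.
        if not dst.exists() or src.stat().st_mtime > dst.stat().st_mtime:
            shutil.copy2(src, dst)

if __name__ == "__main__":
    while True:
        backup_changed_files()
        time.sleep(5)  # assumed polling interval
```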

The monitoring computer runs an online timer (Figure 5). The observer simply clicks the mouse button each time a task is completed. The timer records the log files in a Microsoft Excel spreadsheet, which enables easy data manipulation for further analysis. The observer is now able to view the scenarios and tasks displayed on the scenario presentation system remotely, via the video monitoring system. This facilitates close synchronisation of experiment activities with the participant's actions. A fully automated process for task completion time logging is yet to be developed. The repositioning of the laboratory equipment also enables the facilitator to view the scenario displayed on the scenario presentation system remotely, making it easier for the facilitator to maintain synchronisation with the activities taking place inside the participant room.
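The online timer can be approximated by the following minimal Python sketch: the observer marks each task completion (here with the Enter key standing in for the mouse click), and the per-task and cumulative times are written to a CSV file that opens directly in Microsoft Excel. The file name and the fixed count of ten tasks are assumptions.

```python
# Record task completion times for a ten-task scenario and save them in a
# spreadsheet-friendly CSV log.
import csv
import time

NUM_TASKS = 10
LOG_FILE = "task_times_P01_joystick.csv"  # assumed naming convention

def run_timer() -> None:
    rows = []
    start = previous = time.perf_counter()
    for task in range(1, NUM_TASKS + 1):
        input(f"Press Enter when task {task} is completed...")
        now = time.perf_counter()
        rows.append({"task": task,
                     "task_time_s": round(now - previous, 2),
                     "elapsed_s": round(now - start, 2)})
        previous = now
    with open(LOG_FILE, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["task", "task_time_s", "elapsed_s"])
        writer.writeheader()
        writer.writerows(rows)

if __name__ == "__main__":
    run_timer()
```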

Figure 4. New usability laboratory set-up: (a) photograph of the participant room, (b) laboratory configuration

Figure 5. Task completion time logging screen

Scenario synchronisation/task scheduling

For controlled usability experiments, scenarios are used to present a situation in which appropriate actions are to be taken to complete a specific function. These actions in a scenario are expressed as a list of tasks; for this experiment, each scenario contained a series of ten tasks. Task scheduling involves the participant executing each task in the given order. Using the traditional approach, it was difficult to maintain synchronisation for each task, as some tasks were completed very quickly by the user. This did not give the observer and facilitator enough time to record error counts and task completion times, and it disrupted the synchronisation of activities between the participant, observer and facilitator. Our re-engineered process has enabled easy synchronisation of scenarios and task scheduling by streamlining the testing process. Storing data in electronic form facilitates the randomisation of scenarios and tasks, as illustrated below. Synchronising the start of the video recording with the start of the experiment currently remains an issue. However, performing this task manually does not have any detrimental effect on the performance of the experiment; nonetheless, automating this function would enhance the testing process.
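A minimal sketch of such randomisation is given below: seeding the random generator on the participant and device identifiers keeps each ordering reproducible for later analysis. The seeding scheme and task labels are assumptions, not the exact procedure used in the experiment.

```python
# Produce a reproducible, per-participant randomised ordering of the ten
# scenario tasks so that users cannot anticipate the sequence.
import random

TASKS = [f"Task {i}" for i in range(1, 11)]  # the ten scenario tasks (placeholder labels)

def randomised_order(participant_id: str, device: str) -> list:
    rng = random.Random(f"{participant_id}-{device}")  # reproducible per participant/device
    order = TASKS[:]
    rng.shuffle(order)
    return order

print(randomised_order("P01", "joystick"))
print(randomised_order("P01", "trackball"))
```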

Task completion time logging

We have improved the process of real-time logging of the task completion times by installing an online timer on the monitoring computer. Storing the recorded times in Microsoft Excel spreadsheets makes it easy and efficient to collate the data and analyse the results. An improvement to task completion time logging would require a software application synchronised with the presentation on the scenario presentation system.

Data collection

We have enhanced the process of recording participants' responses through the use of the data collection system, on which the questionnaires are loaded as Microsoft Excel spreadsheets. These spreadsheets are locked down, in that the participant can only alter information in unlocked fields. This protects the questionnaire data that should not be altered, has proven very efficient, and facilitates the validation of data. It also reduces the chances of making errors or losing data by eliminating the manual transcription of data into Microsoft Excel, and backup software that copies the data in real time prevents data from being lost.
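A locked questionnaire spreadsheet of this kind can be produced with a short script such as the following sketch, which uses the openpyxl library to protect the sheet and unlock only the answer cells. The question wording and cell layout are illustrative assumptions.

```python
# Build a questionnaire workbook in which every cell is locked except the
# answer cells next to each question.
from openpyxl import Workbook
from openpyxl.styles import Protection

QUESTIONS = [
    "How easy was the device to control? (1-5)",
    "How tiring was the scenario? (1-5)",
    "Any other comments?",
]

wb = Workbook()
ws = wb.active
ws.title = "Post-experiment"
ws.protection.sheet = True  # lock all cells by default

for row, question in enumerate(QUESTIONS, start=1):
    ws.cell(row=row, column=1, value=question)
    answer_cell = ws.cell(row=row, column=2)
    answer_cell.protection = Protection(locked=False)  # participant may edit this cell

wb.save("post_experiment_questionnaire.xlsx")
```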

Using this novel approach to collate the data for multiple participants and experiments proves to be systematic and less time-consuming. To perform the entire experiment for one participant, which includes the usability test for each device, recording and collating the results takes approximately four hours. To perform this experiment using ten participants would take 40 hours, as against 100 hours for the traditional approach.

Impact on experiment using the re-engineered usability-testing process

Using our re-engineered approach to conduct the usability experiment proved to be quite efficient and efficacious. Reorganisation of the laboratory assisted in streamlining the process for reading the list of tasks in a scenario, performing the tasks on the test system, and finally completing the pre-experiment and post-experiment questionnaires. Reconfiguration of the hardware and software assisted the process of synchronising the activities between the facilitator, observer and participant. Storing data directly in electronic form and using a software system to back up the data in real time assisted the process of data collection, transcription, and management. Using the online timer and logging software for recording task completion times improved the process of maintaining accurate time logs and synchronising the activities between the observer and the participant.

In Figure 6, we present a comparison of the task completion times for a controlled experiment performed first using the traditional approach and then using our novel approach. In this experiment a participant performed a scenario of ten tasks using one PUI device (a joystick).

In each task the user performed a series of actions that required clicking on objects and navigating the GUI, and each task was equally "easy" to complete. We considered an "error" to be an action performed by the user that was not given in the scenario, or a mistake made within the process followed for carrying out the experiment, which included executing the instructions given in the scenario and filling in the feedback questionnaires.

As shown, the participant made frequent errors using the traditional approach and also took much longer to perform each task. The experiment took 16 minutes and 53 seconds to complete using the traditional approach, and only 10 minutes and 54 seconds using our re-engineered process, a reduction of roughly 35 per cent.

Using the traditional approach, a total of ten errors was made within the process of conducting the experiment, compared with nine errors using our re-engineered process (Figure 7). In such a small-scale experiment this difference may not be of high significance; however, when carrying out large-scale usability tests, minimising all possible errors and maintaining the validity of data is of high significance.

As shown in Figure 7, using the traditional usability-testing process two errors were made in task 3 and another two in task 4, and no errors were made in tasks 5 and 6. Using the re-engineered usability-testing process, one error was made in task 5 and another in task 6, and no errors were made in tasks 3 and 4. This comparison is inconclusive because the experiment was performed on a small scale. However, we can deduce that the number of errors encountered in the process of conducting a usability experiment depends on external factors such as the experience, knowledge, skill, and psychological and physical state of the individuals participating in and conducting the usability experiment.

Figure 6. Comparison of the task completion times for the traditional and re-engineered usability-testing approaches

Figure 7. Comparison of the error counts for the traditional and re-engineered usability testing

An improvement to this process would be to develop software that monitors the participant's activities, compares them with the tasks presented in the scenario, and automatically logs the completion time for each task. To ease the facilitator's task of recording error counts, we recommend a monitoring application installed on the test system that observes the activity performed, compares it with the tasks presented in the scenario, and automatically logs any errors, as sketched below.
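A minimal sketch of such an error logger follows: each action the participant performs is compared with the action expected for the current task; a match advances to the next task, while any other action is logged as an error. The action vocabulary and the exact-match rule are assumptions made for illustration.

```python
# Compare observed user actions with the expected action for the current
# scenario task and log mismatches as errors.
from dataclasses import dataclass, field

@dataclass
class ErrorLogger:
    expected_actions: list                      # one expected action per task, in order
    errors: list = field(default_factory=list)  # (task number, stray action) pairs
    current_task: int = 0

    def record_action(self, action: str) -> None:
        expected = self.expected_actions[self.current_task]
        if action == expected:
            self.current_task += 1  # task completed; move on to the next task
        else:
            self.errors.append((self.current_task + 1, action))  # log the stray action

logger = ErrorLogger(["click OK button", "drag scrollbar", "select radio 2"])
for observed in ["click OK button", "click Cancel button", "drag scrollbar"]:
    logger.record_action(observed)
print(logger.errors)  # [(2, 'click Cancel button')]
```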

Conclusion

We have presented an analysis of the traditional usability-testing process for carrying out usability tests and identified its shortcomings for testing real-time systems. We addressed the issues involved in using the traditional approach by providing a re-engineered usability-testing process. The traditional usability-testing process may be suitable for carrying out usability tests for non-real-time applications such as websites, mobile devices, and database software systems. However, for testing real-time systems, it proves to be inadequate. Our re-engineered process addresses these issues by improving data collection, scenario synchronisation/task scheduling and task completion time/error count logging as well as the laboratory set-up. Needless to say, even this new process has room for improvement.

Further work

We have further enhanced our re-engineered approach by restructuring the usability laboratory, which has enabled us to minimise the number of personnel required to carry out a usability experiment. This involved shifting the monitoring system into the participant room, where the facilitator has a clear view of the monitoring system and can therefore also record task completion times on a notebook computer. This enhancement has proved viable in our testing environment, where it has enabled us to streamline the process of conducting usability tests further. Further testing, refinement and publication of results using this process are yet to be completed.

As a part of ongoing enhancements to our re-engineered usability-testing approach, we propose developing software that can automate the process of synchronising duties for the participant, the observer and the facilitator. This software system could perform error logging, task completion logging, video recording, and management (backup) of recorded results. We can foresee that such a system would enable usability testing to evolve into the next generation, where fewer personnel would be required to carry out the experiments.

This would lead to many benefits, such as minimising the costs of running usability tests, enabling more thorough testing of new technology, and thus producing better user-oriented technology and enhancing human-computer interaction (HCI).


Notes

1. "Static" QoS management means configuring the QoS in non-real time.

2. "Dynamic" QoS management means monitoring and controlling the QoS in real time.

References

Georgievski, M. and Sharda, N. (2003a), "A taxonomy of QoS parameters and applications for multimedia communications", paper presented at the International Conference on Internet and Multimedia Systems and Applications, MSA 2003, Waikiki, HI.

Georgievski, M. and Sharda, N. (2003b), "Usability testing for real-time QoS management", paper presented at the 24th IEEE International Real-Time Systems Symposium, RTSS2003, Cancun.

Microsoft Corporation (2002), "Usability research, usability labs", available at: www.microsoft.com/usability/tour.htm (accessed July 2004).

Milszus, W. (1999), "Comparison of usability testing methods", Project P807, JUPITER2 - Joint Usability, Performability and Interoperability Trials in Europe, European Institute for Research and Strategic Studies in Telecommunications, EURESCOM, Heidelberg, available at: www.eurescom.de/public-webspace/P800-series/P807/index.html (accessed July 2004).

Nielsen, J. (1993), Usability Engineering, Academic Press, Boston, MA.

Pearrow, M. (2000), Web Site Usability Handbook, Charles River Media, Rockland, MA.

Sharda, N. (1999), Multimedia Information Networking, Prentice-Hall, Englewood Cliffs, NJ.

Sharda, N. and Georgievski, M. (2002), "A holistic quality of service model for multimedia communications", paper presented at the International Conference on Internet and Multimedia Systems and Applications, MSA2002, Kaua'i, HI.

(The) Usability Company (2004), "History of usability", available at: www.theusabilitycompany.com/resources/history.html (accessed July 2004).

Usability Net (2003), "International standards for HCI and usability", available at: www.usabilitynet.org/tools/r_international.htm (accessed July 2004).

Author affiliation

M. Georgievski and N. Sharda

School of Computer Science and Mathematics, Victoria University, Melbourne, Australia

Copyright Emerald Group Publishing, Limited 2006