Web design (7): Evaluation


A collaborative medium, a place where we all meet and read and write.
Tim Berners-Lee

[Part 7 of 7 : 0) intro, 1) story, 2) pictures,  3) users, 4) content, 5) structure, 6) social media, 7) evaluation]

Even though evaluation is the final part of this series, it should not be left to the end of any software project. Ideally, evaluation should be used throughout the life cycle of a project to assess the design and the user experience, and to test whether the system works and meets user requirements without creating unexpected results or confusion.

Expert analysis

Expert (or theoretical) analysis works from a detailed description of the design, which doesn’t have to be implemented. The description is used to build a model of the user’s activity, and the analysis is then performed on that model.

It is one way of assessing whether a design follows good usability principles. It cannot guarantee anything, but it can flag up design flaws before time and money get spent on implementation.

Expert analysis is best used during the design phase and experts can assess systems using:

Heuristics are rules of thumb and not true usability guidelines. Usability expert Jakob Nielsen developed 10 usability heuristics in 1995 and they are still widely used and quoted today. Design consultant Ari Weissman says that heuristics are better than no testing at all, but that to say they can replace getting to know your users and understanding them is just silly. Researchers at the University of Nebraska found that heuristic evaluation and user testing complement each other and are both needed.

Review-based evaluation uses principles from experimental psychology and human-computer interaction (HCI) literature to provide evaluation criteria for areas such as menu design, command names, icons and memory attributes, which can then support or refute design decisions. Reviews may even use style guidelines provided by big companies such as Microsoft and Apple.

Model-based evaluation uses a model to evaluate software. This model might be taken from HCI literature, such as GOMS (developed by Stuart Card with Thomas Moran and Allen Newell) or Ben Shneiderman’s Eight golden rules of dialog design.

Cognitive walkthroughs are step-by-step inspections which concentrate on what the user is thinking whilst learning to use the system. Alas, it is the analysts who act as the user and try to imitate what the user is thinking. Walkthroughs can be used to help develop user personas.

The main criticism of walkthroughs is that novice users are often forgotten about: analysts have lots of experience, and their pretending to be users can introduce all sorts of bias into the evaluation. The advantage of this approach is that unclear areas of the system design can be flagged up easily and fixed cheaply, early in the life cycle.
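As a rough illustration of the model-based evaluation described above, here is a minimal sketch of the Keystroke-Level Model (KLM), a simplified member of the GOMS family. It is an assumption that KLM suits your design; the operator times are the commonly cited textbook averages, and the task sequence and the function name estimate_task_time are hypothetical, made up for this example.

```python
# Keystroke-Level Model (KLM) sketch: estimate how long a skilled user
# needs for a task by summing the times of primitive operators.
# The times (in seconds) are commonly cited textbook averages, not measurements.
OPERATOR_SECONDS = {
    "K": 0.28,  # press a key (average typist)
    "P": 1.10,  # point at a target with the mouse
    "B": 0.10,  # press or release a mouse button
    "H": 0.40,  # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation before an action
}


def estimate_task_time(operators: str) -> float:
    """Return the summed operator time for a sequence such as 'MPBB'."""
    return round(sum(OPERATOR_SECONDS[op] for op in operators), 2)


# Hypothetical task: think, point at a search box, click (press + release),
# move hands to the keyboard, think, type a five-letter query, press Enter.
search_task = "M" + "P" + "BB" + "H" + "M" + "K" * 5 + "K"

if __name__ == "__main__":
    print(f"Estimated time: {estimate_task_time(search_task)} seconds")
```

Comparing the estimate for two alternative designs of the same task gives a cheap first indication of which is quicker for an experienced user. It says nothing about novices or about errors.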

Using your user: user testing

The most informative types of evaluation always take place with the user. This can happen in the laboratory or in the field. In the laboratory, usability consultants have a script, such as this one by usability expert Steve Krug. The usability consultant asks the user either to do whatever they are drawn to do, or to perform a specific task, such as buying a product on the site, whilst talking aloud. This think-aloud protocol identifies not only what the problem is, but also why it occurs. The best thing about usability testing is that clients can hear a user saying something which may be obvious to the consultant but not to the client, and which the client might not believe if the consultant just told them. Co-operative evaluation is a very similar technique to usability testing.

Outside the laboratory, you can shadow users in the workplace to see how they interact with your software, or with the current software that your new software will hopefully improve upon. This is ethnography, a way of learning about the context in which your users work. It can be very expensive and time-consuming to hire ethnographers to go into users’ workplaces.

A cheap and cheerful way of reproducing this shadowing is to get the users to keep a diary or blog, known as a cultural probe. Cultural probes are quick and easy to put together using open-ended questions, which encourage users to say all the things they might not say during a testing session.

Empirical evaluation

Another relatively cheap and cheerful method is to get your user group to fill out a questionnaire or a survey in order to get their feedback.

The questionnaire needs to be designed very carefully, following these instructions, otherwise you can end up with a lot of information, but nothing tangible. The main advantage is that you get your users’ opinions and you can measure user satisfaction quite easily.

The disadvantage is that it is hard to capture certain types of information in a questionnaire, such as the frequency of a system error, or the time taken to complete a task.
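To show how satisfaction might be measured from questionnaire answers, here is a minimal sketch which summarises the ratings for one statement on a five-point scale. The ratings, the statement and the function name summarise_likert are all hypothetical, and treating 4 or above as agreement is an assumption.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical answers to one statement, e.g. "The site was easy to use",
# rated from 1 (strongly disagree) to 5 (strongly agree). Illustrative only.
responses = [4, 5, 3, 4, 2, 5, 4, 4, 1, 5]


def summarise_likert(answers, agree_threshold=4):
    """Return simple satisfaction figures for a list of 1-5 ratings."""
    counts = Counter(answers)
    agreeing = sum(1 for a in answers if a >= agree_threshold)
    return {
        "n": len(answers),
        "mean": round(mean(answers), 2),
        "median": median(answers),
        "percent_agree": round(100 * agreeing / len(answers), 1),
        # Show every point on the scale, including ones nobody chose.
        "distribution": {score: counts.get(score, 0) for score in range(1, 6)},
    }


if __name__ == "__main__":
    for key, value in summarise_likert(responses).items():
        print(key, value)
```

Looking at the distribution as well as the mean is a deliberate choice: a mean in the middle of the scale can hide a split between very happy and very unhappy users.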


Computers can collect usage statistics to answer questions such as how long a task takes and how often system errors occur. Web stats are a great way of seeing this sort of information, as well as which pages are the most attractive and most useful to users. Eye-tracking software and click captures are also useful ways of collecting data. However, care needs to be taken not to introduce any bias in the interpretation of this data.
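As a sketch of the kind of statistics described above, the following computes time on task and error frequency from a usage log. The log format, the field names, the values and the function name task_stats are all assumptions made up for illustration; a real analytics tool will export something different.

```python
from statistics import mean

# Hypothetical usage log: one record per attempt at a task.
# Times are in seconds; all values are invented for illustration.
events = [
    {"user": "u1", "task": "checkout", "start": 0.0, "end": 42.5, "errors": 1},
    {"user": "u2", "task": "checkout", "start": 0.0, "end": 61.0, "errors": 3},
    {"user": "u3", "task": "checkout", "start": 0.0, "end": 38.5, "errors": 0},
    {"user": "u1", "task": "search", "start": 0.0, "end": 12.0, "errors": 0},
]


def task_stats(records, task):
    """Return attempt count, mean duration and errors per attempt for a task."""
    rows = [r for r in records if r["task"] == task]
    if not rows:
        raise ValueError(f"no records for task {task!r}")
    durations = [r["end"] - r["start"] for r in rows]
    total_errors = sum(r["errors"] for r in rows)
    return {
        "attempts": len(rows),
        "mean_seconds": round(mean(durations), 1),
        "errors_per_attempt": round(total_errors / len(rows), 2),
    }


if __name__ == "__main__":
    print(task_stats(events, "checkout"))
```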

Informal evaluation

Informal evaluation methods can be useful, in the design stage for example, but are better suited to research, as they do not always yield results which can be used to guide design.

Focus groups: This is when you get a group of users together to discuss subjects led by a moderator. Focus groups can be useful, but they can lead to users telling you what they think they want, rather than what they need. As this 2002 paper asks: Are focus groups a wealth of information or a waste of resources?

Controlled experiments test a hypothesis, like this great example: College students (population) type (task) faster (measurement) using iPad’s keyboard (feature) than using Kindle’s keyboard. You identify the independent and dependent variables, then collect data on them by testing in a simulation of a real-world situation, such as a college where iPads and Kindles are used.
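Once the data from such an experiment is in, the hypothesis can be checked statistically. The sketch below uses a one-sided permutation test, which is just one reasonable choice and not the only one. The typing speeds are invented for illustration and are not real results, and the function name permutation_test is hypothetical. Here the device is the independent variable and words per minute is the dependent variable.

```python
import random
from statistics import mean

# Invented typing speeds in words per minute. NOT real measurements.
ipad_wpm = [31, 28, 35, 30, 33, 29]
kindle_wpm = [22, 25, 20, 27, 24, 23]


def permutation_test(group_a, group_b, rounds=2000, seed=0):
    """One-sided permutation test: is mean(group_a) greater than mean(group_b)?

    Returns the observed difference in means and an estimated p-value: the
    share of random regroupings whose difference is at least as large.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / rounds


if __name__ == "__main__":
    difference, p_value = permutation_test(ipad_wpm, kindle_wpm)
    print(f"Difference in means: {difference} wpm, estimated p-value: {p_value}")
```

A small p-value suggests the difference is unlikely to be down to chance alone, but it cannot make up for a badly designed experiment, for example one where the two groups differ in typing experience.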

No matter how great your website or software system is, it can always be improved by some method of evaluation. There are many methods, involving users and experts, to make your system as good as it can be throughout the whole life cycle of your website or your software. Evaluation is the only way to identify and correct design flaws.