When conducting a usability lab, the "garbage in, garbage out" rule applies. Standard lab protocol should of course be followed, but beyond that it's important to define in advance, for every scenario, what counts as success and what counts as failure. For instance,
There are no "right" answers; this is something the project team needs to discuss. But it is important to know how you are going to classify each of these scenarios, because that classification will ultimately affect your statistics.
The practice of usability is the practice of predicting user behavior. Creating recommendations for improvement based on observations of users in the lab is the application of past experience (lab observations) to improve future interaction (how the same task will be completed by the user population at large). But prediction is risky, and all sorts of things come into play when drawing conclusions from a (small) sample set of data. In fact, the smaller the sample, the greater the chance that the observed results don't accurately reflect the true population.
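To make the sample-size effect concrete, here is a minimal simulation sketch. The true completion rate of 85%, the sample sizes, and the function names are all assumptions chosen for illustration, not values from a real study. It runs many imaginary lab studies at each sample size and measures how widely the observed completion rates scatter.

```python
import random
import statistics


def simulate_completion_rates(true_rate, sample_size, trials, rng):
    """Return the observed completion rate from each simulated lab study."""
    rates = []
    for _ in range(trials):
        # Each participant passes with probability true_rate.
        passes = sum(1 for _ in range(sample_size) if rng.random() < true_rate)
        rates.append(passes / sample_size)
    return rates


def spread_by_sample_size(true_rate=0.85, sizes=(5, 10, 50), trials=2000, seed=0):
    """Standard deviation of observed rates for each sample size.

    true_rate is a hypothetical population completion rate; in a real
    study it is exactly the number we do not know.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return {
        n: statistics.pstdev(simulate_completion_rates(true_rate, n, trials, rng))
        for n in sizes
    }


if __name__ == "__main__":
    for n, spread in spread_by_sample_size().items():
        print(f"sample size {n:>2}: observed rates scatter by about {spread:.1%}")
```

The scatter shrinks as the sample grows, which is the sense in which a small lab study is a riskier basis for prediction than a large one.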
We conduct research on sample populations because testing the entire population of users is prohibitively expensive, if not practically impossible. But consider this scenario. In our example, 9 of the 10 sample participants successfully completed the task and 1 participant failed. Let's pretend that we then test all users and find that, out of the entire population, 25 users pass the scenario and 4 fail.
Our sample gives us a completion rate of 90%, but the ACTUAL completion rate (25 of 29 users) is a lower 86%.
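The arithmetic behind this comparison can be checked with a short sketch that works directly from the pass/fail counts in the scenario. The helper name completion_rate is an assumption introduced here for illustration.

```python
def completion_rate(passed, failed):
    """Fraction of participants who completed the task."""
    total = passed + failed
    if total == 0:
        raise ValueError("need at least one participant")
    return passed / total


# Counts taken from the scenario: 9 pass / 1 fail in the sample,
# 25 pass / 4 fail in the (pretend) full population.
sample_rate = completion_rate(9, 1)
population_rate = completion_rate(25, 4)

print(f"sample completion rate:     {sample_rate:.1%}")      # 90.0%
print(f"population completion rate: {population_rate:.1%}")  # 86.2%
print(f"gap: {(sample_rate - population_rate) * 100:.1f} percentage points")  # 3.8
```

Keeping the raw counts next to the percentage makes it easy for a reader to see how small the underlying sample is.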
Why the discrepancy? Well, a few things could account for the difference, namely (1) sampling error or (2) bias. If we know there's a sampling error or bias, we can correct for it. More vexing are the errors we don't know about. We can't correct for what we don't know, but we can't ignore it either.
So the way we quantify the limits of our knowledge is to be transparent and precise about what our statistics mean.
To be complete, statistical conclusions should always include:
Overwhelmed already? Don't worry; the rest of this tutorial walks through the entire process step-by-step.