CMP2019M Human-Computer Interaction
Week 7 – Intro to Evaluation in HCI
Last Week
• What is Prototyping?
• Prototyping in Computer Science
• Prototyping in HCI
• Mock exam
Reading
This Week
• Evaluation in Human-Computer Interaction
– Part 1: Background
– Part 2: Methods
• Inspection vs. testing
• Qualitative vs. quantitative
• Lab vs. field
– Part 3: Functionality and Experience
– Part 4: Ethics in HCI
1. Background
Remember: HCI explores how people interact with technology.
How can we find out whether this interaction works out as intended?
Evaluation in HCI
• Put our systems to the test – evaluate them: a lot of work in HCI can only be tested in action, because value emerges from interaction.
• Continuous process throughout development cycle – that is also why we need prototypes and iterative development!
• Design and evaluation are complementary aspects of HCI – one cannot stand without the other.
Evaluation Goals
• Understand whether systems work as intended for different user groups and in a range of settings – remember initial requirements establishment!
• Understand how systems can be improved, e.g., whether a certain feature adds benefits for users
• Understand the effects that a system may have on its users, e.g., their emotional state, or their attitudes – this can be really broad!
Example: Food App
Designing an app to support healthy eating…
1. Is the app usable on a very basic level? Can users access all of its functionalities?
2. Is it better to get users to take pictures of their meals rather than typing in the same information?
3. Does the app promote healthy eating? This is the single most difficult bit. It’s not just about statements and attitudes, but about behaviour!
Evaluation is a core aspect and the most powerful tool of HCI.
2. Methods
Mostly evolved from research methods in social sciences and engineering.
Long history in usability engineering, but increasing focus on user experience.
2.1 Inspection vs. testing
2.2 Qualitative vs. quantitative
2.3 Lab vs. field
2.1 Inspection vs. Testing
• Two fundamentally different approaches:
– Inspection relies on experts and heuristics / guidelines (remember week 1 & 2)
– Testing relies on users (i.e. actual people)
• Both approaches come with advantages and disadvantages
• Can be applied at different points of the development cycle
2.1 Inspection: Overview
• Evaluation approach in Human-Computer Interaction that invites experts to comment on a system, or that applies heuristics and guidelines which allow non-experts to act as experts.
• Mainly applied in usability engineering.
• Think: Moving into off-campus housing. When looking at places, you can either bring a parent to tell you whether a place is okay, or you can google checklists and try to do the job yourself.
2.1 Inspection: Approaches
• A Heuristic Evaluation uses existing heuristics to inspect systems: it asks whether system complies with guidelines
• Carried out by a small number of persons who produce a written report of their inspection; can be members of the development team
• Examples of heuristics: Shneiderman’s Eight Golden Rules, Nielsen’s 10 Usability Heuristics
2.1 Inspection: Approaches
• Heuristic Evaluation cont’d
• “Is system status always visible?”
• “Does system give sufficient feedback?”
• “Does system produce adequate responses to user errors?”
• (You’ve done this before.)
• Assessment of critical aspects of interaction between user and system with the help of heuristics
2.1 Inspection: Approaches
• Heuristic Evaluation cont’d
• Three main stages:
1. Briefing: Provide background information on system, introduce guidelines, potentially scoring system.
2. Inspection: Experts individually inspect system and take notes.
3. Debriefing: Team gets together and discusses findings, produces final report.
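The debriefing stage can be partly supported by a small script. The following is a minimal sketch, assuming each expert records findings as a heuristic name plus a severity rating on a hypothetical 0 to 4 scale. The names `Finding` and `summarise` and all sample data are made up for illustration; they are not part of any standard tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Finding:
    evaluator: str   # who reported the problem
    heuristic: str   # which guideline is violated
    severity: int    # assumed scale: 0 (no problem) to 4 (severe)


def summarise(findings):
    """Group findings by heuristic.

    Returns a list of (heuristic, number of reports, mean severity),
    sorted so that the most severe heuristic comes first.
    """
    by_heuristic = defaultdict(list)
    for f in findings:
        by_heuristic[f.heuristic].append(f.severity)
    rows = [(h, len(s), mean(s)) for h, s in by_heuristic.items()]
    return sorted(rows, key=lambda r: r[2], reverse=True)


# Made-up notes from three evaluators (individual inspection stage).
findings = [
    Finding("E1", "Visibility of system status", 3),
    Finding("E2", "Visibility of system status", 4),
    Finding("E2", "Error prevention", 2),
    Finding("E3", "Error prevention", 1),
    Finding("E3", "Visibility of system status", 2),
]

for heuristic, count, avg in summarise(findings):
    print(f"{heuristic}: {count} reports, mean severity {avg:.1f}")
```

A summary like this only orders the discussion; the team still has to agree on what each finding means for the final report.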
2.1 Inspection: Approaches
• A Cognitive Walkthrough invites experts to imagine how new users would interact with a system to identify potential interaction problems.
• Two main stages:
1. Setting an evaluation task – what steps would users have to take to arrive at a certain goal?
2. Carrying out the task and taking notes with respect to sub-tasks, i.e., visibility and availability of correct action, and quality of feedback
2.1 Inspection: Benefits
• What are the benefits of an inspection?
– Quick
– Cheap(ish)
– Easily repeatable
– Applicable at most stages of the dev process
– Easily translates into action items for developers
• Likely to identify the biggest (usability) issues that need to be addressed by developers and designers
2.1 Inspection: Challenges
• What are risks and challenges?
– Availability and affordability of experts
– Correct interpretation of guidelines
– Think back to workshops: What if heuristics / guidelines don’t apply?
– Think back to week 2: Designing for user groups with special needs – can be hard to anticipate!
• No insights into how actual users would interact with system – could miss important points!
2.1 Testing: Overview
• Alternative evaluation approach in Human-Computer Interaction that involves users in the testing of a system, and gathers their feedback to identify areas for improvement, validate features, or gain insights into how the system affects them.
• Think: Baking cookies. When you’re done, you might wonder whether they’re any good, and you offer some to your roommates. One loves them, another thinks they’re okay, so you’re probably good.
2.1 Testing: Procedure
Research question → Study design → Data collection → Data analysis
2.1 Testing: Example
• A/B testing: Testing two different versions of a system in a controlled environment
2.1 Testing: Example
• Data collection:
– During interaction: Observations and metrics
– Post-interaction: Gather feedback with standardized questionnaires (e.g., NASA-TLX, SUS, ISO)
• Data analysis and interpretation:
– Statistical methods!
• More about this in Week 10: Quantitative Evaluation in HCI
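As an illustration of post-interaction questionnaire data, here is a minimal sketch of how the System Usability Scale (SUS) mentioned above is commonly scored: ten items answered on a 1 to 5 scale, odd items contribute (answer − 1), even items contribute (5 − answer), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name `sus_score` and the sample answers are assumptions for this example.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant.

    responses: ten answers in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item, answer in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


# Made-up answers from one participant.
participant = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(participant))  # 85.0
```

A single score says little on its own; in a study you would collect one score per participant and compare groups or versions.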
2.1 Testing: Benefits
• What are the benefits of testing?
– Actual feedback from real people
– Insights into how people interact with your system in the lab or field
– Scientific approach to evaluations in HCI
• “Real-world” testing = higher validity!
2.1 Testing: Challenges
• What are risks and challenges?
– CONFOUNDS
– Requires planning, takes time!
– Recruiting the right sample can be difficult
– Additional knowledge necessary to analyse and interpret results – results need to be made actionable
– People tend to like what they know – what does that mean in terms of innovation and creativity?
• Requires basic knowledge in experimental design and data analysis, and takes time!
2.1 Inspection vs. Testing
• How to decide which method to use?
– What is the status of your system? Low-fi vs. hi-fi and more polished prototypes.
– What questions are you trying to answer? Interface design considerations vs. user interaction.
– How much time do you have, what is your budget, and what is your geographical location?
• Methods can be complementary – try to run inspections before entering user testing stages!
Examples – Would you opt for an inspection or testing?
2.1 Inspection or Testing?
• Security check for online banking software
• What users want might not always be good for them!
2.1 Inspection or Testing?
• Healthy eating app for seniors
• Hard to predict needs, but also hard to recruit a representative sample…
2.1 Inspection or Testing?
• Online gambling websites
• Importance of usability and functionality – need to combine approaches?
2.1 Inspection or Testing?
• Your website / portfolio
• Does need some evaluation because it should look good. How much is too much?
2.1 Inspection or Testing?
• Fashion retailer website (online shop)
• Experts can tell you about functionality, but what about image and experience?
2.1 Inspection or Testing?
• Player input for a movement-based dance game
• How can suitability of movements be assessed?
2.1 Inspection or Testing?
• Group project – an app to help students stay organized.
• End users are easily accessible, so why not ask?
Inspections and testing can provide valuable insights into user interactions with systems.
Inspections are usually quicker, testing can provide deeper insights, and both can complement each other.
2.1 Academic Perspective
• Inspection is relevant in industry, but academic community requires more rigorous approach
• Working on a research project? Opt for testing to comply with current academic standards!
• Keep this in mind when thinking about your final year projects – research-focused HCI project needs solid evaluation (= user study)!
2.1 Inspection vs. testing
2.2 Qualitative vs. quantitative
2.3 Lab vs. field
2.2 Qualitative vs. Quantitative
• Evaluation methods that test systems with users can adopt qualitative or quantitative approaches (or combine both)
• As designer / researcher, you need to decide whether your goal is to explore a phenomenon, or if you are interested in large-scale effects
• Choice of approach often determines whether successful insights are gained…
2.2 Qualitative vs. Quantitative
• Qualitative research is interested in quality of phenomena rather than quantity and asks about the experience of the individual (remember week 6)
• Instruments:
– In-depth observations
– Focus groups
– Structured and unstructured interviews
• Analysis:
– Thematic Analysis, Grounded Theory, …
2.2 Qualitative vs. Quantitative
• Qualitative research usually works with smaller number of participants
• Data analysis requires effort – although it is “just” reading, making sense of data can be tricky
• Challenge: Identifying dominant themes
• Benefit: Thorough understanding of phenomenon
2.2 Qualitative vs. Quantitative
• Quantitative research is interested in quantity of phenomena and asks about the experience of many, i.e., average user (more in week 10)
• Instruments:
– Questionnaires
– Metrics
• Analysis:
– Statistical data analysis (e.g., working with means and variance, inferential testing to understand data)
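To make "means, variance and inferential testing" concrete, here is a minimal sketch that compares two versions of an interface using Welch's unequal-variance t-test, written with the Python standard library only. The function name `welch_t` and the completion times are made up for illustration; the choice of Welch's test is an assumption for this example, not a method prescribed by the module.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    # Squared standard error of each sample mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up task completion times in seconds for two interface versions.
version_a = [30, 32, 34, 36, 38]
version_b = [24, 26, 28, 30, 32]

print("mean A:", mean(version_a), "variance A:", variance(version_a))
print("mean B:", mean(version_b), "variance B:", variance(version_b))

t, df = welch_t(version_a, version_b)
print(f"t = {t:.2f}, df = {df:.1f}")  # t = 3.00, df = 8.0
```

Turning t and df into a p-value needs the t distribution; a statistics package can do this, for example `scipy.stats.ttest_ind(version_a, version_b, equal_var=False)`.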
2.2 Qualitative vs. Quantitative
• Quantitative research works with bigger user groups
• Data analysis is relatively quick, but requires statistical knowledge
• Challenge: Choosing the right instruments (questionnaires etc.) and getting data analysis right
• Benefit: Thorough understanding of the bigger picture
Examples – Would you opt for a qualitative or quantitative approach?
1. Two smartphone interfaces.
2. Online bereavement support.
3. Multiplayer game for girls.
4. New interface for blackboard.
5. Dating app.
6. City council website.
7. Games for humans and cats.
Qualitative – in-depth understanding of small group of users, exploring phenomena.
Quantitative – in-depth understanding of the average user, validating effects.
2.1 Inspection vs. testing
2.2 Qualitative vs. quantitative
2.3 Lab vs. field
2.3 Lab vs. Field
• Evaluation methods that test systems with users can be carried out in the lab or the field.
• Lab = any kind of controlled environment that you create to test your system, e.g., a room on campus, a coffee shop that you meet your testers at, …
• Field = the “natural habitat” of your users, e.g., at the office, at home, on the train, …
2.3 Lab vs. Field
• Why worry about the differences?
• Each of the environments provides different insights
• Each comes with specific limitations
• Depending on the question that you’re trying to answer, one or the other may be more suited
2.3 Lab vs. Field
• Advantages of testing in the lab:
– Controlled environment
– Maximum surveillance potential
• Disadvantages of testing in the lab:
– Validity – it is an artificial environment…
– Longitudinal evaluations tiresome for users
– Requires testing facilities (or potential confounds)
2.3 Lab vs. Field
• Advantages of testing in the field:
– Natural habitat = ecological validity!
– Easier longitudinal deployment
• Disadvantages of testing in the field:
– Confounds
– User feedback may be more difficult to obtain
– Methodologically challenging
2.3 Lab vs. Field
• Which approach should you choose?
– Depends on the question you are trying to answer (again…)
• User group you’re working with
• Type of system you’re developing
– Depends on availability of resources
• Utopia: Combine both approaches to get maximum amount of insight
Can you think of a system that you’d have to test in the lab?
Can you think of a system that you’d have to test in the field?
Lab = controlled but not necessarily realistic environment.
Field = realistic environment, but not necessarily easy to research.
3. Functionality and Experience
3. Functionality and Experience
• Functionality = Does system enable the user to work efficiently and effectively? Is it inclusive?
• User Experience (UX) = What is the subjective experience when interacting with the system?
• Human Factors and Ergonomics vs. a more holistic approach in Human-Computer Interaction
• Is interaction pleasurable?
What might be advantages of designing for a positive user experience?
Whether users engage with technology is not just about functionality, but also about hedonic aspects.
How does interaction feel? How do we see ourselves using a system, and what image does it project?
3. Functionality and Experience
• Evaluations of user experience ask questions beyond usability and accessibility
• Stronger link with psychology (and sometimes marketing and market research)
• Goal: Assessing the overall experience that users have when interacting with a system or technology – what emotions does interaction stimulate, and how do users feel about themselves and the system?
3. Functionality and Experience
• User Experience can be assessed through evaluations
• UX methods are closer to psychology, usually user testing in an experimental setting, e.g., scales to measure affect
• Approaches: Standardized questionnaires, physiological measures, but also interviews etc. – more when we talk about quantitative methods in HCI (week 10)
Evaluations can help you understand users and what drives them (and not just assess system quality).
But there’s a downside…
4. Ethics in HCI
Ethics in HCI
• If you are working with people, you need to consider their needs throughout the research process
• Ethical responsibility
– Submit forms to ethics committee
– Ensure legals are in place if working in industry
– Informed consent
– Right to withdraw
– Anonymizing data
• Try to anticipate issues that might emerge…
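One practical step towards anonymizing data is to replace names with participant codes before analysis. The following is a minimal sketch, assuming study records are stored as dictionaries with a `name` field; the function `pseudonymise`, the field names and the sample records are hypothetical. Strictly speaking this is pseudonymisation: other fields (age, free-text answers, timestamps) may still identify people and need separate care.

```python
import secrets


def pseudonymise(records, id_field="name"):
    """Replace the identifying field with a random participant code.

    Returns (cleaned_records, key), where key maps code -> original value.
    The key should be stored separately from the data, or destroyed.
    """
    codes = {}        # original identity -> code
    used = set()      # codes already handed out
    cleaned = []
    for record in records:
        identity = record[id_field]
        if identity not in codes:
            code = f"P-{secrets.token_hex(4)}"
            while code in used:  # extremely unlikely, but keep codes unique
                code = f"P-{secrets.token_hex(4)}"
            used.add(code)
            codes[identity] = code
        # Copy every field except the identifying one.
        row = {k: v for k, v in record.items() if k != id_field}
        row["participant"] = codes[identity]
        cleaned.append(row)
    key = {code: identity for identity, code in codes.items()}
    return cleaned, key


# Made-up study records; the same person took part in two sessions.
records = [
    {"name": "Alice", "session": 1, "sus": 85.0},
    {"name": "Bob", "session": 1, "sus": 50.0},
    {"name": "Alice", "session": 2, "sus": 90.0},
]

cleaned, key = pseudonymise(records)
for row in cleaned:
    print(row)
```

Keeping the same code for repeat sessions preserves the ability to honour the right to withdraw: with the key, all of one participant's data can still be found and deleted.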
Ethics – Historically
• Stanford Prison Experiment (Zimbardo et al., 1971)
– Recruited participants into fake prison
– Randomly assigned them to prisoner / guard roles
– People’s behaviour got out of hand – harassment, violence, psychological symptoms of extreme stress
– Study was interrupted
– Twelve-month follow-up still showed psychological impact on participants
Ethics – Historically
• Milgram Shock Experiment (Milgram, 1961)
– How do people react to authorities?
– People were recruited to be assistants in a learning study in which a student had to combine words
– Investigator asked them to question (invisible) student, and decide on punishment if necessary
– Participants could (and did) administer shocks up to lethal levels despite hearing (fake) screams of student
– TV show “Le Jeu de la Mort”
Ethics in HCI
• These are extremes – but examples of what happens when researchers do not consider ethical implications, or get carried away
• HCI is generally less likely to be harmful at this level, but…
Ethics – Facebook Study
• Kramer et al. (2014)
– How are people’s emotions influenced by what is shown on their Facebook timeline? How do we react to excessively positive content that friends post – does it make us feel down?
– Manipulated information without letting users know
– Willingly accepted that people might feel down
– http://www.theguardian.com/technology/2014/jun/30/facebook-emotion-study-breached-ethical-guidelines-researchers-say
Ethics – Quantified Toilets
• Experiment at CHI 2014 conference in Toronto
– How do people react to quantification of unusual information?
– Result of Critical Making Hackathon
– Fake claim that urine disposed in conference toilets would be screened for STIs etc.
– “Real-time” information on people who went to the loo
– What if seemingly identifying information is posted, e.g., female, pregnant, chlamydia?
– http://quantifiedtoilets.com/
Ethics in HCI
• If you are working with people, you need to consider their needs throughout the research process
• Try to anticipate issues that might emerge – can’t always do that, that’s why we have ethics boards as a back-up
• Make sure you have gathered (informed) consent
• Important for your final year project!
Back to the beginning…
Example: Food App
Designing an app to support healthy eating…
1. Is the app usable on a very basic level? Can users access all of its functionalities?
2. Is it better to get users to take pictures of their meals rather than typing in the same information?
3. Does the app promote healthy eating? This is the single most difficult bit. It’s not just about statements and attitudes, but about behaviour!
Human-Computer Interaction looks at the design and evaluation of how people interact with technology.
Range of evaluation approaches is available – remember inspection vs. testing, and different kinds of tests and experiments.
Choice of approach & methods is an important step towards gaining valuable insights and successfully feeding results back into development cycle!
Do you have any questions?