COMP3074-HAI Lecture 13, Conversational UI


Conversational and
Voice User Interfaces


Introduction

Human-AI Interaction

Lecture 13

§Module progress – interim report

§ Introduction to Conversational UI/UX

§Socio-economic Challenges

This lecture


Part 1. Module progress

§Foundations of NLP
§Practical NLP techniques, processes, applications
§Application of these techniques in labs and coursework 1 – due next week! Straw poll: who is already working on CW1?
§ Learning outcomes: You should now know how to implement a basic but functional NLP pipeline to drive an interactive NLP-based system, e.g., a ‘chatbot’
§Theoretical and critical reflection on AI and NLP in the real world
§ Learning outcomes: appreciate the ethical, societal, and social complexities involved in design and use of NLP applications

Part 1 of the module is complete


Theory, Concepts

§Critical issues in research, design, development and deployment of AI-driven systems, e.g.,
§Ethical issues
§Filter bubbles
§ Interpretable AI
§Assessed by quizzes (30%)

Practice

§First half of the term: Natural Language Processing
§Second half of the term: VUI design

NLP – 1st half of term

§Practical development of NLP
§Python / nltk
§Concludes with CW1

VUI design – 2nd half of term

§Practical design, testing of a VUI prototype
§Voiceflow
§Concludes with CW2


Week | Theory lecture | Practical lecture | Labs | Assessment
9 | Conversational and Voice | Basic VUI design principles | Coursework 1 | Quiz 2 (10%)
10 | Automatic Speech Recognition | Advanced VUI design | Coursework 2 release – Voiceflow | CW1 due (40%)
11 | Discoverability and Response Design | User Testing for VUIs | Voiceflow | -
12 | Progressivity for VUIs | TBC | Voiceflow | Quiz 3 (10%)
13 | TBC | TBC | Coursework 2 | -
14-16 | CHRISTMAS BREAK 🥳 🤩 🥰
17 | CW Q&A | - | - | CW2 due (30%)

Schedule – Pt. 2


§Goal is to learn how to build the voice user interface, the ‘cockpit’ to our NLP-engine, including learning
§Practical VUI design principles and techniques that help understand, design, build, and evaluate VUIs with real people
§Application of these principles and techniques in labs and coursework
§Theoretical and critical reflection on VUIs in use in the real world

Part 2 of the module


Part 2. Conversational UI/UX

What are examples of conversational systems you have
used? What worked well, what didn’t? Discuss for 2 mins.

§CUI = Conversational User Interface
§Can refer to text-based ‘chatbots’
§Or voice-based VUIs (Voice User Interfaces)
§Or hybrid versions of GUI + voice (e.g., Siri)

§UX = User Experience
§Broader term encompassing experience of the interaction, not just design of interfaces and information
§UX Design is an industry profession (web / app / interface / game / graphic designers)
§HCI research – largely an academic discipline

Conversational UI/UX


§ Robotics
§ Cobots, social / companion robots, domestic robots

§ Smart speakers / Home
§ “personal assistants”
§ shopping, entertainment

§ Mobility / transportation
§ Safety critical
§ Hands free environments

The many faces of conversational interfaces


The smart speaker market


§ 29% of Brits own a smart speaker (Sept 2020)
§ 4-fold growth since 2017
§ Second most popular smart home device (49% have a smart TV)
§ Source: https://mobilemarketingmagazine.com/uk-smart-speaker-ownership-2020-gfk-techuk

§ In the US it’s similar, 32%
https://voicebot.ai/2020/04/28/nearly-90-million-u-s-adults-have-smart-speakers-adoption-now-exceeds-one-third-of-consumers/

§ Unsurprisingly, it’s quite a lot lower in countries in which English is not the first language

Adoption varies


§ In late 2021, 39% of adults in England owned and used a voice-activated personal assistant or smart speaker at home, according to a survey commissioned by the UK government
(https://www.gov.uk/government/statistics/participation-survey-october-to-december-2021-report/participation-survey-october-to-december-2021-main-report#background)

§ In the US, ~35% of the population owned smart speakers in early 2022

§ Some say that the market may be stagnating (growth is slowing)

2022 smart speaker market


VUI interaction – an example

§ Example: https://www.youtube.com/watch?v=IRmGZSdH2qY
User: Alexa remind me to fix my clock.

Alexa: I put fix my clock on your todo list.

What is happening between the request and the response?
Discuss for 3 mins, share with the class.

Technically, what’s going on?


§ Example: https://www.youtube.com/watch?v=IRmGZSdH2qY
§User: “Alexa remind me to fix my clock”

§ wakeword detection + acoustic model-based transcription + confidence level

§ Keyword recognition + intent matching [TODO list] + variable/slot
§ Dialogue management + TTS vocalization

§ Response generation
§Alexa: “I put fix my clock on your todo list.”

Technically, what’s going on?


(Slide annotation labels on the utterance: action/intent, variable/slot)
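The intent matching and slot steps above can be sketched in a few lines of Python. This is a minimal illustration, not how Alexa works internally: the wake word check, the regular-expression patterns, the intent names (`add_todo`, `set_timer`), and the helper names `parse_utterance` and `respond` are all assumptions made for this example. It takes an already transcribed utterance, matches it to an intent, pulls out the slot value, and generates a text response that a TTS engine would then vocalize.

```python
import re

# Hypothetical wake word and intent patterns, invented for illustration only.
WAKE_WORD = "alexa"
INTENT_PATTERNS = {
    "add_todo": re.compile(r"^remind me to (?P<task>.+)$"),
    "set_timer": re.compile(r"^set a timer for (?P<duration>.+)$"),
}


def parse_utterance(transcript):
    """Return (intent, slots) for a transcript, or (None, {}) if nothing matches."""
    # Normalise case and drop trailing punctuation from the transcription.
    text = transcript.lower().strip().rstrip(".!?")
    # Wake word check: ignore anything not addressed to the assistant.
    if not text.startswith(WAKE_WORD):
        return None, {}
    command = text[len(WAKE_WORD):].lstrip(" ,")
    # Intent matching: the first pattern that fits wins; named groups are the slots.
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.match(command)
        if match:
            return intent, match.groupdict()
    return None, {}


def respond(intent, slots):
    """Response generation: turn the matched intent and slots into text for TTS."""
    if intent == "add_todo":
        return f"I put {slots['task']} on your todo list."
    if intent == "set_timer":
        return f"Timer set for {slots['duration']}."
    return "Sorry, I don't know that one."


intent, slots = parse_utterance("Alexa remind me to fix my clock.")
print(intent, slots)
print(respond(intent, slots))
```

Real systems typically replace the hand-written patterns with trained classifiers and use the transcription confidence level to decide whether to ask the user to repeat the request.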

§As with other AI-driven technologies, there are a host of problems
§Voice is a profoundly personal trait
§Reflects where you’re from, your mother tongue, dialect and accent, sex/gender, sexual orientation, your socio-economic background (e.g., ethnicity and education)

§ Issues around privacy and surveillance
§Pertinent given the often intimate / home environment in which VUIs are placed

A host of challenges


Part 3. Socio-economic challenges

§Language support
§There are roughly 7,000 actively spoken languages
https://www.ethnologue.com/guides/how-many-languages

§ 91 of them have at least 10M speakers
https://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers

§As of late 2022, Amazon Alexa supports
§ 8 languages (English, French, Spanish, German, Hindi, Italian, Japanese, Portuguese),
§ and dialects in 3 languages (English, French, Spanish)
https://www.globalme.net/blog/language-support-voice-assistants-compared/

VUI challenges…there are many!


§Dialects, accents and pitch
§ Tatman, Rachael. (2017, April). Gender and dialect bias in YouTube’s automatic captions. In Proceedings of the First ACL Workshop on Ethics in Natural Language Processing (pp. 53-59).

§VUIs struggle more with certain dialects and accents
§Worst for people from Scotland

§Accuracy also lower for higher pitched voices
§Worse for women (and probably children) → gender and age bias

VUI challenges…there are many!


§ Racial bias
§ et al. (2020). Racial disparities in automated speech recognition. PNAS April 7, 2020 117 (14) 7684-7689; first published March 23, 2020; https://doi.org/10.1073/pnas.1915768117

§ Examined 5 state-of-the-art ASR systems, developed by Amazon, Apple, Google, IBM, and Microsoft
§ Used them to transcribe structured interviews conducted with 42 white speakers and 73 black speakers
§ Found that all five ASR systems exhibited substantial racial disparities
§ An average word error rate (WER) of 0.35 for black speakers compared with 0.19 for white speakers
§ WER is a standard measure of the discrepancy between machine and human transcription, based on substitutions, insertions, and deletions

§ Authors trace these disparities to the underlying acoustic models used by the ASR systems
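To make the WER measure concrete, here is a minimal sketch of how it is commonly computed: the word-level edit distance (substitutions, insertions, deletions) between the human reference transcript and the machine transcript, divided by the number of words in the reference. The function name `word_error_rate` and the example sentences are assumptions for illustration; the study's own tooling and text normalisation may differ.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two transcripts / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (dynamic programming, row by row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution (or exact match when cost is 0)
            ))
        prev = curr
    return prev[-1] / len(ref)


human = "remind me to fix my clock"
machine = "remind me to fix the clock"
# One substituted word out of six reference words.
print(round(word_error_rate(human, machine), 3))
```

Note that WER can exceed 1.0 when the machine transcript contains many inserted words, so it is an error ratio rather than a percentage of words that are wrong.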

VUI challenges…there are many!

WER by % of audio snippets. Assuming that a WER above 0.5 makes a transcript unusable, 23% of the transcribed audio snippets of black speakers are unusable, whereas only 1.6% of the audio snippets of white speakers result in unusable transcripts.
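The "unusable" share in this caption is a simple threshold count over per-snippet WER values. A minimal sketch follows; the function name `unusable_fraction` and the sample WER values are made up for illustration and are not data from the study.

```python
def unusable_fraction(wers, threshold=0.5):
    """Share of snippets whose WER is strictly above the threshold."""
    if not wers:
        raise ValueError("need at least one WER value")
    return sum(1 for wer in wers if wer > threshold) / len(wers)


# Made-up per-snippet WER values, for illustration only.
sample_wers = [0.10, 0.60, 0.55, 0.20]
print(unusable_fraction(sample_wers))
```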


§Sociophonetics
§Phonetics = study of the production and perception of spoken language

§ Selina Jeanne Sutton, , , and . 2019. Voice as a Design Material: Sociophonetic Inspired Design Strategies in Human-Computer Interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI ’19). Association for Computing Machinery, , NY, USA, Paper 603, 1–14. DOI: https://doi.org/10.1145/3290605.3300833

§Sociophonetics = study of the social factors influencing production and perception of speech, shaping sociocultural identities
§Accent and voice quality are influenced by, e.g., geography, sex and gender, age, sexuality, social class
§VUIs’ synthesised speech generally represents a homogeneous, mainstream accent – lack of diversity in what this voice represents

VUI challenges…there are many!


§ Gendering synthetic speech
§ Why are voice assistants generally given a female voice?
§ Rausch (Amazon) said that in trials users preferred female voices
§ Echoing earlier research that found people find female voices more agreeable, pleasant (e.g., Mitchell et al., 2011)
http://macdorman.com/kfm/writings/pubs/Mitchell2010DoesSocialDesirabilityBiasFavorHumans.pdf

§ BUT, gendering is embedding gender stereotypes
§ UNESCO report “I’d blush if I could” criticizes ‘the female servant’ / ‘personal assistant’ stereotype https://unesdoc.unesco.org/ark:/48223/pf0000367416.page=1
§ Does it invite sexualized, gendered language? (Woods, 2018)
https://doi.org/10.1080/15295036.2018.1488082

§ Gender neutral voices, e.g., EqualAI’s ‘Q’

VUI challenges…there are many!


§Surveillance and privacy intrusion

VUI challenges…there are many!


§ Alleged mistreatment / exploitation of workers
§ Workers producing linguistic data sets
§ Not Google employees, but underpaid subcontractors, “routinely pressured to work unpaid overtime”

§ Case of Pygmalion, a company that creates training data for Google’s neural networks
§ Annotation work, manual part-of-speech tagging to allow Assistant to produce high quality answers https://www.wired.com/2016/11/googles-search-engine-can-now-answer-questions-human-help/

§ Vast amounts of human labour involved in ASR /

VUI challenges…there are many!


Quiz on Friday

CW1 due next week
