EECS 485 Lecture 21
Accessibility,
Recommender Systems

Learning Objectives
• Identify ways websites can be more accessible or less accessible to diverse users
• Describe techniques for user-based collaborative filtering, which recommend items enjoyed by users who are similar
• Use content-based filtering to recommend items similar to items a user has enjoyed in the past

Accessibility

Say hi to partner
• Choose Partner 1 and Partner 2

Screen Reader and Web Pages
• Partner 1: describe the page and most relevant information
• Partner 2: make a drawing and understand what the page is saying
• can’t ask questions

Demo with Narrator
• Does the automatic narrator do as well as the human did?
• How does the narrator app know what order to go in and what is important?

HTML tags and semantics

• A heading tag does a lot of things
• Makes text bigger
• Allows you to style with CSS
• Marks text as a heading
Example page: heading "Picnic", body text "Don’t forget to bring a blanket"

Can your website be zoomed in?
• Wikipedia
• Websites that want to behave like phone apps and disable zooming

Experience with computers

Download size
• Not everyone has unlimited fast data plans
• Canada: data costs 2x as much as in the US
• In many countries, mobile data is more popular than wired data
• Can take a long time to load a page

Making Recommendations

• Movies/shows (e.g., Netflix, Amazon, Hulu, etc.)

• Music (e.g., Spotify, etc.)

• Products (e.g., Amazon, etc.)

• More examples
• Movies
• Products
• Video games
• Colleagues

Recommendation challenges
• Millions of products
• Hundreds of millions of users
• Little data
• Users have very little tolerance for adding preference data
• System knows almost nothing about users

Data collection
• Explicit data collection
• Ask users to rate movies, products, whatever
• Ask users for demographic information
• Implicit data collection
• Web logs of past user activity
• Timing information, e.g., how long you paused on a post before scrolling

Data collection
• How to predict what movies you like? Features?
• One approach: do it like Web pages
• Collect data on my movie likes
The Godfather
Casablanca
Love and Death
• Collect features: genre, length, year, etc.
• Build score-predictor; recommend high-scorers
• Problems?

Traditional ML Doesn’t Work
• Unfortunately, film-qualities (features) may be difficult to extract
• Problem: How to recommend movies without knowing anything about movies?
• Collaborative Filtering
• Content-Based Filtering

Collaborative Filtering

Collaborative filtering
• Problem: How to recommend movies without knowing anything about movies?
• Solution: Recommend movies enjoyed by people who are similar to you
• People who agreed in the past will agree in the future

How could you fill in the blanks?

• Filling in missing scores
• Average for each item over all users

Averaging example
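A minimal sketch of filling in missing scores with the per-item average over all users. The users, movies, and ratings below are hypothetical, and the function names are made up for illustration.

```python
# Hypothetical ratings on a 1-5 scale; None marks a missing score.
ratings = {
    "Alice": {"Xanadu": 5, "Youngblood": 4, "Zorro": None},
    "Bob":   {"Xanadu": 1, "Youngblood": 2, "Zorro": 5},
    "Carol": {"Xanadu": 4, "Youngblood": None, "Zorro": 2},
}

def item_average(ratings, item):
    """Mean of all known ratings for one item, or None if nobody rated it."""
    known = [r[item] for r in ratings.values() if r.get(item) is not None]
    return sum(known) / len(known) if known else None

def fill_with_averages(ratings):
    """Return a copy where each missing score is replaced by the item average."""
    return {
        user: {
            item: score if score is not None else item_average(ratings, item)
            for item, score in scores.items()
        }
        for user, scores in ratings.items()
    }

filled = fill_with_averages(ratings)
print(filled["Alice"]["Zorro"])       # average of Bob's 5 and Carol's 2
print(filled["Carol"]["Youngblood"])  # average of Alice's 4 and Bob's 2
```

Every user with a blank for the same movie gets the same filled-in value, no matter how their other ratings look.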

• Problem: Averaging ignores uniqueness of a user
• Terrible when there is large variation in interest
• Movies, music are a few examples
• Solution: Nearest neighbor algorithm

Nearest neighbor algorithm
• Find another user with similar properties
• Movie ratings in this example
• Use other user’s rating to “fill in the blank”
• People who agreed in the past will agree in the future
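The algorithm above can be sketched as follows. This is an illustration under assumptions: the ratings are hypothetical, and Euclidean distance over shared movies is picked as the similarity measure (other metrics would work too).

```python
import math

# Hypothetical ratings; Alice has not rated Zorro.
ratings = {
    "Alice": {"Xanadu": 5, "Youngblood": 4},
    "Bob":   {"Xanadu": 1, "Youngblood": 2, "Zorro": 5},
    "Carol": {"Xanadu": 5, "Youngblood": 5, "Zorro": 2},
}

def distance(a, b):
    """Euclidean distance over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return math.inf  # nothing in common: treat as infinitely far apart
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in shared))

def nearest_neighbor_fill(ratings, user, item):
    """Copy the item's score from the closest user who has rated it."""
    candidates = [v for v in ratings if v != user and item in ratings[v]]
    if not candidates:
        return None
    nearest = min(candidates, key=lambda v: distance(ratings[user], ratings[v]))
    return ratings[nearest][item]

# Carol is closer to Alice than Bob is, so Alice's blank gets Carol's score.
print(nearest_neighbor_fill(ratings, "Alice", "Zorro"))
```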

Nearest neighbor example
• Who are the nearest neighbors?

Nearest neighbor diagram
• We’ll use only two movies to make it easier to draw
• Use the dimensions we do have information about
[Figure: users plotted by their ratings of two movies; one axis is Youngblood]

Distance metric
• How to measure distance between neighbors?
• Many possibilities, including
• Euclidean distance
• Cosine similarity
• Pearson correlation coefficient
• A lot in common with document vector model

Euclidean distance
• Each movie rated by both users is a dimension
• One user is one vector in multidimensional space
• Measure distance between the two vectors
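Written out, with r_{u,i} denoting user u's rating of movie i and I_{u,v} (a symbol introduced here for illustration) denoting the set of movies rated by both users, the distance is as below. The worked numbers use hypothetical ratings of two movies.

```latex
d(u, v) = \sqrt{\sum_{i \in I_{u,v}} \left(r_{u,i} - r_{v,i}\right)^2}

% Hypothetical ratings: Alice = (5, 4), Carol = (5, 5), Bob = (1, 2)
d(\text{Alice}, \text{Carol}) = \sqrt{(5-5)^2 + (4-5)^2} = 1
d(\text{Alice}, \text{Bob}) = \sqrt{(5-1)^2 + (4-2)^2} = \sqrt{20} \approx 4.47
```

A smaller distance means more similar users, so Carol is Alice's nearer neighbor in this example.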

Cosine similarity
• Each movie rated by both users is a dimension
• One user is one vector in multidimensional space
• Cosine similarity = 1
• Vectors “pointed in the same direction”
• Cosine similarity = 0
• Vectors orthogonal
• Cosine similarity doesn’t consider the length of the vector, only the angle
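A small sketch of cosine similarity between two users' rating vectors. The ratings are hypothetical (a 0-5 scale is assumed so that an orthogonal pair can be shown).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two users' rating vectors.

    Only movies rated by both users are used as dimensions.
    """
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction
    return dot / (norm_a * norm_b)

alice = {"Xanadu": 4, "Youngblood": 2}
bob = {"Xanadu": 2, "Youngblood": 1}    # same direction as Alice, half the length
carol = {"Xanadu": 0, "Youngblood": 3}
dave = {"Xanadu": 3, "Youngblood": 0}   # orthogonal to Carol

print(cosine_similarity(alice, bob))    # close to 1: same direction
print(cosine_similarity(carol, dave))   # 0: orthogonal
```

Alice and Bob come out as maximally similar even though Bob's ratings are half as large, which is the "only the angle" property from the slide.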

Pearson correlation coefficient
• Each movie rated by both users is a point
• Value between +1 and −1
[Figure: scatter plot with one axis per user (Alice, Bob); each point is the rating of one movie by Bob and Alice]
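A minimal sketch of the Pearson correlation coefficient between two users, computed over the movies both have rated. The ratings are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two users' ratings over their shared movies."""
    shared = [i for i in a if i in b]
    n = len(shared)
    if n < 2:
        return 0.0  # not enough points to measure a trend
    mean_a = sum(a[i] for i in shared) / n
    mean_b = sum(b[i] for i in shared) / n
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in shared)
    var_a = sum((a[i] - mean_a) ** 2 for i in shared)
    var_b = sum((b[i] - mean_b) ** 2 for i in shared)
    if var_a == 0 or var_b == 0:
        return 0.0  # a user who gives every movie the same score
    return cov / math.sqrt(var_a * var_b)

alice = {"W.": 1, "Xanadu": 2, "Youngblood": 3, "Zorro": 4}
bob = {"W.": 2, "Xanadu": 3, "Youngblood": 4, "Zorro": 5}    # always one point higher
carol = {"W.": 4, "Xanadu": 3, "Youngblood": 2, "Zorro": 1}  # opposite taste

print(pearson(alice, bob))    # +1: ratings rise and fall together
print(pearson(alice, carol))  # -1: ratings move in opposite directions
```

Because the means are subtracted first, a user who consistently rates one point higher than another still correlates perfectly with them.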

K-nearest neighbors (k-NN)
• Can we do better than selecting just one nearest neighbor?
• Select several.
1. Find the k closest users
2. Select the most frequent score (mode)
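The two steps can be sketched as below. The distances from the target user are assumed to be precomputed with some metric, and all names and numbers are hypothetical.

```python
from collections import Counter

def knn_predict(distances, scores, k):
    """Predict a score as the mode among the k closest users who rated the item.

    distances: user -> distance from the target user (smaller is closer)
    scores:    user -> that user's score for the item being predicted
    """
    rated = [u for u in distances if u in scores]
    closest = sorted(rated, key=lambda u: distances[u])[:k]
    if not closest:
        return None
    counts = Counter(scores[u] for u in closest)
    return counts.most_common(1)[0][0]

# Hypothetical neighbors of the target user.
distances = {"Bob": 4.5, "Carol": 1.0, "Dave": 1.4, "Erin": 2.0}
scores = {"Bob": 5, "Carol": 2, "Dave": 4, "Erin": 2}

# The 3 closest users are Carol, Dave, Erin with scores 2, 4, 2.
print(knn_predict(distances, scores, k=3))
```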

k-NN problems
1. Find the k closest users
2. Select the most frequent score
• How would a really popular film affect recommendations?
• The popular film will be recommended most of the time because it will often be the most frequent score
• The popular choice might not always be the best choice for a user

k-NN with weights
• Solution: weight other users’ scores by similarity
$$r_{u,i} = k \sum_{v \in \text{Top-Sim}(u)} \text{sim}(u,v)\, r_{v,i}$$
• Top-Sim(u) is the n most-similar user neighbors to u
• k is a normalizer
$$k = 1 \Big/ \sum_{v \in \text{Top-Sim}(u)} \text{sim}(u,v)$$
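A sketch of the weighted prediction defined above. It assumes similarities are precomputed and non-negative, and it restricts Top-Sim(u) to users who have actually rated item i (an assumption, since a neighbor without a rating contributes nothing). Names and numbers are hypothetical.

```python
def weighted_prediction(sims, scores, n):
    """Similarity-weighted average of the n most-similar users' scores.

    sims:   user v -> sim(u, v) for the target user u
    scores: user v -> r_{v,i}, that user's score for item i
    """
    rated = [v for v in sims if v in scores]
    top_sim = sorted(rated, key=lambda v: sims[v], reverse=True)[:n]
    total_sim = sum(sims[v] for v in top_sim)
    if total_sim == 0:
        return None  # no usable neighbors
    k = 1 / total_sim  # the normalizer from the slide
    return k * sum(sims[v] * scores[v] for v in top_sim)

sims = {"Bob": 0.1, "Carol": 0.9, "Dave": 0.6, "Erin": 0.5}
scores = {"Bob": 5, "Carol": 2, "Dave": 4, "Erin": 2}

# Top 3 are Carol, Dave, Erin: (0.9*2 + 0.6*4 + 0.5*2) / (0.9 + 0.6 + 0.5)
print(weighted_prediction(sims, scores, n=3))
```

Unlike taking the mode, the result can fall between the neighbors' scores, and closer neighbors pull it harder.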

k-NN problems
• How would a “cult classic” film affect recommendations?
• Enjoyed by few, but they REALLY like it
• Rare film less likely to be predicted by k-NN
• Can use inverse-user-frequency
• Similar to TFxIDF’s inverse document frequency
• Rarely seen movies are more influential
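One way to compute such a weight, by analogy with inverse document frequency, is log(N / n_i), where N is the number of users and n_i is the number of users who rated movie i. This exact formula is an assumption, not something the slide specifies, and the data is hypothetical.

```python
import math

def inverse_user_frequency(ratings):
    """Weight per movie: log(number of users / number of users who rated it)."""
    n_users = len(ratings)
    counts = {}
    for scores in ratings.values():
        for item in scores:
            counts[item] = counts.get(item, 0) + 1
    return {item: math.log(n_users / c) for item, c in counts.items()}

# Hypothetical data: everyone rated Zorro, only Alice rated Xanadu.
ratings = {
    "Alice": {"Zorro": 4, "Xanadu": 5},
    "Bob":   {"Zorro": 3},
    "Carol": {"Zorro": 5},
    "Dave":  {"Zorro": 4},
}

iuf = inverse_user_frequency(ratings)
print(iuf["Zorro"])   # 0.0: a movie everyone has rated carries no extra weight
print(iuf["Xanadu"])  # log(4): a rarely rated movie is weighted more heavily
```

These weights could then multiply each movie's term inside the similarity computation, so agreement on a rare film counts for more than agreement on a popular one.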

User-based collaborative filtering weaknesses
• Cold start
• Need user data before you can start making accurate recommendations
• Scalability
• Millions of users, n-choose-2 pairs
• Large amount of computation power
• Sparsity
• Most users will only have rated a small subset of the overall database
• Even the most popular items have very few ratings

Content-Based Filtering

Motivation
• Problems with user-based collaborative filtering
• Cold start, scalability, sparsity
• Solution: focus on content instead of users
• Avoid nearest-neighbor operations on users

Content-based filtering
• Content-based filtering: Recommend items similar to items the user has liked in the past
• People will like the same things in the future that they liked in the past

Content-based filtering synonyms
• Content-based filtering
• AKA Content-based collaborative filtering
• AKA Item-based collaborative filtering
• AKA “Users who bought x also bought y”

Content-based filtering example
Movie       Decade  Genre
W.          90’s    Biography
Xanadu      80’s    Romance
Youngblood  80’s    Romance
Zorro       90’s    Action

Content-based filtering
• How can we compute item similarity?
• Similar techniques to user-based collaborative filtering
• Euclidean distance, cosine similarity, correlation, tf-idf
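A minimal sketch of item similarity using the decade and genre of the four example movies. Treating each attribute value as a one-hot feature and comparing with cosine similarity is one possible choice, and the function names are hypothetical.

```python
import math

items = {
    "W.":         {"decade": "90's", "genre": "Biography"},
    "Xanadu":     {"decade": "80's", "genre": "Romance"},
    "Youngblood": {"decade": "80's", "genre": "Romance"},
    "Zorro":      {"decade": "90's", "genre": "Action"},
}

def features(item):
    """One-hot feature set, e.g. {"decade=80's", "genre=Romance"}."""
    return {f"{name}={value}" for name, value in items[item].items()}

def item_similarity(a, b):
    """Cosine similarity between two one-hot feature vectors."""
    fa, fb = features(a), features(b)
    return len(fa & fb) / math.sqrt(len(fa) * len(fb))

def recommend(liked, k=1):
    """Rank unseen items by their best similarity to any liked item."""
    candidates = [i for i in items if i not in liked]
    ranked = sorted(
        candidates,
        key=lambda i: max(item_similarity(i, l) for l in liked),
        reverse=True,
    )
    return ranked[:k]

print(item_similarity("Xanadu", "Youngblood"))  # same decade and genre
print(item_similarity("W.", "Zorro"))           # same decade only
print(recommend({"Xanadu"}))                    # most similar unseen movie
```

No other users' ratings are involved: the recommendation depends only on the item attributes and what this one user liked.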

Content-based filtering weaknesses
• Problem: Website doesn’t know all your preferences
• You’ve rated horror movies on Netflix, but you also like
• Problem: Lost opportunity to introduce you to new content
• Solution: Hybrid filtering

Hybrid filtering
• Hybrid filtering: Combine user-based and content-based filtering
• Netflix Example:
• Recommendations based on similar users; AND
• Recommendations based on shows user has rated highly
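One simple way to combine the two sources is a weighted blend of their scores. This is only an illustrative assumption, not how Netflix does it; the weight alpha, the function name, and the scores are all hypothetical.

```python
def hybrid_scores(user_based, content_based, alpha=0.5):
    """Blend two score tables; alpha is the weight on the user-based score.

    An item missing from one table contributes 0 from that table.
    """
    all_items = set(user_based) | set(content_based)
    return {
        item: alpha * user_based.get(item, 0.0)
        + (1 - alpha) * content_based.get(item, 0.0)
        for item in all_items
    }

# Hypothetical scores in [0, 1] from each recommender.
user_based = {"Xanadu": 0.8, "Zorro": 0.4}
content_based = {"Youngblood": 0.9, "Zorro": 0.6}

blended = hybrid_scores(user_based, content_based)
ranked = sorted(blended, key=blended.get, reverse=True)
print(ranked)  # items ordered by blended score, best first
```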

Learning Objectives
• Identify ways websites can be more accessible or less accessible to diverse users
• Describe techniques for user-based collaborative filtering, which recommend items enjoyed by users who are similar
• Use content-based filtering to recommend items similar to items a user has enjoyed in the past