Pattern Analysis & Machine Intelligence Research Group
ECE 657A: Lecture 8 – Association Rule Mining
Mark Crowley
Mining Association Rules
Material in this section is based on the following references:

1. Margaret Dunham, Data Mining Introductory and Advanced Topics, Prentice Hall, 2003.
2. Jiawei Han, Micheline Kamber and Jian Pei, Data Mining: Concepts and Techniques, 3rd ed., Morgan Kaufmann Publishers, May 2011.
3. P. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining, Addison Wesley, 2006.
4. Rakesh Agrawal and Ramakrishnan Srikant, Fast algorithms for mining association rules in large databases, Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), pages 487–499, 1994.
Example: Market Basket Transactions

Assume we have the following transaction data set of 5 items (shirt, trouser, tie, socks, belt).
Possible itemsets: 2^5 − 1 = 31

Transaction   Items
t1            shirt, trouser, tie
t2            shirt, tie
t3            trouser, socks, tie
t4            trouser, belt
t5            shirt, trouser
t6            shirt, socks
Association Rules

• Information to be gathered from datasets and databases
• Finding relationships between items in a dataset
• Finding associations between items that typically occur together
Applications of Using These Rules

For transaction datasets, the rules can be used for:
a) Organization of items (e.g. in a market: shirts & ties, trousers, belts)
b) Prediction of sales and inventory (shopping behaviour)
c) Advertising or promotion
d) Recommendation systems (Amazon: customers who bought x also bought y…)

For monitoring datasets, the rules can be used to predict faults or load on the system, etc. (news, movies, CDs)
Formulation of the Problem

Given a dataset of transactions

D = {t1, t2, …, tn}

where each

ti = {Ii1, Ii2, …, Iik}

is a set of items from the set of possible items I = {I1, …, Im}, such that k ≤ m.
Formulation of the Problem

An association rule is an implication of the form

x ⇒ y

where x, y ⊂ I are sets of items called itemsets, and x ∩ y = ∅ (no intersection: no item may appear on both sides, since otherwise the rule would be trivially circular).

An itemset can be viewed as a conjunction of items.
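To make the formulation concrete, transactions and itemsets can be represented as Python frozensets, so the disjointness condition becomes a simple set check. This representation is our own illustration, not part of the original formulation.

```python
# One transaction and one candidate rule x => y, as frozensets.
t1 = frozenset({"shirt", "trouser", "tie"})   # transaction t1
x = frozenset({"tie"})                        # antecedent itemset
y = frozenset({"shirt"})                      # consequent itemset

assert x & y == frozenset()    # x ∩ y = ∅: no item appears on both sides
assert (x | y) <= t1           # t1 contains x ∪ y, so t1 supports the rule
```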
Definition of Support of an Item or Itemset

The support of an item (or itemset) x is the percentage of transactions in which it occurs:

support(x) = |x| / n

where |x| is the number of transactions containing x and n is the total number of transactions.
Example: Computing Support

For the transaction data set above, we can now compute the support of each itemset:

Itemset                  Support   %
shirt                    4/6       66
trouser                  4/6       66
tie                      3/6       50
socks                    2/6       33
belt                     1/6       16
shirt, trouser           2/6       33
shirt, tie               2/6       33
shirt, socks             1/6       16
shirt, belt              0/6       0
trouser, tie             2/6       33
trouser, socks           1/6       16
trouser, belt            1/6       16
tie, socks               1/6       16
tie, belt                0/6       0
socks, belt              0/6       0
shirt, trouser, tie      1/6       16
shirt, trouser, socks    0/6       0
shirt, trouser, belt     0/6       0
trouser, socks, tie      1/6       16
(all other 3-itemsets)   0/6       0
(all 4-itemsets)         0/6       0
(the 5-itemset)          0/6       0
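These values are easy to sanity-check programmatically. Below is a minimal Python sketch; the transactions list and the support helper are our own scaffolding rather than anything from the slides.

```python
# The six example transactions as sets of items.
transactions = [frozenset(t) for t in [
    {"shirt", "trouser", "tie"}, {"shirt", "tie"},
    {"trouser", "socks", "tie"}, {"trouser", "belt"},
    {"shirt", "trouser"}, {"shirt", "socks"},
]]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

print(support(frozenset({"shirt"})))         # 4/6 ≈ 0.67
print(support(frozenset({"shirt", "tie"})))  # 2/6 ≈ 0.33
print(support(frozenset({"tie", "belt"})))   # 0/6 = 0.0
```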
Support of an Association Rule

For an association rule x ⇒ y, the support is the percentage of transactions that contain x ∪ y:

s = support(x ⇒ y) = #(x ∪ y) / n

i.e. the frequency of the rule in the dataset.
Confidence of an Association Rule

The confidence (or strength) α of an association rule x ⇒ y is the ratio of
• the number of transactions that contain x ∪ y
• to the number of transactions that contain x:

α = confidence(x ⇒ y) = #(x ∪ y) / #(x) = s(x ⇒ y) / s(x)
Example

Notice the non-symmetry of α: the denominator differs between x ⇒ y and y ⇒ x.
(Items can be categorical or nominal; the approach can be generalized for interval data as well.)

x ⇒ y                    s      α
shirt ⇒ tie              33%    50%
tie ⇒ shirt              33%    66%
shirt, tie ⇒ trouser     16%    50%
tie ⇒ belt               0%     0%
⋮
Association Rule Mining Problem

The association rule mining problem is to find the association rules whose support and confidence are above specified support and confidence thresholds (s′, α′).

Brute Force Method

List all possible rules, calculate their support and confidence, and select those larger than the specified thresholds.

Note: this is often not practical for large data sets; m items give 2^m − 1 possible itemsets, and still more candidate rules.
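As a sketch of this brute-force method on the six-transaction example (using the thresholds s′ = 17% and α′ = 51% introduced in the Apriori example below; the names transactions, support, s_min, a_min are our own scaffolding), one can enumerate every itemset and every way to split it into a rule x ⇒ y:

```python
from itertools import combinations

transactions = [frozenset(t) for t in [
    {"shirt", "trouser", "tie"}, {"shirt", "tie"},
    {"trouser", "socks", "tie"}, {"trouser", "belt"},
    {"shirt", "trouser"}, {"shirt", "socks"},
]]
items = sorted({i for t in transactions for i in t})

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

s_min, a_min = 0.17, 0.51   # thresholds s' and alpha'

for r in range(2, len(items) + 1):
    for itemset in map(frozenset, combinations(items, r)):
        s = support(itemset)          # rule support = itemset support
        if s < s_min:
            continue
        for k in range(1, r):
            for x in map(frozenset, combinations(itemset, k)):
                alpha = s / support(x)   # confidence of x => (itemset - x)
                if alpha >= a_min:
                    print(set(x), "=>", set(itemset - x),
                          f"(s = {s:.0%}, alpha = {alpha:.0%})")
```

On this data only tie ⇒ shirt and tie ⇒ trouser survive both thresholds, but the nested enumeration makes the exponential blow-up with m obvious.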
Observations

Observation 1: "large support rules" are composed of large support itemsets:

support(x ⇒ y) = support(x ∪ y)

We need an efficient way to generate large support itemsets (call them large itemsets; large means support ≥ s′).

Observation 2: any subset of a large itemset is also large:

support(x ∪ y) ≤ support(x)
support(x ∪ y) ≤ support(y)

since x (or y) is counted in every occurrence of x ∪ y, plus in its own separate occurrences.
Support Thresholds

If support(x ∪ y) > s′, then

support(x) > s′ and support(y) > s′.

Also, a superset of an itemset that is not large is not large:

if support(x) < s′ then support(x ∪ y) < s′

(x, y can themselves be subsets)
The Apriori Algorithm
(the most common algorithm, used in commercial packages)

Notation:

D         dataset of transactions
I         itemsets (the set of possible items)
L         the set of large itemsets
Li        large itemsets of size i (consisting of i items)
Ci        candidate itemsets of size i
s′, α′    support and confidence thresholds
The Apriori Algorithm (Agrawal & Srikant, 1994)

Start with single items and find the set L1 of large itemsets of size 1 (support > s′). Apply joins between the items of L1 to form candidate sets of size 2, C2, and so on.

Example: suppose s′ = 17% and α′ = 51%.

C1 = {shirt}, {trouser}, {tie}, {socks}, {belt}    (C1 is the set of all available items at the start)
L1 = {shirt}, {trouser}, {tie}, {socks}            (belt is dropped: its support of 16% is < 17%)
Joining and Pruning

Joining L1 with itself gives the size-2 candidates:

C2 = {shirt, trouser}, {shirt, tie}, {shirt, socks}, {trouser, tie}, {trouser, socks}, {tie, socks}
L2 = {shirt, trouser}, {shirt, tie}, {trouser, tie}
C3 = {shirt, trouser, tie}
L3 = φ

Suppose {shirt, tie} were not large. Should C3 then be φ? Yes: one of the candidate's subsets would not be large, so {shirt, trouser, tie} would be pruned from C3.

Generating rules: for each ℓ ∈ L, for each x ⊂ ℓ with x ≠ φ, if support(ℓ) / support(x) ≥ α′ then accept the rule x ⇒ (ℓ − x).

Apriori Algorithm Example

s(shirt, trouser) / s(shirt)    = 2/4 = 50%  → reject shirt ⇒ trouser
s(shirt, trouser) / s(trouser)  = 2/4 = 50%  → reject trouser ⇒ shirt
s(tie, shirt) / s(tie)          = 2/3 = 66%  → accept tie ⇒ shirt
s(shirt, tie) / s(shirt)        = 2/4 = 50%  → reject shirt ⇒ tie
s(tie, trouser) / s(tie)        = 2/3 = 66%  → accept tie ⇒ trouser
s(trouser, tie) / s(trouser)    = 2/4 = 50%  → reject trouser ⇒ tie

Apriori Algorithm

The algorithm iteratively builds the list of large itemsets through multiple passes over the database:

K = 1
L = L1 = the itemsets of size 1 that meet the s′ threshold
repeat
    K = K + 1
    CK = Apriori-Gen(L(K−1), s′)      // generate candidates: join step and prune step
    initialize the count of each candidate in CK to 0
    for each transaction in D
        increment the count of any matching candidate in CK
    endfor
    LK = itemsets from CK with count ≥ s′
    L = L ∪ LK
until (LK is empty) or (K = max # of items)
return L

Apriori-Gen generates the next candidate set from the previous large itemsets by scanning pairs for possible joins:

Apriori-Gen(L(i−1), s′)
    Ci = φ
    for each I ∈ L(i−1) do
        for each J ≠ I ∈ L(i−1) do
            if i−2 of the elements in I and J are equal then
                Ci = Ci ∪ {I ∪ J}
    remove any candidates with support < s′
    remove any candidates having a subset that is not large (< s′)

The complexity of Apriori is O(m) scans of the database, where m is the number of items; each scan goes through the n transactions, i.e. O(nm).
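Below is a minimal runnable sketch of the whole Apriori loop in Python on the example data, assuming frozenset itemsets; the function names apriori and apriori_gen are our own, and the support filtering of candidates is done in the main loop rather than inside the generation step.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def apriori_gen(prev_large, k):
    """Join step: union pairs of large (k-1)-itemsets into k-itemsets,
    then prune any candidate with a (k-1)-subset that is not large."""
    candidates = set()
    for a in prev_large:
        for b in prev_large:
            c = a | b
            if len(c) == k and all(frozenset(s) in prev_large
                                   for s in combinations(c, k - 1)):
                candidates.add(c)
    return candidates

def apriori(transactions, s_min):
    """Return all large itemsets, built level by level."""
    items = {i for t in transactions for i in t}
    lk = {frozenset([i]) for i in items
          if support(frozenset([i]), transactions) >= s_min}
    large, k = set(lk), 1
    while lk:
        k += 1
        ck = apriori_gen(lk, k)
        lk = {c for c in ck if support(c, transactions) >= s_min}
        large |= lk
    return large

transactions = [frozenset(t) for t in [
    {"shirt", "trouser", "tie"}, {"shirt", "tie"},
    {"trouser", "socks", "tie"}, {"trouser", "belt"},
    {"shirt", "trouser"}, {"shirt", "socks"},
]]
# With s' = 17% this returns exactly L1 ∪ L2 from the worked example;
# {shirt, trouser, tie} is generated as a candidate but rejected at 16%.
for itemset in sorted(apriori(transactions, 0.17), key=len):
    print(set(itemset))
```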
Issues & Further Development

A. How to deal with large databases

The Apriori algorithm requires a large number of scans. Some approaches for dealing with this include:

• Sampling the database (reduce the support threshold to get the largest itemsets in the sample, then scan the database to get the larger itemsets)
• Partitioning the database (large itemsets must be large in at least one of the partitions)
• Parallel & distributed algorithms (task parallel or data parallel)
• Hashing functions & tree structures

B. Dealing with quantitative data

Rather than dealing with categorical values only, some attributes will have numeric values. The range of these values may be partitioned into intervals, and the intervals are then used as values in the itemsets, which may increase the number of possible items.

E.g. a rule might be: a customer who buys a shirt between $40 and $60 will also buy a tie between $20 and $30. Support then needs to check the matching interval or a superset of it.

Measuring the Quality of Rules

Other measures for association use probabilities:

Support:      s(A ⇒ B) = P(A, B)        (frequency of A, B)
Confidence:   α(A ⇒ B) = P(B | A)       (frequency(A, B) / frequency(A))

Problem with confidence: it ignores P(B), so the obviousness of a rule has no impact; the prior probability of the association is ignored.

Interest/Lift/Surprise and Leverage

Lift:

lift(A ⇒ B) = support(A ∪ B) / (support(A) ⋅ support(B))

• measures how many more times A and B occur together than expected if they were independent
• problem: lift is symmetric, so the direction of the rule isn't clear

Leverage:

leverage(A ⇒ B) = support(A ∪ B) − support(A) ⋅ support(B)

• measures the difference between A and B appearing together and what would be expected if they were independent

Conviction

A ⇒ B ≡ ¬A ∨ B ≡ ¬(A ∧ ¬B)

conviction(A ⇒ B) = support(A) ⋅ support(¬B) / support(A ∩ ¬B)

• the inverse of interest
• measures the frequency of A occurring with the absence of B
• value = 1 if A and B are not related; the value approaches infinity if they are always related
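To make these measures concrete, here is a small Python sketch (our own helper, not from the slides) evaluating the rule tie ⇒ shirt on the example transactions. Although its confidence is 66%, its lift, leverage, and conviction all indicate independence, since 66% is exactly the prior P(shirt): this is the problem with confidence noted above.

```python
def measures(x, y, transactions):
    """Lift, leverage, and conviction for the rule x => y."""
    n = len(transactions)
    s_x = sum(x <= t for t in transactions) / n
    s_y = sum(y <= t for t in transactions) / n
    s_xy = sum((x | y) <= t for t in transactions) / n
    s_x_not_y = sum(x <= t and not y <= t for t in transactions) / n
    lift = s_xy / (s_x * s_y)
    leverage = s_xy - s_x * s_y
    # conviction = support(x) * support(not y) / support(x and not y);
    # it is infinite when the rule never fails (x always occurs with y).
    conviction = s_x * (1 - s_y) / s_x_not_y if s_x_not_y else float("inf")
    return lift, leverage, conviction

transactions = [frozenset(t) for t in [
    {"shirt", "trouser", "tie"}, {"shirt", "tie"},
    {"trouser", "socks", "tie"}, {"trouser", "belt"},
    {"shirt", "trouser"}, {"shirt", "socks"},
]]
# tie => shirt: confidence is 66%, but lift ≈ 1.0, leverage ≈ 0.0 and
# conviction ≈ 1.0, i.e. tie and shirt are statistically independent here.
print(measures(frozenset({"tie"}), frozenset({"shirt"}), transactions))
```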
Web Mining

The web has a lot of data that is targeted for data mining tasks (see http://www.worldwidewebsize.com/ for an estimate of its size).

• Web page content: text, images, videos, audio, blogs
• Web usage: log records of use, log records of servers
• Web structure: links, ranking, tags

The data is dynamic, mainly semi-structured, linked, noisy, and redundant. (Margaret Dunham, Data Mining Introductory and Advanced Topics, Prentice Hall, 2003.)

Web Content Mining

Data: webpages, search engine results
Purpose:
• Improve search engine results: returning more relevant content
• Personalization and recommendation
Methods and Tools: text mining, multimedia mining, sentiment and opinion mining, classification, clustering, association rules

Web Usage Mining

Data: access log files on servers, user profiles, cookies
Purpose:
• Improve server performance: speeding response by prediction
• E-commerce: identify target clients and potential preferences
• Page ranking
Methods and Tools: data structures, information retrieval, clustering of users, association rules