Week 5 Summary (DSME5110F)
Association Rule (Market Basket Analysis)
• Objective: find interesting associations between items (“what goes with what?”)
• Idea:
– Examine all possible rules between items in an if-then format
∗ Problem (Curse of Dimensionality): the number of possible rules grows exponentially in the
number of items
– Solution: consider only “frequent itemsets”
∗ Support, Confidence, Lift Ratio
∗ Apriori principle: all subsets of a frequent itemset must also be frequent
∗ Apriori algorithm
• How to implement in R:
– apriori in the arules package: apriori(data = mydata, parameter = )
∗ Only creates rules with one item as the consequent
– inspect(Myrules)
• Review is required to identify useful rules and to reduce redundancy
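
The three measures and the Apriori principle above can be illustrated with a small sketch. It is written in Python rather than R so that it runs without extra packages, and it is not how the arules package is implemented. The transactions, the 0.6 support threshold, and the helper names (`support`, `frequent_itemsets`, `rule_metrics`) are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions (not course data).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(min_support):
    """Level-wise search: size-k candidates are built from frequent size-(k-1) sets."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items if support([i]) >= min_support]
    frequent = list(level)
    k = 2
    while level:
        prev = set(level)
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Apriori pruning: drop a candidate if any (k-1)-subset is not frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = [c for c in candidates if support(c) >= min_support]
        frequent.extend(level)
        k += 1
    return frequent

def rule_metrics(antecedent, consequent):
    """Support, confidence and lift ratio of the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent))
    confidence = both / support(antecedent)   # P(consequent | antecedent)
    lift = confidence / support(consequent)   # confidence relative to the benchmark
    return both, confidence, lift

freq = frequent_itemsets(min_support=0.6)
supp, conf, lift = rule_metrics({"diapers"}, {"beer"})
print(len(freq), supp, conf, lift)
```

On this toy data the rule {diapers} -> {beer} has a lift ratio above 1, meaning that beer appears more often in baskets with diapers than in baskets overall. The pruning step is where the Apriori principle saves work: {bread, diapers, beer} is never counted because its subset {bread, beer} is already known to be infrequent.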
Bayes’ formula
• A formula to calculate conditional probabilities
• $P(B|A) = \frac{P(A|B)P(B)}{P(A|B)P(B) + P(A|B^c)P(B^c)}$
• Application: Problem
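
As a minimal numeric sketch of the formula, the Python snippet below computes P(B|A) from P(B), P(A|B) and P(A|B^c). The function name `posterior` and the numbers (a 1% base rate, a 95% true-positive rate, a 5% false-positive rate) are hypothetical and are not taken from the course problem.

```python
def posterior(p_b, p_a_given_b, p_a_given_not_b):
    """P(B|A) by Bayes' formula; the denominator is P(A) by total probability."""
    numerator = p_a_given_b * p_b
    denominator = numerator + p_a_given_not_b * (1 - p_b)
    return numerator / denominator

# Hypothetical numbers: B = "has the condition", A = "test is positive".
p = posterior(p_b=0.01, p_a_given_b=0.95, p_a_given_not_b=0.05)
print(round(p, 3))  # prints 0.161
```

Even with a fairly accurate test, the posterior stays low here because the prior P(B) is small, which is the kind of result the formula is typically used to expose.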
Naïve Bayes classifier
• Use $\prod_{i=1}^{p} P(X_i = x_i | Y = l)$ to approximate $P(X_1 = x_1, X_2 = x_2, \cdots, X_p = x_p | Y = l)$
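• Why not exact Bayes? Because when calculating P(B|A), the estimated P(A) = 0 is possible in many situations (for example, when no training record matches the exact combination of predictor values)
• How to implement in R: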
– naiveBayes (gives you the class probabilities and the conditional probabilities)
– predict the observation to be in the class with the largest probability
• Comments:
– The resulting ranking of observations is more accurate than the estimated probabilities themselves
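
The naïve Bayes calculation can be sketched by hand on a tiny data set. The Python code below is an illustration under stated assumptions, not the implementation behind the R function naiveBayes: the training records, predictor names, and the helper `class_scores` are hypothetical, and no smoothing is applied, so a predictor value never seen within a class gives that class a score of zero.

```python
from collections import Counter, defaultdict

# Hypothetical training records: (predictor values, class label).
records = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
]

class_counts = Counter(label for _, label in records)
# value_counts[label][(predictor, value)] = count of class-`label` records with that value
value_counts = defaultdict(Counter)
for x, label in records:
    for item in x.items():
        value_counts[label][item] += 1

def class_scores(x):
    """Normalised P(Y = l) * prod_i P(X_i = x_i | Y = l) for each class l."""
    scores = {}
    for label, n in class_counts.items():
        score = n / len(records)                     # prior P(Y = l)
        for item in x.items():
            score *= value_counts[label][item] / n   # P(X_i = x_i | Y = l)
        scores[label] = score
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

new_obs = {"outlook": "sunny", "windy": "no"}
probs = class_scores(new_obs)
# Predict the class with the largest estimated probability.
prediction = max(probs, key=probs.get)
print(prediction, probs)
```

Because each conditional probability is estimated one predictor at a time, the product is usable even when the exact combination of predictor values is rare or absent in the training data.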