程序代做 DSME5110F: Statistical Analysis

DSME5110F: Statistical Analysis
Lecture 4 Introduction to Probability; Application: Association Rule

Association Rules: Frequently Bought Together

Copyright By PowCoder代写 加微信 powcoder

• Counting Rules
• Probabilities
• Probability Rules
• Conditional Probability
• Association Rule – Support
– Confidence

Important Definitions
• Probability – a number between 0 and 1 that measures the likelihood that some event will occur
– An event with probability 0 cannot occur, whereas an event with probability 1 is certain to occur.
– An event with probability greater than 0 and less than 1 involves uncertainty, and the closer its probability is to 1, the more likely it is to occur.
• Experiment – any process that can result in one of several well-defined outcomes that cannot be predicted with certainty beforehand (e.g. rolling a die, making an investment, etc.)
• Sample space – the set of all possible outcomes resulted from an experiment (written as S). For example,
– Flipping a coin: S = {H, T}
– Rollingadie:S={1,2,3,4,5,6}
– Number of females in a group of 100 randomly selected people: S = {0, 1, …, 100}
• Sample point – a member of the sample space; any one particular experimental outcome

Counting Rule 1: Multiplication Principle
• If an experiment can be characterized as a sequence of 𝑁𝑁 steps with 𝑛𝑛1 possible results on the first step, 𝑛𝑛2 possible results on the second step, etc., then the total number of outcomes for the overall experiment is equal to the product of the number of results on each step:𝑛𝑛 ×𝑛𝑛 ×⋯×𝑛𝑛 .
• Example: A password consists of two letters of the alphabet followed
• Example: In how many different ways can one order soup, a sandwich, a dessert, and a drink for lunch, if there is a choice of four different soups, three kinds of sandwiches, five desserts, and four drinks?
– Thetotalnumberofwaysis4×3×5×4=240.
by three digits chosen from 0 to 9. Repeats are allowed. How many
– The total number of passwords is 26 × 10 = 676,000
different possible passwords are there?

Counting Rule 2: Combination Rule • The number of combinations of 𝑁𝑁 distinct objects taken 𝑛𝑛 at a time (the number of
unique subsets of size 𝑛𝑛)
𝐶𝐶𝑁𝑁=𝑁𝑁= 𝑁𝑁!
𝑛𝑛 𝑛𝑛𝑛𝑛!𝑁𝑁−𝑛𝑛!
Here𝑛𝑛!=𝑛𝑛× 𝑛𝑛−1 ×⋯×1and0!=1.
• R function: choose(N,n)
• Example: Three balls (numbered 1, 2, and 3) are placed in an urn. In drawing 2 balls at random, how many different pairs are possible?
– Sample space is S = {(1; 2); (1; 3); (2; 3)} and contains 3 sample points
– Using R: choose(3, 2)
• Example: How many unique five-card hands are possible with a deck of 52 cards? – 52! = (52 × 51 × 50 × 49 × 48)/120 = 2,598,960
– Using R: choose(52, 5)

• Counting Rules
• Probabilities
• Probability Rules
• Conditional Probability
• Association Rule – Support
– Confidence

Assigning Probabilities
• Basic requirements
– The probability of any sample point must be between 0 and 1, inclusive. – Sum of all sample points’ probabilities must be 1.
• Classical method – the probability that an event will occur is equal to the number of sample
points in the event divided by the total number of sample points in the sample space.
– Applies when all possible outcomes are equally likely and counting or enumeration of outcomes are
possible. It is particularly useful in analysing games of chance.
• Relative frequency method – the probability of an event is the proportion of time that events
of the same kind will occur in the long run.
– Widely used in situations when experimental outcomes are not assumed equally likely and empirical
data for outcomes are available. Based more on real-world empirical observation than on theoretical assumptions about the likelihood of any experimental outcome.
• Subjective method – It interprets probabilities as personal or subjective evaluations. Such
probabilities express the strength of one’s belief with regard to the uncertainties that are
involved (e.g., probability of major economic crisis, etc.)
– This method applies when experimental outcomes are not assumed equally likely and relative
frequency data are either unavailable or uncollectable.

Example 4.1:
Classical Method of Assigning Probabilities
• An experiment consists of drawing one card from a deck of 52 cards.
– How many experimental outcomes are there? 52 –What is the probability of drawing a five of
hearts? 1/52
– What is the probability of drawing a spade? 13/52 = 1/4

Example 4.2:
Classical Method of Assigning Probabilities
• An experiment consists of rolling two six-faced dice.
– How many experimental outcomes are there? 36
– What is the probability that the sum of two dice is 7?
– What is the probability that the sum of two dice is an even number? 18/36 = 1/2
6/36 = 1/6
2 3 4 5 6 7 8 9 10 11 12

Example 4.3: Probabilities of Winning Mark 6 Prizes
Mark Six (Chinese: 六合彩) is a lottery game organized by the Jockey Club. The player selects 6 out of 49 numbers from 1 to 49. The winning numbers, which include 6 numbers plus one extra number, are selected automatically from a lottery machine. The winning probabilities for the prizes are given in the table below.
1 2,330,636
Prize Criteria
1st All 6 drawn numbers
2nd 5 out of 6 drawn numbers, plus the extra number
3rd 5 out of 6 drawn numbers
4th 4 out of 6 drawn numbers, plus the extra number 5th 4 out of 6 drawn numbers
6th 3 out of 6 drawn numbers, plus the extra number 7th 3 out of 6 drawn numbers
49 = 13,983,816
Probability
= 55,491.33
5 491 6 642 4 491 6 642 4 492 6 642 3 492 6642
= 22,196.53
= 1,082.76

Examples of Relative Frequency Method of Assigning Probabilities
• A sales manager wants to estimate the probability that customers will buy her company’s new product. In a test market evaluation, 1000 potential customers were contacted, with the result that 250 purchased the product but 750 did not. What is a reasonable estimate of the probability that a customer will buy the company’s new product?
– P(buy)=250/1000=0.25
• You want to estimate the probability that an unbalanced coin will land on heads. You flipped the coin 1000 times and found that it landed on heads 650 times. What is a reasonable estimate of the probability that this same coin will land on heads in any future flip?
– P(Heads)=650/1000=0.65

Events and Probabilities
• Event – In many problems we are interested in results that are not given directly by a specific sample point of a sample space. Instead, we are interested in a well-defined collection of sample points; a subset of the sample space. Such a subset is called an event.
• Probability of an event – the sum of the probabilities of the sample points comprising that event.

Example 4.2 (Continued)
• Rolling two dice, the set of all possible outcomes is given in the bottom-left chart below.
• Let X be the sum of rolling two dice. The probability of X is given in the bottom-right table, along with the frequency plot.
• We can define the events of this experiments in many different ways. For example,
If we define E as the event that the sum of two dice is even, then
P(E) = P(Sum = 2, 4, 6, 8, 10, or 12) = 1/36 + 3/36 + 5/36 + 5/36 + 3/36 + 1/36 = 18/36 = 1⁄2.
If we define N6 as the event that neither die shows the face with 6 dots, then P(N6) = 25/36.
2 3 4 5 6 7 8 9 10 11 12

• Counting Rules
• Probabilities
• Probability Rules
• Conditional Probability
• Association Rule – Support
– Confidence

Complement of an Event
• The simplest probability rule involves the complement of an event.
• If 𝐴𝐴 is any event, then the complement of 𝐴𝐴, denoted by 𝐴𝐴𝑐𝑐, is the event that 𝐴𝐴 does not occur (consisting of all sample points that are not in event 𝐴𝐴).
• If the probability of 𝐴𝐴 is P 𝐴𝐴 , then the probability of its complement is given bPy (𝐴𝐴𝑐𝑐 ) = 1 – P(𝐴𝐴).

Union and Intersection of Events
• Union of events – a new event contains all the
sample points belonging to event 𝐴𝐴 or event 𝐵𝐵
• Intersection of events – a new event contains all
sample points belonging to both events 𝐴𝐴 and 𝐵𝐵

Mutually Exclusive and Collectively Exhaustive Events
• Events are mutually exclusive if at most one of them can occur. That is, if one of them occurs, then none of the others can occur.
– If A and B are mutually exclusive events, then A ∩ B=∅.
• Events are collectively exhaustive if, together, they exhaust all possibilities – one of the
events must occur.
– If A, B and C are collectively exhaustive events, then P(A∪B∪C)= 1.
Event Event AB
Not mutually exclusive
Not collectively exhaustive
Event Event AB
Mutually exclusive
Collectively exhaustive

Union of Events: Addition Rule
• For two events A and B, the probability that either A or B occurs is:
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
• If A and B are mutually exclusive, then
P(A ∪ B) = P(A) + P(B)

• Counting Rules
• Probabilities
• Probability Rules
• Conditional Probability
• Association Rule – Support
– Confidence

Conditional Probability
• Let 𝐴𝐴 and 𝐵𝐵 be any events with probabilities P(𝐴𝐴) and
P(𝐵𝐵), respectively.
– Typically, the probability P(𝐴𝐴) is assessed without knowledge of
whether 𝐵𝐵 occurs.
• However, if you are told that 𝐵𝐵 has occurred, then the
probability of 𝐴𝐴 might change.
– The new probability of 𝐴𝐴 is called the conditional probability of
𝐴𝐴 given 𝐵𝐵.
• Conditional probability of A given B:
P𝐴𝐴∩𝐵𝐵 =P𝐴𝐴𝐵𝐵 ×P(𝐵𝐵)
P(𝐴𝐴|𝐵𝐵) = P(𝐴𝐴∩𝐵𝐵)/P(𝐵𝐵)
• The above equation suggests that the joint probability of A

Probabilistic Independence
• Two events are independent when the probability that one event will happen does not depend on whether or not the other event has happened.
• That is, if A and B are independent events, then P𝐴𝐴𝐵𝐵 =P𝐴𝐴 andP𝐵𝐵𝐴𝐴 =P𝐵𝐵
• For independent events A and B, their joint probability is:
P 𝐴𝐴∩𝐵𝐵 =P(𝐴𝐴)×P(𝐵𝐵)

• Counting Rules
• Probabilities
• Probability Rules
• Conditional Probability
• Association Rule – Support
– Confidence

Background
• The availability of detailed information on customer transactions (market basket databases)
– E.g., using bar-code scanners in supermarket…
– Each record lists all items bought by a customer on a
• Want to know if certain groups of items are consistently purchased together
– “what goes with what”
– Such information can be used for making decisions on store layouts, item placement, cross-selling, promotions, catalog design, identifying customer segments based on buying patterns…
single-purchase transaction

Association Rules
• Also called “market basket analysis” and “affinity analysis”
• Association rules are always composed from subsets of itemsets and are denoted by relating one itemset on the left-hand side (LHS) of the rule to another itemset on the right-hand side (RHS) of the rule
– {peanut butter, jelly}{bread}
– “peanut butter and jelly imply bread”
– The LHS is the condition that needs to be met in order to trigger the rule, and the RHS is the expected result of meeting that condition

Example 4.4: Purchases of Phone Faceplates
• A store sells accessories for cellular phones
• Customers who purchase multiple faceplates from a choice of six different colors get a discount
• The store manager wants to know what colors of faceplates customers are likely to purchase together
• Transaction Dataset:

Rules: If-then
• The idea: examine all possible rules between items in an if- then format, select only those that are most likely to be indicators of true dependence
– “if” part = antecedent
– “then” part = consequent
– A rule can be represented as antecedentconsequent • Therulesareprobabilisticinnature
– “Itemset” = the set with items (e.g., products) comprising the antecedent, or consequent, or both
• Antecedent and consequent are disjoint (i.e., have no items in common)
• Itemsets are not records of what people buy – they are simply possible combinations of items, including single items

Many Rules are Possible
• For example: Transaction 1 supports several rules, such as
– “If red, then white” (“If a red faceplate is purchased, then so is a white one”)
• The antecedent is {red} and the consequent is {white} • {Red}{White}
– “If white, then red”
• The antecedent is {white} and the consequent is {red} • {White}{Red}
– “If red and white, then green”
• The antecedent is {red, white}, and the consequent is {green} • {Red, White}{Green}
– + many more

Process of Rule Selection
• Generate all rules that meet specified support & confidence
• Two stages:
– Find all “frequent” itemsets
• all the itemsets that meet a minimum support threshold – From these itemsets, generate association rules
with sufficient confidence
• meeting a minimum confidence threshold

• Counting Rules
• Probabilities
• Probability Rules
• Conditional Probability
• Association Rule – Support
– Confidence

Frequent Itemsets
• The first step in association rules is to generate all the rules that would be candidates for indicating associations between items
• Ideally, we want to create all possible combinations of items
– Problem (Curse of Dimensionality): Generating all these combinations requires a long computation time that grows exponentially in the number of items p
– Example:
• if there are 6 items (p=6), there are 602 rules • if there are 7 items (p=7), there are 1932 rules
• Solution: consider only “frequent itemsets” – Criterionforfrequent:support

Support (Measure of Frequent)
• Support = a percentage of the total number of records that include both the antecedent and the consequent in the database
– Example: support for the itemset {red, white} is 4 out of 10 transactions, or 40%
– Measures the degree to which the data “support” the validity of the rule
• A frequent itemset is therefore defined as an itemset that has a support that exceeds a selected minimum support, determined by the manager

Apriori principle
• All subsets of a frequent itemset must also be frequent
– If {A, B} is frequent, then {A} and {B} must both be frequent
• Therefore, if we know that {A} does not meet a desired support threshold, there is no reason to consider {A, B} or any itemset containing A

Apriori Algorithm 1. User sets a minimum support criterion
For 𝑝𝑝 products…
Next, generate list of one-itemsets that meet the support criterion – Frequencyofthetransactionsincludingtheitem
Use the list of one-itemsets to generate list of two-itemsets that meet the support criterion
– If a certain one-itemset did not exceed the minimum support, any larger size itemset that includes it will not exceed the minimum support either
Continue for 𝑘𝑘 = 1,2,⋯,𝑝𝑝
Use list of two-itemsets to generate list of three-itemsets
– In general, generating 𝑘𝑘-itemsets uses the frequent (𝑘𝑘 − 1)-itemsets that were generated in the preceding step

Example 4.4: Phone Faceplate
• Assume minimum support is 0.2
• Iteration 1: {red}, {white}, {blue}, {orange}, {green} are all frequent
• Iteration 2: need to consider all possible combinations
– {Red, White}, {Red, Blue}, {Red, Green}, {White, Blue}, {White, Orange}, {White, Green} are frequent
• Iteration 3: need to consider the possible combinations: {Red, White, Blue}, {Red, White, Green}:
– {Red,White,Blue}and{Red,White,Green}arefrequent
– We do not need to consider other combinations such as {Red, White, Orange} (because {Red, Orange} is not frequent), {White, Blue, Orange} (because {Blue, Orange} is not frequent)
• Iteration 4: the only possible 4-item combination is {Red, White, Blue, Green} but we do not need to consider it because {Blue, Green} is not frequent

Example 4.4: Phone Faceplates
Itemset Support
{red} 0.5 {white} 0.8 {blue} 0.5 {orange} 0.3 {green} 0.2 {red, white} 0.4 {red, blue} 0.3 {red, green} 0.2 {white, blue} 0.4 {white, orange} 0.3 {white, green} 0.2 {red, white, blue} 0.2 {red, white, green} 0.2

Possible Rules
• Then in the confidence phase, one needs to consider all the possible rules from
– Two-itemset{White,Orange}
• For a two-itemset {A, B}, two rules are possible: • ABandBA
– Three-itemsets{Red,White,Blue}and{Red,White,Green}: • For a three-itemset {A, B, C}, 12 rules are possible:
• AB, AC, A{B, C}?
• BA, BC, B{A, C}?
• CA, CB, C{A, B}?
• {A, B}C? {A, C}B? {B, C}A?
– Hencetherearetotally2+2*12=26rules
• Among those association rules, the second step is to identify rules with sufficient confidence

• Counting Rules
• Probabilities
• Probability Rules
• Conditional Probability
• Association Rule – Support
– Confidence

Measures of Performance: Confidence
• Confidence of a rule: the % of antecedent transactions that also have the consequent itemset
Confidence =
no. transactions with both antecedent and consequent itemsets no. transactions with antecedent itemset
• Example: a supermarket database
– 100,000point-of-saletransactions.
– 2000 include both orange juice and apple, and 800 of these include soup purchases.
– The association rule “IF orange juice and apple are purchased THEN soup is purchased on the same trip” has a support of 0.8% (= 800/100,000) and a confidence of 40% (= 800/2000).

Measures of Performance: Lift Ratio
• Benchmark confidence = transactions with consequent as % of all
Benchmark Confidence = no. transactions with consequent itemset no. transactions in database
• Lift ratio = confidence/(benchmark confidence)
• A lift ratio greater than 1 suggests a rule that is useful in finding
– The level of association between the antecedent and consequent itemsets is higher than would be expected if they were independent
– Thelargertheliftratio,thegreaterthestrengthoftheassociation
transactions
consequent itemsets

Mathematical Representations
• Support of an itemset:
support(X) = count X
– Here N is the number of transactions in the database and
count X is the number of transactions containing itemset X. – The support of a rule (XY) is the support of {X, Y}
• Confidence of a rule:
confidence(XY)= support(X,Y)/support(X)
• Lift ratio of a rule
lift(XY) = confidence(XY)/support(Y)

Example: Rules from {red, white, green}
{red, white} => {green} {green} => {red}
{white, green} => {red}
support of {red, white, green} support of {red, white}
confidence of rule 50%
Lift = = 2.5 benchmark confidence 20%
Confidence
= 0.2/0.4 = 50%
support of {green, red} = 0.2/0.2 = 100% support of {green}
confidence of rule = 100% = 2 benchmark confidence 50%
support of {white, green, red} = 0.2/0.2 = 100% support of {white, green}
confidence of rule = 100% = 2 benchmark confidence 50%
• If the desired minimum confidence is 70%, report only rules 2, 3

• Support: indicates the impact in terms of overall size
– If only a small number of transactions are affected, the rule may be of little use
• Confidence: useful in determining the business or operational usefulness of a rule
– usefulindeterminingthebusinessoroperationalusefulnessofarule
• Lift ratio: how effective the rule is in finding consequents,
– Anefficientruleispreferredtoaninefficientrule
– However, a very efficient rule that has very low support may not be as desirable as a less efficient rule with much greater support.
comparing to random selection

Alternate Data Format: Binary Incidence Matrix
• Rows represent transactions, columns are items, and each cell has either 1 or 0, indicating the presence or absence of an item in the transaction
• Example: the phone faceplates
Transaction Red White Blue Orange Green Yellow 1110010 2010100 3011000 4110100 5101000 6

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com