Chapter 8. Classification: Basic Concepts
n Classification: Basic Concepts
n Decision Tree Induction
n Bayes Classification Methods
n Rule-Based Classification
n Model Evaluation and Selection
n Summary

Using IF-THEN Rules for Classification
n A rule-based classifier uses a set of IF-THEN rules for classification.
n An IF-THEN rule is an expression in the following format: IF condition THEN conclusion
n Here is one example rule:
R1: IF age = youth AND student = yes THEN buys_computer = yes
n Rule antecedent (or precondition): “age = youth AND student = yes”
n Rule consequent: “buys_computer = yes”
n R1 can also be written as:
R1: (age = youth) AND (student = yes) => (buys_computer = yes)

Using IF-THEN Rules for Classification
n If the condition (i.e., all the attribute tests) in a rule antecedent holds true for a given tuple, we say that the rule antecedent is satisfied (or simply, that the rule is satisfied) and that the rule covers the tuple.
n Assessment of a rule: coverage and accuracy
n D: training data set, containing |D| tuples
n n_covers = # of tuples covered by R
n n_correct = # of tuples correctly classified by R
n coverage(R) = n_covers / |D|
n accuracy(R) = n_correct / n_covers

Using IF-THEN Rules for Classification
n Let’s go back to the example data set in Table 8.1.
n These are class-labeled tuples from the AllElectronics customer database.
n Our task is to predict whether a customer will buy a computer.
n Consider rule R1: IF age = youth AND student = yes THEN buys_computer = yes
n R1 covers 2 of the 14 tuples.
n R1 can correctly classify both tuples.
n Therefore, we have:
Coverage(R1) = 2/14 = 14.29%
Accuracy(R1) = 2/2 = 100%
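
The coverage and accuracy of R1 can be checked with a short Python sketch. The rows below are an assumption: they reconstruct only the age, student, and buys_computer columns of Table 8.1, which is not reproduced in this section. The helper name `covers` and the variable names are illustrative only.

```python
# Each tuple: (age, student, buys_computer).
# Assumed reconstruction of the relevant columns of Table 8.1 (AllElectronics).
D = [
    ("youth", "no", "no"), ("youth", "no", "no"),
    ("middle_aged", "no", "yes"), ("senior", "no", "yes"),
    ("senior", "yes", "yes"), ("senior", "yes", "no"),
    ("middle_aged", "yes", "yes"), ("youth", "no", "no"),
    ("youth", "yes", "yes"), ("senior", "yes", "yes"),
    ("youth", "yes", "yes"), ("middle_aged", "no", "yes"),
    ("middle_aged", "yes", "yes"), ("senior", "no", "no"),
]

def covers(t):
    """Antecedent of R1: age = youth AND student = yes."""
    age, student, _ = t
    return age == "youth" and student == "yes"

R1_CLASS = "yes"  # consequent of R1: buys_computer = yes

covered = [t for t in D if covers(t)]
n_covers = len(covered)                                   # tuples covered by R1
n_correct = sum(1 for t in covered if t[2] == R1_CLASS)   # covered and correctly classified

coverage = n_covers / len(D)
accuracy = n_correct / n_covers

print(f"coverage(R1) = {n_covers}/{len(D)} = {coverage:.2%}")
print(f"accuracy(R1) = {n_correct}/{n_covers} = {accuracy:.2%}")
```

Note that accuracy is measured only over the tuples the rule covers, so a rule with very low coverage can still reach 100% accuracy.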

Using IF-THEN Rules for Classification
n Let’s see how we can use rule-based classification to predict the class label of a given tuple, X.
n If a rule is satisfied by X, the rule is said to be triggered. For example, suppose we have:
n X satisfies R1, which triggers the rule. If R1 is the only rule satisfied, then the rule fires by returning the class prediction for X.
n Note that triggering does not always mean firing because there may be more than one rule that is satisfied!
n If more than one rule is triggered, we have a potential problem. What if each of them leads to a different class?

Using IF-THEN Rules for Classification
n If more than one rule is triggered, we need a conflict resolution strategy to figure out which rule gets to fire and assign its class prediction to X.
n There are many possible strategies. We will study two strategies: size ordering and rule ordering.
n Size Ordering:
n The size ordering scheme assigns the highest priority to the triggering rule that has the “toughest” requirements, where toughness is measured by the rule antecedent size.
n That is, the triggering rule with the most attribute tests is fired.
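
Size ordering can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the representation of a rule as an (antecedent, class) pair, the one-test rule, the tuple `x`, and the function names are all assumptions made for the example. Only the two-test rule corresponds to R1 from the text.

```python
# A rule is (antecedent, predicted_class); the antecedent maps attribute -> required value.
RULES = [
    ({"age": "youth"}, "no"),                     # hypothetical rule with 1 attribute test
    ({"age": "youth", "student": "yes"}, "yes"),  # R1 from the text, 2 attribute tests
]

def triggered(rules, x):
    """Return the rules whose antecedent is satisfied by tuple x."""
    return [(ante, cls) for ante, cls in rules
            if all(x.get(a) == v for a, v in ante.items())]

def fire_by_size(rules, x):
    """Size ordering: fire the triggered rule with the most attribute tests."""
    hits = triggered(rules, x)
    if not hits:
        return None  # no rule covers x; a default rule would be needed
    _, cls = max(hits, key=lambda rule: len(rule[0]))
    return cls

x = {"age": "youth", "student": "yes"}  # hypothetical tuple X
print(fire_by_size(RULES, x))
```

Both rules are triggered by `x`, but only the rule with two attribute tests fires. When two triggered rules have the same size, `max` returns the first one encountered, so a real system would need an explicit tie-breaking policy.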

Using IF-THEN Rules for Classification
n Rule Ordering:
n The rule ordering scheme prioritizes the rules in advance. The ordering may be class-based or rule-based.
n Class-based Ordering:
n The classes are sorted in order of decreasing “importance”.
n For example, the classes can be sorted in decreasing order of prevalence. That is, all the rules for the most prevalent (or most frequent) class come first, the rules for the next prevalent class come next, and so on.
n Within each class, the rules are not ordered—they don’t have to be because they all predict the same class (and so there can be no class conflict!).

Using IF-THEN Rules for Classification
n Rule-based Ordering:
n The rules are organized into one long priority list, according to some measure of rule quality, such as accuracy, coverage, or size (number of attribute tests in the rule antecedent), or based on advice from domain experts.
n When rule ordering is used, the rule set is known as a decision list.
n With rule ordering, the triggering rule that appears earliest in the list has the highest priority, and so it gets to fire its class prediction.
n Any other rule that satisfies X is ignored.
n Most rule-based classification systems use a class-based rule-ordering strategy.

Using IF-THEN Rules for Classification
n Another problematic scenario: No rule is satisfied by X. How can we determine the class label of X?
n In this case, a fallback or default rule can be set up to specify a default class, based on a training set.
n This may be the majority class overall or the majority class among the tuples that were not covered by any rule.
n The default rule is evaluated at the end, if and only if no other rule covers X.
n The condition in the default rule is empty.
n In this way, the default rule fires only when no other rule is satisfied.
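
A decision list with a default rule can be sketched in Python as below. This is a minimal sketch under stated assumptions: the rule representation, the tiny training set, and the function names (`satisfies`, `default_class`, `classify`) are hypothetical and chosen only to illustrate the idea. The default class is taken here as the majority class among uncovered training tuples, one of the two options mentioned above.

```python
from collections import Counter

def satisfies(antecedent, x):
    """True if every attribute test in the antecedent holds for tuple x."""
    return all(x.get(a) == v for a, v in antecedent.items())

def default_class(rules, training):
    """Majority class among training tuples not covered by any rule
    (falls back to the overall majority class if every tuple is covered)."""
    uncovered = [label for x, label in training
                 if not any(satisfies(ante, x) for ante, _ in rules)]
    pool = uncovered or [label for _, label in training]
    return Counter(pool).most_common(1)[0][0]

def classify(rules, x, default):
    """Decision list: the earliest triggered rule fires; otherwise the default rule."""
    for ante, cls in rules:
        if satisfies(ante, x):
            return cls
    return default  # empty condition: reached only when no rule covers x

# Hypothetical ordered rule list and training data.
rules = [
    ({"age": "youth", "student": "yes"}, "yes"),
    ({"age": "youth"}, "no"),
]
training = [
    ({"age": "youth", "student": "yes"}, "yes"),
    ({"age": "youth", "student": "no"}, "no"),
    ({"age": "senior", "student": "no"}, "yes"),
    ({"age": "senior", "student": "yes"}, "yes"),
    ({"age": "middle_aged", "student": "no"}, "no"),
]

default = default_class(rules, training)
print(classify(rules, {"age": "youth", "student": "yes"}, default))
print(classify(rules, {"age": "senior", "student": "no"}, default))
```

The first call is resolved by the first rule even though the second rule is also triggered. The second call is covered by no rule, so the default class is returned.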

Rule Extraction from a Decision Tree
n The decision tree classifier is a popular classification method.
n It is easy to understand how decision trees work, and they are known for their accuracy.
n However, decision trees can become large and difficult to interpret.
n In this subsection, we look at how to build a rule-based classifier by extracting IF-THEN rules from a decision tree.
n Compared to a decision tree, the IF-THEN rules may be easier for humans to understand, particularly if the decision tree is very large.

Rule Extraction from a Decision Tree
n To extract rules from a decision tree, one rule is created for each path from the root to a leaf node.
n The splitting criteria along a given path are logically ANDed to form the rule antecedent (“IF” part).
n The leaf node holds the class prediction, forming the rule consequent (“THEN” part).

Rule Extraction from a Decision Tree
n Example: Rule extraction from our buys_computer decision tree
IF age = young AND student = no THEN buys_computer = no
IF age = young AND student = yes THEN buys_computer = yes
IF age = mid-age THEN buys_computer = yes
IF age = old AND credit_rating = excellent THEN buys_computer = no
IF age = old AND credit_rating = fair THEN buys_computer = yes
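
The extraction procedure (one rule per root-to-leaf path) can be sketched in Python as follows. The nested-dictionary tree below is an assumption: it is inferred from the five rules in the example, since the tree itself is not shown here, and the `attr`/`branches` layout and the name `extract_rules` are hypothetical.

```python
# Internal node: {"attr": name, "branches": {value: child}}; leaf: class label string.
TREE = {
    "attr": "age",
    "branches": {
        "young": {"attr": "student", "branches": {"no": "no", "yes": "yes"}},
        "mid-age": "yes",
        "old": {"attr": "credit_rating",
                "branches": {"excellent": "no", "fair": "yes"}},
    },
}

def extract_rules(node, path=()):
    """Yield one (antecedent, class) pair per root-to-leaf path."""
    if not isinstance(node, dict):   # leaf: holds the class prediction (THEN part)
        yield list(path), node
        return
    for value, child in node["branches"].items():
        # Each split along the path adds one test to the antecedent (IF part).
        yield from extract_rules(child, path + ((node["attr"], value),))

for tests, label in extract_rules(TREE):
    condition = " AND ".join(f"{attr} = {value}" for attr, value in tests)
    print(f"IF {condition} THEN buys_computer = {label}")
```

Each leaf produces exactly one rule, so the number of rules equals the number of leaves in the tree.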

Rule Extraction from a Decision Tree
n Note that the rules extracted directly from the tree in this manner are mutually exclusive and exhaustive.
n Mutually exclusive means that we do not have rule conflicts here because no two rules will be triggered by the same tuple. (We have one rule per leaf, and any tuple can map to only one leaf.)
n Exhaustive means there is one rule for each possible attribute-value combination.
n Therefore, this set of rules does not require a default rule.
n Consequently, the order of the rules does not matter—they are unordered.
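
Both properties can be checked by brute force for the five rules extracted from the buys_computer tree. The sketch below is illustrative: the attribute domains are assumed from the values that appear in the rules, and the names `DOMAINS` and `triggered_count` are hypothetical.

```python
from itertools import product

RULES = [
    ({"age": "young", "student": "no"}, "no"),
    ({"age": "young", "student": "yes"}, "yes"),
    ({"age": "mid-age"}, "yes"),
    ({"age": "old", "credit_rating": "excellent"}, "no"),
    ({"age": "old", "credit_rating": "fair"}, "yes"),
]

DOMAINS = {  # attribute values assumed from the rules above
    "age": ["young", "mid-age", "old"],
    "student": ["no", "yes"],
    "credit_rating": ["excellent", "fair"],
}

def triggered_count(x):
    """Number of rules whose antecedent is satisfied by tuple x."""
    return sum(all(x[a] == v for a, v in ante.items()) for ante, _ in RULES)

attrs = list(DOMAINS)
counts = [triggered_count(dict(zip(attrs, values)))
          for values in product(*DOMAINS.values())]

mutually_exclusive = all(c <= 1 for c in counts)  # no tuple triggers two rules
exhaustive = all(c >= 1 for c in counts)          # every tuple triggers some rule
print(mutually_exclusive, exhaustive)
```

Every attribute-value combination triggers exactly one rule, which is why neither a conflict resolution strategy nor a default rule is needed for this rule set.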

Rule Extraction from a Decision Tree
n Since we end up with one rule per leaf, the set of extracted rules is not much simpler than the corresponding decision tree!
n The extracted rules may be even more difficult to interpret than the original trees in some cases.
n Although it is easy to extract rules from a decision tree, we may need to prune the resulting rule set to make the rule-based classifier simpler.
n C4.5 extracts rules from an unpruned tree, and then prunes the rules. The details are not covered here due to time limitations.

Metrics for Classifier Performance Evaluation
n We will study the metrics used to assess how good or how “accurate” your classifier is at predicting the class label of tuples.
n For simplicity, we focus on problems involving two classes.
n Before we discuss the various measures, we need to become comfortable with some terminology.
n Positive tuples: The tuples belonging to the class of interest.
n Negative tuples: All other tuples.
n Given two classes, for example, the positive tuples may be those corresponding to buys_computer = yes while the negative tuples are those corresponding to buys_computer = no.

Metrics for Classifier Performance Evaluation
n Suppose we apply a classifier to a testing data set.
n The tuples in the testing data set fall into two classes (i.e., the positive-tuple class and the negative-tuple class), which gives the following two numbers:
n P is the number of positive tuples.
n N is the number of negative tuples.
n Thereafter, for each tuple, we compare the classifier’s class label prediction to the tuple’s known class label.

Metrics for Classifier Performance Evaluation
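n The following four terms are the “building blocks” used by many evaluation measures.
n True positives (TP): positive tuples that were correctly labeled as positive by the classifier.
n True negatives (TN): negative tuples that were correctly labeled as negative by the classifier.
n False positives (FP): negative tuples that were incorrectly labeled as positive.
n False negatives (FN): positive tuples that were incorrectly labeled as negative.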

Metrics for Classifier Performance Evaluation
n The confusion matrix is a useful tool for analyzing how well your classifier can recognize tuples.
n TP and TN tell us when the classifier is getting things right.
n FP and FN tell us when the classifier is getting things wrong.
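
The four counts can be obtained by comparing each predicted label with the known label. The Python sketch below assumes the usual definitions (TP and TN are correctly labeled positive and negative tuples, FP and FN are mislabeled negative and positive tuples). The label lists and the function name `confusion_counts` are hypothetical.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Return (TP, TN, FP, FN) for a two-class problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1   # positive tuple labeled positive
            else:
                fn += 1   # positive tuple labeled negative
        else:
            if p == positive:
                fp += 1   # negative tuple labeled positive
            else:
                tn += 1   # negative tuple labeled negative
    return tp, tn, fp, fn

actual    = ["yes", "yes", "yes", "no", "no", "no"]   # hypothetical known labels
predicted = ["yes", "yes", "no",  "no", "yes", "no"]  # hypothetical predictions

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(f"TP={tp} TN={tn} FP={fp} FN={fn}  P={tp + fn} N={tn + fp}")
```

The identities P = TP + FN and N = TN + FP hold by construction, which is a useful sanity check on any confusion matrix.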

Metrics for Classifier Performance Evaluation
n Here is the example confusion matrix for the computer shopping prediction problem:

Metrics for Classifier Performance Evaluation
n There are a few other metrics that can be used to evaluate the performance of classifiers. We focus on accuracy in this course.
n The accuracy of a classifier on a given test set is the percentage of test set tuples that are correctly classified by the classifier.
n In the previous example, accuracy is equal to:
accuracy = (6954 + 2588)/(7000 + 3000) = 95.42%
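
The same computation as a short Python sketch. It assumes that 6954 and 2588 are TP and TN and that 7000 and 3000 are P and N, as the accuracy expression suggests. The FN and FP values are derived from those numbers rather than taken from the text.

```python
TP, TN = 6954, 2588   # correctly classified positive and negative tuples
P, N = 7000, 3000     # positive and negative tuples in the test set

FN = P - TP           # positive tuples the classifier labeled negative
FP = N - TN           # negative tuples the classifier labeled positive

accuracy = (TP + TN) / (P + N)
print(f"FN = {FN}, FP = {FP}, accuracy = {accuracy:.2%}")
```

Accuracy uses only the diagonal of the confusion matrix (TP and TN) divided by the total number of test tuples.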

Summary
n Classification is a form of data analysis that generates models describing important data classes.
n Effective and scalable methods have been developed for decision tree induction, Naive Bayesian classification, and rule-based classification.
n There are several evaluation metrics for classifiers; we focus on accuracy in this course.
n No single method has been found to be superior to all others for all data sets.
