Machine Learning 1 TU Berlin, WiSe 2020/21
Fisher Linear Discriminant
In this exercise, we apply the Fisher Linear Discriminant, as described in Chapter 3.8.2 of Duda et al., to the UCI Abalone dataset. A description of the dataset is given at https://archive.ics.uci.edu/ml/datasets/Abalone. The following two methods are provided for your convenience:
• utils.Abalone.__init__(self) reads the Abalone data and instantiates two data matrices corresponding to: infant (I), non-infant (N).
• utils.Abalone.plot(self,w) produces a histogram of the data when projected onto a vector w, and where each class is shown in a different color.
Sample code that makes use of these two methods is given below. It loads the data, prints the shapes of the instantiated matrices, and plots the projection onto the first dimension of the data, which represents the length of the abalone.
In [1]: %matplotlib inline
import utils,numpy
# Load the data
abalone = utils.Abalone()
# Print dataset size for each class
print(abalone.I.shape, abalone.N.shape)
# Project data on the first dimension
w1 = numpy.array([1,0,0,0,0,0,0])
abalone.plot(w1,'projection on the first dimension (length)')
((1342, 7), (2835, 7))
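To make the projection step concrete, here is a minimal sketch of what projecting both classes onto a vector w and binning the result could look like. It does not reproduce utils.Abalone.plot, whose internals are not shown in this sheet. The matrices I and N below are synthetic stand-ins (an assumption), with samples in rows and 7 features in columns as suggested by the printed shapes.

```python
import numpy

# Synthetic stand-ins for the two class matrices (assumption: rows are
# samples, columns are the 7 features; these are NOT the Abalone data).
rng = numpy.random.RandomState(0)
I = rng.normal(0.4, 0.1, size=(50, 7))
N = rng.normal(0.6, 0.1, size=(80, 7))

# Canonical coordinate vector selecting the first feature
w1 = numpy.array([1, 0, 0, 0, 0, 0, 0])

# Projecting onto w is a matrix-vector product: one scalar per sample
zI = I.dot(w1)
zN = N.dot(w1)

# Shared bin edges, so that the two class histograms are comparable
lo = min(zI.min(), zN.min())
hi = max(zI.max(), zN.max())
edges = numpy.linspace(lo, hi, 21)
hI, _ = numpy.histogram(zI, bins=edges)
hN, _ = numpy.histogram(zN, bins=edges)

print(zI.shape, zN.shape)
```

With a canonical coordinate vector, the projection simply picks out one column of the data, so zI equals I[:, 0] here.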
Implementation (10 + 5 + 5 = 20 P)
• Create a function w = fisher(X1,X2) that takes as input the data for two classes and returns the Fisher linear discriminant.
• Create a function objective(X1,X2,w) that evaluates the objective defined in Equation 96 of Duda et al. for an arbitrary projection vector w.
• Create a function z = expand(X) that returns the quadratic expansion $\phi(x)$ of each data point x in the dataset. The expansion consists of the vector x itself, to which we concatenate the vector of all pairwise products between elements of x. In other words, letting $x = (x_1, \dots, x_d)$ denote the d-dimensional data point, the quadratic expansion of this data point is a $d \cdot (d+3)/2$-dimensional vector given by $\phi(x) = (x_i)_{1 \le i \le d} \cup (x_i x_j)_{1 \le i \le j \le d}$. For example, the quadratic expansion for $d = 2$ is $(x_1, x_2, x_1^2, x_2^2, x_1 x_2)$.
In [2]: def fisher(X1,X2):
            ##### Replace by your code
            import solutions
            return solutions.fisher(X1,X2)
            #####
        def objective(X1,X2,w):
            ##### Replace by your code
            import solutions
            return solutions.objective(X1,X2,w)
            #####
        def expand(X):
            ##### Replace by your code
            import solutions
            return solutions.expand(X)
            #####
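The three functions can be sketched with NumPy as follows. This is one possible sketch under stated assumptions, not the reference solution in the solutions module. It assumes that Equation 96 of Duda et al. is the usual Fisher criterion (squared difference of projected means divided by the sum of projected scatters) and that the discriminant is the closed-form direction proportional to $S_W^{-1}(m_1 - m_2)$. The synthetic arrays at the end are placeholders, not the Abalone data.

```python
import numpy

def fisher(X1, X2):
    # Class means, each of shape (d,)
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter matrix S_W = S_1 + S_2, shape (d, d)
    SW = (X1 - m1).T.dot(X1 - m1) + (X2 - m2).T.dot(X2 - m2)
    # Solve S_W w = m1 - m2 instead of forming the inverse explicitly
    w = numpy.linalg.solve(SW, m1 - m2)
    # The criterion is scale-invariant; normalizing is only a convention
    return w / numpy.linalg.norm(w)

def objective(X1, X2, w):
    # Assumed form of the criterion: (m~1 - m~2)^2 / (s~1^2 + s~2^2)
    z1, z2 = X1.dot(w), X2.dot(w)
    between = (z1.mean() - z2.mean()) ** 2
    within = ((z1 - z1.mean()) ** 2).sum() + ((z2 - z2.mean()) ** 2).sum()
    return between / within

def expand(X):
    d = X.shape[1]
    # All index pairs (i, j) with i <= j, in row-major order. The ordering
    # of the product terms differs from the d = 2 example in the text,
    # which does not affect the value of the criterion.
    i, j = numpy.triu_indices(d)
    return numpy.concatenate([X, X[:, i] * X[:, j]], axis=1)

# Synthetic two-class data for illustration only (assumption)
rng = numpy.random.RandomState(0)
X1 = rng.normal(0.0, 1.0, size=(200, 3))
X2 = rng.normal(0.5, 1.0, size=(300, 3))

w = fisher(X1, X2)
print(objective(X1, X2, w))
print(expand(X1).shape)  # (200, 9), since d(d+3)/2 = 9 for d = 3
```

On real data, the scatter matrix of the expanded features may be ill-conditioned. In that case a least-squares solver or a small regularization term could be needed, which is a design choice beyond what the exercise states.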
Analysis (5 + 5 = 10 P)
• Print the value of the objective function and the histogram for several values of w:
    • w is a canonical coordinate vector for the first feature (length).
    • w is the difference between the mean vectors of the two classes.
    • w is the Fisher linear discriminant.
    • w is the Fisher linear discriminant (after quadratic expansion of the data).
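As a reading aid for this comparison, and assuming the objective is the standard Fisher criterion written in terms of the class means $m_1, m_2$ and the within-class scatter matrix $S_W$ (notation introduced here, not taken from the sheet), the four choices of w can be partially ordered:

```latex
J(w) = \frac{\left(w^\top (m_1 - m_2)\right)^2}{w^\top S_W\, w},
\qquad
w_{\mathrm{Fisher}} \propto S_W^{-1}(m_1 - m_2) = \arg\max_{w} J(w)

\Longrightarrow\quad
J(w_{\mathrm{Fisher}}) \ge J(m_1 - m_2),
\qquad
J(w_{\mathrm{Fisher}}) \ge J(e_1)

J_{\phi}\!\left(w^{\phi}_{\mathrm{Fisher}}\right) \ge J(w_{\mathrm{Fisher}})
\quad\text{since any linear } w \text{ embeds as } (w, 0) \text{ in the expanded space.}
```

Here $e_1$ denotes the canonical coordinate vector for the first feature and $J_\phi$ the criterion evaluated on the expanded data. The criterion alone does not fix the order between $J(e_1)$ and $J(m_1 - m_2)$.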
In [3]: ##### REPLACE BY YOUR CODE
        %matplotlib inline
        import solutions
        solutions.analysis()
        #####
First dimension (length): 0.00048
Means Linear: 0.00050
Fisher: 0.00057
Fisher after expansion: 0.00077