Feature representation
Transforming data
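• $\varphi: \mathbb{R}^d \to \mathbb{R}^D$
• Originally: $h(x; \theta, \theta_0) = \operatorname{sign}(\theta^T x + \theta_0)$, with $\theta = (\theta_1, \ldots, \theta_d)$ and offset $\theta_0$
• $\theta_{new} = [\theta_1, \ldots, \theta_d, \theta_0]$
• $\varphi(x_1, \ldots, x_d) = (x_1, \ldots, x_d, 1) = x_{new}$
• New: $h(x; \theta_{new}) = \operatorname{sign}(\theta_{new}^T x_{new})$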
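A minimal sketch of this transformation, assuming NumPy: appending a constant 1 to the input turns the offset into an ordinary weight, so both classifiers give the same prediction. The function names and the numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def h_original(x, theta, theta_0):
    """Linear classifier with an explicit offset: sign(theta^T x + theta_0)."""
    return np.sign(theta @ x + theta_0)

def phi(x):
    """Map (x_1, ..., x_d) to (x_1, ..., x_d, 1)."""
    return np.append(x, 1.0)

def h_new(x, theta_new):
    """Classifier through the origin in the augmented space."""
    return np.sign(theta_new @ phi(x))

# Hypothetical parameters and input (not from the slides).
theta = np.array([2.0, -1.0])
theta_0 = 0.5
theta_new = np.append(theta, theta_0)  # [theta_1, ..., theta_d, theta_0]
x = np.array([1.0, 3.0])

print(h_original(x, theta, theta_0), h_new(x, theta_new))
```

Both calls evaluate the same quantity, because $\theta_{new}^T \varphi(x) = \theta^T x + \theta_0$.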
Example – 1D
[Figure: 1D example, axes $x_1$ and $x_2$]
Polynomial basis
• $\varphi(x) = [x, x^2]$, so $x_1 = x$ and $x_2 = x^2$
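A small sketch of why this helps, assuming NumPy. The data set, labels, and separator below are hypothetical: the points are labelled +1 when they lie far from the origin, which no single threshold on $x$ can reproduce, but a linear classifier on $[x, x^2]$ can.

```python
import numpy as np

def phi(x):
    """Map each scalar x to the feature vector [x, x^2]."""
    return np.stack([x, x**2], axis=-1)

# Hypothetical 1D data: positive when |x| >= 2, negative otherwise.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.where(np.abs(x) >= 2, 1, -1)

features = phi(x)  # shape (7, 2)

# Hand-picked separator in the transformed space: sign(x^2 - 2.5).
theta = np.array([0.0, 1.0])
theta_0 = -2.5
predictions = np.sign(features @ theta + theta_0).astype(int)

print(np.array_equal(predictions, y))
```

The separator is a straight line in the $(x_1, x_2)$ plane, even though the decision rule is not a single threshold on the original axis.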
Polynomial basis
• Parameter $k$ (order)

$k$   $d = 1$              general $d$
0     $[1]$                $[1]$
1     $[1, x]$             $[1, x_1, \ldots, x_d]$
2     $[1, x, x^2]$        $[1, x_1, \ldots, x_d, x_1^2, x_1 x_2, x_1 x_3, \ldots, x_2^2, \ldots, x_d^2, \ldots]$
3     $[1, x, x^2, x^3]$   …
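One possible way to generate the polynomial basis of order $k$ for any $d$, sketched with the Python standard library and NumPy. The function name `polynomial_basis` is hypothetical; it lists every monomial of total degree 0 to $k$, in the same order as the table.

```python
from itertools import combinations_with_replacement
from math import comb, prod

import numpy as np

def polynomial_basis(x, k):
    """Return all monomials of the entries of x with total degree <= k."""
    x = np.asarray(x, dtype=float)
    features = []
    for degree in range(k + 1):
        # Each index tuple (i, j, ...) stands for the product x_i * x_j * ...
        for idx in combinations_with_replacement(range(len(x)), degree):
            features.append(prod(x[i] for i in idx))  # empty product is 1
    return np.array(features, dtype=float)

print(polynomial_basis([2.0], 3))       # d = 1, k = 3
print(polynomial_basis([2.0, 3.0], 2))  # d = 2, k = 2
```

The number of features is $\binom{d + k}{k}$, so it grows quickly with both $d$ and $k$.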
Feature representation strategies
• Discrete data: #gears, car manufacturers, animals, …
• Numeric ($\mathbb{R}$)
• One-hot encoding: $m$ Booleans: [1,0,0,…,0], [0,1,0,…,0], …, [0,0,0,…,1]
• Factoring. E.g. blood types: A+, A-, B+, B-, AB+, AB-, O+, O-
  • => [A, B, AB, O], [+, -]
  • => [A, notA], [B, notB], [+, -]
• Text: bag of words (BoW), word2vec
• …
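A sketch of the one-hot and factored encodings in plain Python, using blood types as the example. The function names are hypothetical, the list of eight ABO/Rh types is an assumption, and each pair such as [A, notA] or [+, -] is stored here as a single Boolean.

```python
# Assumed full list of ABO/Rh blood types.
BLOOD_TYPES = ["A+", "A-", "B+", "B-", "AB+", "AB-", "O+", "O-"]

def one_hot(value, categories):
    """m Booleans, exactly one of which is 1."""
    return [1 if value == c else 0 for c in categories]

def factored(blood_type):
    """[A, B, AB, O] as a one-hot group, followed by one Boolean for +/-."""
    group, sign = blood_type[:-1], blood_type[-1]
    return one_hot(group, ["A", "B", "AB", "O"]) + [int(sign == "+")]

def factored_antigens(blood_type):
    """[has A, has B, is +]: three independent Booleans."""
    group, sign = blood_type[:-1], blood_type[-1]
    return [int("A" in group), int("B" in group), int(sign == "+")]

print(one_hot("B+", BLOOD_TYPES))
print(factored("AB-"))
print(factored_antigens("AB-"))
```

The factored forms use fewer features than the 8-way one-hot vector and let types that share a property (for example A+ and AB+) share a feature value.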
Standardising numerical features
• $\tilde{x}_i = \dfrac{x_i - \bar{x}_i}{\sigma_i}$, where $\bar{x}_i$ is the mean and $\sigma_i$ the standard deviation of feature $i$
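A minimal sketch of standardisation with NumPy, assuming the data is a matrix with one row per example and one column per feature. The function name and the numbers are hypothetical; the population standard deviation (NumPy's default) is used, and a constant feature would need separate handling because its standard deviation is zero.

```python
import numpy as np

def standardise(X):
    """Subtract each column's mean and divide by its standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)  # population standard deviation (ddof=0)
    return (X - mean) / std, mean, std

# Hypothetical data: two features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0]])

Z, mean, std = standardise(X)
print(Z.mean(axis=0), Z.std(axis=0))
```

After the transformation both columns are on the same scale. The returned `mean` and `std` can be reused to transform new data in the same way.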