
Neural Networks

Neurons

Input: x ∈ R^d (d: input dimension; m: output dimension).
Each input component x_j is multiplied by a weight w_j, a bias w_0 is added, and the resulting pre-activation z is passed through an activation function f:

a = f(z) = f(Σ_{j=1}^{d} x_j w_j + w_0)

activation a ∈ R
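The neuron above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the slides; the function name `neuron` and the choice of tanh as the default activation are my own.

```python
import numpy as np

def neuron(x, w, w0, f=np.tanh):
    """Single neuron: a = f(z), where z = sum_j x_j * w_j + w0."""
    z = np.dot(w, x) + w0   # pre-activation z (a scalar)
    return f(z)             # activation a in R

x = np.array([1.0, 2.0, 3.0])    # input x in R^d, here d = 3
w = np.array([0.5, -0.25, 0.1])  # weights w_1, ..., w_d
a = neuron(x, w, w0=0.3)         # a = tanh(0.5 - 0.5 + 0.3 + 0.3) = tanh(0.6)
```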

Layers: Fully Connected

#outputs = #units = m
W: d × m matrix of weights
W_0: m × 1 bias vector

A = f(Z) = f(W^T X + W_0)

f: R → R (example: f(x) = x), applied elementwise:

f([z_1, …, z_m]^T) = [f(z_1), …, f(z_m)]^T = A
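A fully connected layer as defined above can be sketched as follows. The function name `fc_layer` and the random initialization are illustrative assumptions, not from the slides; the default activation is the identity, matching the slide's example f(x) = x.

```python
import numpy as np

def fc_layer(X, W, W0, f=lambda z: z):
    """Fully connected layer: A = f(W^T X + W0), f applied elementwise."""
    Z = W.T @ X + W0   # Z: m x 1 vector of pre-activations
    return f(Z)        # A: m x 1 vector of activations

d, m = 3, 2
rng = np.random.default_rng(0)
W = rng.standard_normal((d, m))  # W: d x m weight matrix
W0 = np.zeros((m, 1))            # W0: m x 1 bias vector
X = np.ones((d, 1))              # X: d x 1 input
A = fc_layer(X, W, W0)           # identity activation, so A = W^T X + W0
```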

Multiple layers

X = A^0
Z^l = (W^l)^T A^{l-1} + W^l_0,  A^l = f^l(Z^l),  for l = 1, …, L; the final output A^L is fed into the Loss.
Layer l maps a d^(l) × 1 input to an m^(l) × 1 output, so d^(l) = m^(l-1).
L = # layers
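The chain of layers above can be sketched as a loop: each layer applies its own weights, bias, and activation to the previous layer's output. The setup (two layers, tanh then identity) is an illustrative assumption, not from the slides.

```python
import numpy as np

def forward(X, weights, biases, activations):
    """Forward pass: A^0 = X, then A^l = f^l((W^l)^T A^{l-1} + W^l_0)."""
    A = X
    for W, W0, f in zip(weights, biases, activations):
        A = f(W.T @ A + W0)   # one fully connected layer
    return A                  # A^L, the network output

rng = np.random.default_rng(1)
dims = [4, 3, 2]   # d^(1) = 4, m^(1) = d^(2) = 3, m^(2) = 2
weights = [rng.standard_normal((dims[i], dims[i + 1])) for i in range(2)]
biases = [np.zeros((dims[i + 1], 1)) for i in range(2)]
out = forward(np.ones((4, 1)), weights, biases, [np.tanh, lambda z: z])
```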

Activation Functions
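The plots from this slide are not reproduced in the transcript. Three commonly used activation functions (sigmoid, tanh, ReLU) can be sketched as:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # squashes R into (0, 1)

def tanh(z):
    return np.tanh(z)                # squashes R into (-1, 1)

def relu(z):
    return np.maximum(0.0, z)        # max(0, z), elementwise

z = np.array([-2.0, 0.0, 2.0])
```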

Training Neural Networks
•Supervisedlearning:𝒟𝑛 = 𝑥1 ,𝑦1 ,…, 𝑥𝑛 ,𝑦𝑛 ; • Loss:minσ𝑛 𝐿𝑜𝑠𝑠(𝑁𝑁 𝑥 𝑖 ;𝑊 ,𝑦 𝑖 )
𝑤 𝑖=1 •Need:∇𝑊𝐿𝑜𝑠𝑠(𝑁𝑁 𝑋;𝑊 ,𝑦)
•➔error back-propagation («back-prop») == computing gradients • See the linked YouTube video series on the VLE!
113
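The training objective above can be sketched with gradient descent on the simplest possible case: a one-layer linear "network" NN(x; W) = w·x with squared loss, where the gradient is derived by hand rather than via back-prop through multiple layers. The data, step size, and iteration count are illustrative assumptions, not from the slides.

```python
import numpy as np

# Gradient descent on (1/n) * sum_i Loss(NN(x^(i); w), y^(i))
# with NN(x; w) = w . x and squared loss.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))   # n = 50 examples, d = 3
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                     # targets generated by the true weights

w = np.zeros(3)                    # initial weights
lr = 0.05                          # step size
for _ in range(200):
    pred = X @ w                          # forward pass: NN(x; w) for all x
    grad = 2 * X.T @ (pred - y) / len(y)  # gradient of the mean squared loss
    w -= lr * grad                        # gradient descent step

# After training, w should be close to w_true.
```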