NEURAL NETWORKS Applied Analytics: Frameworks and Methods 2

Outline
■ Introduction to Neural Networks
■ Artificial Neuron
■ Multiple Layer Neural Networks
■ Network Architecture
■ Illustration of Neural Networks on MNIST
■ Types of Networks
■ Applications
■ Using Deep Learning at Scale

Deep Learning
■ Artificial Neural Networks, conceived in the 1950s as a crude approximation of how the human brain works, are the basis of what is now referred to as Deep Learning.
■ Artificial Neural Networks used to address machine learning problems are referred to as Deep Learning. Since our goal is to apply neural networks to solve machine learning problems, we will use the terms Deep Learning and Neural Networks interchangeably.
■ Deep Learning is a form of Artificial Intelligence that uses a type of machine learning called an artificial neural network. With multiple hidden layers, such a network learns hierarchical representations of the underlying data in order to make predictions on new data.

Deep Learning is a Form of Artificial Intelligence

Deep Learning is a Type of Machine Learning
Source: MachineLearningMastery.com

Evaluation of Deep Learning

Pros
■ Excels at tasks such as computer vision, natural language processing, and speech recognition
■ Powers many recommender systems and fraud detection systems
■ Algorithms are very general and adaptive
■ Works directly on raw data; little or no manual feature engineering is required

Cons
■ Computationally resource intensive
■ Tricky hyper-parameter tuning
■ Methods are not guaranteed to find an optimal solution

Artificial Neural Network
■ Inspired by biological neurons, but they work very differently
■ Consists of a series of neurons connected together in a network
■ Let's first examine an Artificial Neuron.

Artificial Neuron
■ Input (x1, x2, x3)
■ Weights or parameters (ω1, ω2, ω3)
■ Bias (ω0)
■ Neuron
■ Output (y)
■ Activation function (e.g., tanh)

Artificial Neuron
[Diagram, built up across several slides: inputs x1, x2, x3 enter with weights ω1, ω2, ω3; a bias ω0 is added; the neuron computes the weighted sum Σ and applies an activation function f to produce the output y]

y = f(ω0 + ω1·x1 + ω2·x2 + ω3·x3)
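In base R, this computation is a one-liner. A minimal sketch with made-up input, weight, and bias values:

```r
# Made-up inputs, weights, and bias for illustration
x  <- c(0.5, -1.2, 0.3)   # inputs x1, x2, x3
w  <- c(0.8, 0.1, -0.4)   # weights w1, w2, w3
w0 <- 0.2                 # bias

# Weighted sum plus bias, passed through a tanh activation
y <- tanh(w0 + sum(w * x))
y
```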

Artificial Neural Network
■ Network of connected neurons
■ Includes
– Input Layer
– One or more Hidden Layer(s)
– Output Layer

Artificial Neural Network
[Diagram: an input layer, one hidden layer, and an output layer of connected neurons]

Artificial Neural Network
■ In practice, each layer can be a
– vector (one-dimensional)
– matrix (two-dimensional array)
– tensor (n-dimensional array)
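For example, in R (the dimensions below are arbitrary; 28 × 28 anticipates the MNIST images discussed shortly):

```r
v   <- numeric(784)                    # vector: one-dimensional
m   <- matrix(0, nrow = 28, ncol = 28) # matrix: two-dimensional array
arr <- array(0, dim = c(28, 28, 3))    # tensor: n-dimensional array
dim(m)    # 28 28
dim(arr)  # 28 28 3
```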

Deep Learning Neural Network
■ A neural network with more than one hidden layer
■ Adding more hidden layers enables the network to model progressively more complex functions

Deep Learning Neural Network
Source: Communications of the ACM

Hierarchical Representations
[Figure: each successive layer learns features of greater abstraction]
Source: Data Science Central

MNIST
■ A large database of handwritten digits that is commonly used for training various image-processing systems
■ 42,000 images
■ Each image is a 28 × 28 grid of gray-scale pixels
■ Thus, each image has 784 pixel descriptors
Source: Wikipedia
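A minimal sketch of handling one MNIST image in R, assuming `pixels` is a numeric vector of 784 gray-scale values (simulated here as a stand-in):

```r
# Simulated stand-in for one image's 784 gray-scale pixel values
pixels <- runif(784, min = 0, max = 255)

# Reshape the flat 784-element vector back into its 28 x 28 grid
img <- matrix(pixels, nrow = 28, ncol = 28)
dim(img)   # 28 28

# Display the image (rows reversed so the digit appears upright)
image(t(img[nrow(img):1, ]), col = gray.colors(256))
```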

MNIST
Basic Neural Network
■ MNIST Data
■ Inputs: 784
■ Hidden Layer(s): 1
■ Hidden Units or Neurons: 5

Activation Functions
■ Linear
■ Logistic (sigmoid)
■ Hyperbolic Tangent (tanh)
■ Rectified Linear Unit (ReLU)
■ …
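Each activation function maps the neuron's weighted input to its output. A minimal sketch in base R (tanh is built in):

```r
linear  <- function(z) z                   # identity: output equals input
sigmoid <- function(z) 1 / (1 + exp(-z))   # logistic: squashes to (0, 1)
relu    <- function(z) pmax(0, z)          # ReLU: zero for negative inputs

z <- c(-2, -0.5, 0, 0.5, 2)
linear(z)
sigmoid(z)
tanh(z)    # hyperbolic tangent: squashes to (-1, 1)
relu(z)
```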

MNIST
Multi-Layer Neural Network
■ MNIST Data
■ Inputs: 784
■ Hidden Layer(s): 2
■ Hidden Units or Neurons: 5 in each layer
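As a hedged sketch, such a network could be fit with the h2o package (listed among the R packages later in this module). The file name and column layout below are assumptions, not part of the original material:

```r
library(h2o)
h2o.init()

# Assumes a CSV with a "label" column (the digit) and 784 pixel columns;
# the file name is hypothetical
mnist <- h2o.importFile("mnist_train.csv")
mnist$label <- as.factor(mnist$label)

fit <- h2o.deeplearning(
  x = setdiff(colnames(mnist), "label"),  # the 784 pixel inputs
  y = "label",
  training_frame = mnist,
  hidden = c(5, 5),                       # two hidden layers, 5 neurons each
  activation = "Tanh",
  epochs = 10
)
```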

Mechanics of Neural Networks
■ Specify Neural Network hyper-parameters
– Number and nature of Inputs
– Number of hidden layers
– Number of neurons per hidden layer
– Activation function (tanh, softmax, sigmoid)
■ Determine parameter weights by minimizing a loss function (e.g., mean squared error, cross-entropy, hinge) using stochastic gradient descent
– Gradient descent is computationally demanding for large datasets. Stochastic gradient descent and mini-batch gradient descent instead search for minima using small batches of the training set rather than the entire training sample (see the sketch below).
■ Specify regularization terms (L1 and L2) to prevent overfitting
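To make the optimization step concrete, here is a minimal sketch of mini-batch stochastic gradient descent in base R, fitting a single linear neuron by minimizing mean squared error on simulated data:

```r
set.seed(1)

# Simulated data: y depends linearly on two inputs plus noise
n <- 1000
X <- cbind(1, matrix(rnorm(n * 2), ncol = 2))  # leading 1s column absorbs the bias
y <- X %*% c(0.5, 2, -1) + rnorm(n, sd = 0.1)

w  <- rep(0, 3)     # initial weights (bias, w1, w2)
lr <- 0.05          # learning rate
batch_size <- 32

for (epoch in 1:20) {
  idx <- sample(n)  # shuffle the training set each epoch
  for (start in seq(1, n, by = batch_size)) {
    b     <- idx[start:min(start + batch_size - 1, n)]
    Xb    <- X[b, , drop = FALSE]
    resid <- Xb %*% w - y[b]                   # prediction error on the mini-batch
    grad  <- 2 * t(Xb) %*% resid / length(b)   # gradient of mean squared error
    w     <- w - lr * grad                     # descend along the gradient
  }
}
round(as.vector(w), 2)   # close to the true weights c(0.5, 2, -1)
```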

Network Architectures
■ Various network architectures are possible by changing the
– Number and nature of Inputs
– Number of hidden layers
– Number of neurons per hidden layer
– Activation function
■ Adding more hidden layers makes the network deep
■ Adding more neurons per layer makes the network wide

Network Architecture
■ There is no solid theoretical basis for determining the best network architecture
■ Often a trial-and-error process
■ Some approaches include
– Using a network architecture for a similar problem
– Progressively increasing complexity by adding more hidden units and layers until performance improvements become asymptotic
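A sketch of the second approach using the nnet package (which supports a single hidden layer): grow the number of hidden units and watch validation accuracy level off. The iris data and the split are purely illustrative:

```r
library(nnet)
set.seed(1)

# Illustrative data: built-in iris set with a simple train/validation split
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
valid <- iris[-train_idx, ]

# Progressively widen the hidden layer until accuracy levels off
for (size in c(1, 2, 4, 8, 16)) {
  fit <- nnet(Species ~ ., data = train, size = size,
              decay = 0.01, maxit = 200, trace = FALSE)
  acc <- mean(predict(fit, valid, type = "class") == valid$Species)
  cat(sprintf("hidden units: %2d  validation accuracy: %.3f\n", size, acc))
}
```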

Tuning Hyper-Parameters
■ The term Hyper-Parameters is used to distinguish them from standard model parameters
– They define higher level concepts about the model like complexity or capacity to learn
– They cannot be learned directly from the data in the standard model training process and need to be predefined
– They are critical to model performance

Tuning Hyper-Parameters
■ There are many hyper-parameters to tune
– Number of hidden layers
– Number of hidden units
– Number of training iterations
– Learning rate
– Regularization

Tuning Hyper-Parameters
■ Two approaches to tuning
– Manual:
■ Quite a common approach
■ Without domain knowledge or experience with similar problems, this can take a long time
– Automatic:
■ Evaluating each hyper-parameter setting means training the model from scratch with stochastic gradient descent. This is computationally expensive and only practical for small models and datasets. Two common approaches (sketched below):
– Grid Search
– Random Search
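A minimal sketch of random search with nnet, drawing the hidden-layer size and the weight-decay (regularization) strength at random for each candidate model; the data and sampling ranges are illustrative:

```r
library(nnet)
set.seed(2)

# Illustrative data and split (as in the previous sketch)
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
valid <- iris[-train_idx, ]

best_acc <- 0
best <- NULL
for (i in 1:20) {
  # Draw a random hyper-parameter combination
  size  <- sample(1:16, 1)                    # hidden units
  decay <- 10 ^ runif(1, min = -4, max = 0)   # L2-style weight decay

  fit <- nnet(Species ~ ., data = train, size = size,
              decay = decay, maxit = 200, trace = FALSE)
  acc <- mean(predict(fit, valid, type = "class") == valid$Species)

  if (acc > best_acc) {
    best_acc <- acc
    best <- list(size = size, decay = decay)
  }
}
best
best_acc
```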

MNIST
Random Search Neural Network
■ MNIST Data
■ Use Random Search
■ Number of values explored for each hyper-parameter:
– Activation Function: 6
– Network architectures: 6
– L1: 101
– L2: 101

Deep Learning in R
■ R Packages
– nnet
– neuralnet
– RSNNS
– deepnet
– darch
– caret
– RNN
– Autoencoder
– RcppDL
– h2o
– MXNetR
– …
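As a quick illustration of one of these packages, a minimal sketch using neuralnet to learn XOR from simulated data:

```r
library(neuralnet)
set.seed(3)

# Simulated XOR data: y = 1 when exactly one input is 1
d <- data.frame(x1 = rbinom(200, 1, 0.5), x2 = rbinom(200, 1, 0.5))
d$y <- as.numeric(xor(d$x1, d$x2))

fit <- neuralnet(y ~ x1 + x2, data = d,
                 hidden = c(4, 2),          # two hidden layers
                 act.fct = "logistic",      # outputs stay in (0, 1)
                 linear.output = FALSE)

# Predicted probabilities for the four input combinations
predict(fit, data.frame(x1 = c(0, 0, 1, 1), x2 = c(0, 1, 0, 1)))
```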

Other General Frameworks
– Caffe
– Tensorflow
– Theano
– Torch
– Deeplearning4j
– CNTK

TYPES OF NEURAL NETWORKS

Types
■ Fully Connected Networks
■ Convolutional Networks
■ Recurrent Networks
■ Generative Adversarial Networks
■ Deep Reinforcement Learning

Fully Connected Feed-Forward Network
■ Fully Connected: Each neuron is connected to every neuron in the subsequent layer
■ Feed-Forward: Neurons are only connected to neurons in a subsequent layer; there are no feedback loops.
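A minimal sketch of one forward pass through a small fully connected feed-forward network in base R; each layer is a matrix multiplication plus bias followed by an activation, and all weights here are random placeholders:

```r
set.seed(4)

x  <- rnorm(4)                     # 4 inputs
W1 <- matrix(rnorm(4 * 3), 4, 3)   # weights: inputs -> 3 hidden neurons
b1 <- rnorm(3)                     # hidden-layer biases
W2 <- matrix(rnorm(3 * 2), 3, 2)   # weights: hidden layer -> 2 outputs
b2 <- rnorm(2)                     # output biases

# Fully connected: every input reaches every hidden neuron via W1
h <- tanh(x %*% W1 + b1)           # hidden activations
y <- h %*% W2 + b2                 # outputs (linear activation here)
y
```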

Fully Connected Feed-Forward Network

Convolutional Neural Network
■ Adding hidden layers and neurons to a fully connected network rapidly increases the number of weights, and therefore the system resources required

Convolutional Neural Network

Convolutional Neural Network
■ Convolution
– A technique that extracts visual features from an image in small chunks. Each neuron in a convolution layer is responsible for a small cluster of neurons in the preceding layer.
■ Filter
– The bounding box that defines the cluster of neurons; also known as a kernel.
– Filters help identify certain features of an image, for example by sharpening the image or detecting edges.
■ Pooling (also known as subsampling or downsampling)
– Reduces the number of neurons in the previous layer while still retaining the most important information (see the sketch below)
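A minimal sketch of one convolution followed by 2 × 2 max pooling in base R, on a made-up 6 × 6 image with a 3 × 3 horizontal-edge kernel:

```r
img <- matrix(runif(36), 6, 6)   # made-up 6 x 6 gray-scale image

# 3 x 3 filter (kernel); this one responds to horizontal edges
k <- matrix(c(-1, -1, -1,
               0,  0,  0,
               1,  1,  1), 3, 3, byrow = TRUE)

# Convolution: slide the filter over the image, taking a weighted sum each time
conv <- matrix(0, 4, 4)          # (6 - 3 + 1) = 4 valid positions per axis
for (i in 1:4) {
  for (j in 1:4) {
    conv[i, j] <- sum(img[i:(i + 2), j:(j + 2)] * k)
  }
}

# Max pooling: keep the largest value in each 2 x 2 block, halving each dimension
pool <- matrix(0, 2, 2)
for (i in 1:2) {
  for (j in 1:2) {
    r <- (2 * i - 1):(2 * i)
    c <- (2 * j - 1):(2 * j)
    pool[i, j] <- max(conv[r, c])
  }
}
pool
```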

Convolutional Neural Networks
■ Find application in
– Image recognition
– Image processing
– Image segmentation
– Video analysis
– Language processing

Recurrent Neural Network

Generative Adversarial Network (GAN)

Deep Reinforcement Learning

APPLICATIONS

Sentence Completion

Translation

Summarizing

Auto-Tagging

Summarizing Pictures

Filling in Pictures

Completing Pictures

Do you recognize them?

Editing Audio

Sign Language

Lip Reading

USING DEEP LEARNING WITH LARGE DATASETS

Using Deep Learning with Large Datasets
■ Deep Learning Services
■ Deep Learning Platforms
■ Deep Learning Frameworks

Deep Learning as a Service
■ Training data provided by the service
■ Trained model provided by the service
■ Hosted on the service's server

Deep Learning Platforms
■ You provide the data
■ Training and the model are handled by the platform
■ Hosted on the platform's server

Deep Learning Frameworks
■ Your own data
■ Your own algorithm
■ You host it yourself

Summary
■ This module addressed the following topics
– Introduction to Neural Networks
– Artificial Neuron
– Multiple Layer Neural Networks
– Network Architecture
– Illustration of Neural Networks on MNIST
– Types of Networks
– Applications
– Using Deep Learning at Scale