
Neural Networks
1. Introduction Spring 2021
1

Logistics: By now you must have…
• Already watched lecture 0 (logistics)
– If not, do so at once
• Been to the course website
– http://deeplearning.cs.cmu.edu
– If you have not done so, please visit it at once
• Course objectives, logistics, quiz and homework policies, and grading policies have all been explained in both the logistics lecture and the course page
• Please familiarize yourself with this information at once
2

Logistics: Part 2
• You should already have
– Signed on to piazza
– Verified you have access to canvas and autolab
– Ensured you have AWS accounts setup
• And tested out Google Colab
• You have received a note on forming study groups
– We recommend this; you learn better in teams than you do by yourself
– Please sign up for the study groups immediately!!!!!!!!!!
3

Course philosophy
• No student left behind
• Please use the available resources
– TAs
– Study groups and TA mentors
– Dozens of office hours weekly
– Me (email me, or just walk into my office if I’m free)
– Your classmates and friends
• If under stress/unable to perform, please reach out
– To your TA mentor
– To me
– We will do our best to help you
• In our ideal world every student will earn an A
4

Attendance
• Sections A/B/M: We will use in-class polls to verify attendance
– Multiple polls posted at random times through the class
– Polls will be posted on piazza
• Please keep your piazza (and only your piazza) open
– You must respond to all polls
• We don’t score you on correctness, only on whether you responded
• Kigali/SV and students who have permission to view videos instead: Please watch mediatech videos
– We will gather your attendance from there
5

A minute for questions…
Caveat: Slide decks often have many “hidden” slides that will not be shown during the lecture, but will feature in your weekly quizzes
6

Neural Networks are taking over!
• Neural networks have become one of the main approaches to AI
• They have been successfully applied to various pattern recognition, prediction, and analysis problems
• In many problems they have established the state of the art
– Often exceeding previous benchmarks by large margins
– Sometimes solving problems you couldn’t solve using earlier ML methods
7

Breakthroughs with neural networks
8

Breakthrough with neural networks
9

Image segmentation and recognition
10

Image recognition
https://www.sighthound.com/technology/
11

Breakthroughs with neural networks
12

Success with neural networks
• Captions generated entirely by a neural network
13

Breakthroughs with neural networks
ThisPersonDoesNotExist.com uses AI to generate endless fake faces
– https://www.theverge.com/tldr/2019/2/15/18226005/ai-generated- fake-people-portraits-thispersondoesnotexist-stylegan
14

Successes with neural networks
• And a variety of other problems:
– From art to astronomy to healthcare..
– and even predicting stock markets!
15

So what are neural networks??
[Figure: “N.Net” boxes mapping a voice signal → transcription, an image → text caption, and a game state → next move]
• What are these boxes?
16

So what are neural networks??
• It begins with this..
17

So what are neural networks??
• Or even earlier.. with this..
“The Thinker!”
by Auguste Rodin
18

The magical capacity of humans
• Humans can
– Learn
– Solve problems
– Recognize patterns
– Create
– Cogitate
– …
• Worthy of emulation
• But how do humans “work“?
Dante!
19

Cognition and the brain..
• “If the brain was simple enough to be understood – we would be too simple to understand it!”
– Marvin Minsky
20

Early Models of Human Cognition
• Associationism
– Humans learn through association
• 400 BC-1900 AD: Plato, David Hume, Ivan Pavlov..
21

What are “Associations”
• Lightning is generally followed by thunder
– Ergo – “hey here’s a bolt of lightning, we’re going to hear
thunder”
– Ergo – “We just heard thunder; did someone get hit by lightning”?
• Association!
22

A little history : Associationism
• Collection of ideas stating a basic philosophy:
– “Pairs of thoughts become associated based on the organism’s
past experience”
– Learning is a mental process that forms associations between temporally related phenomena
• 360 BC: Aristotle
– “Hence, too, it is that we hunt through the mental train, excogitating from the present or some other, and from similar or contrary or coadjacent. Through this process reminiscence takes place. For the movements are, in these cases, sometimes at the same time, sometimes parts of the same whole, so that the subsequent movement is already more than half accomplished.“
• In English: we memorize and rationalize through association
23

Aristotle and Associationism
• Aristotle’s four laws of association:
– The law of contiguity. Things or events that occur
close together in space or time get linked together
– The law of frequency. The more often two things or events are linked, the more powerful that association.
– The law of similarity. If two things are similar, the thought of one will trigger the thought of the other
– The law of contrast. Seeing or recalling something may trigger the recollection of something opposite.
24

A little history : Associationism
• More recent associationists (up to 1800s): John Locke, David Hume, David Hartley, James Mill, John Stuart Mill, Alexander Bain, Ivan Pavlov
– Associationist theory of mental processes: there is only one mental process: the ability to associate ideas
– Associationist theory of learning: cause and effect, contiguity, resemblance
– Behaviorism (early 20th century) : Behavior is learned from repeated associations of actions with feedback
– Etc.
25

• But where are the associations stored?
• And how?
26

But how do we store them?
Dawn of Connectionism
David Hartley’s Observations on man (1749)
• We receive input through vibrations and those are transferred
to the brain
• Memories could also be small vibrations (called vibratiuncles) in the same regions
• Our brain represents compound or connected ideas by connecting our memories with our current senses
• The science of the day did not yet know about neurons
27

Observation: The Brain
• Mid 1800s: The brain is a mass of interconnected neurons
28

Brain: Interconnected Neurons
• Many neurons connect in to each neuron
• Each neuron connects out to many neurons
29

Enter Connectionism
• Alexander Bain, philosopher, psychologist, mathematician, logician, linguist, professor
• 1873: The information is in the connections
– Mind and body (1873)
30

Bain’s Idea 1: Neural Groupings
• Neurons excite and stimulate each other
• Different combinations of inputs can result in different outputs
31

Bain’s Idea 1: Neural Groupings
• Different intensities of activation of A lead to the differences in when X and Y are activated
• Even proposed a learning mechanism..
32

Bain’s Idea 2: Making Memories
• “when two impressions concur, or closely succeed one another, the nerve-currents find some bridge or place of continuity, better or worse, according to the abundance of nerve-matter available for the transition.”
• Predicts “Hebbian” learning (three quarters of a century before Hebb!)
33

Bain’s Doubts
• “The fundamental cause of the trouble is that in the modern world the stupid are cocksure while the intelligent are full of doubt.”
– Bertrand Russell
• In 1873, Bain postulated that there must be one million neurons and
5 billion connections relating to 200,000 “acquisitions”
• In 1883, Bain was concerned that he hadn’t taken into account the number of “partially formed associations” and the number of neurons responsible for recall/learning
• By the end of his life (1903), he had recanted all his ideas!
– Too complex; the brain would need too many neurons and connections
34

Connectionism lives on..
• The human brain is a connectionist machine
– Bain, A. (1873). Mind and body. The theories of their
relation. London: Henry King.
– Ferrier, D. (1876). The Functions of the Brain. London:
Smith, Elder and Co
• Neurons connect to other neurons. The processing/capacity of the brain is a function of these connections
• Connectionist machines emulate this structure
35

Connectionist Machines
• Network of processing elements
• All world knowledge is stored in the connections between the elements
36

Connectionist Machines
• Neural networks are connectionist machines
– As opposed to Von Neumann Machines
[Figure: a Von Neumann/Princeton machine (PROGRAM and DATA feeding a PROCESSOR) contrasted with a neural network, where a single NETWORK serves as both processing and memory unit]
• The machine has many non-linear processing units
– The program is the connections between these units
• Connections may also define memory
37

Recap
• Neural network based AI has taken over most AI tasks
• Neural networks originally began as computational models
of the brain
– Or more generally, models of cognition
• The earliest model of cognition was associationism
• The more recent model of the brain is connectionist
– Neurons connect to neurons
– The workings of the brain are encoded in these connections
• Current neural network models are connectionist machines
38

Poll 1 (on piazza)
39

Connectionist Machines
• Network of processing elements
• All world knowledge is stored in the connections between
the elements
• Multiple connectionist paradigms proposed..
40

Turing’s Connectionist Machines
• Basic model: A-type machines
– Random networks of NAND gates, with no learning mechanism
• “Unorganized machines”
• Connectionist model: B-type machines (1948)
– Connection between two units has a “modifier”
• Whose behaviour can be learned
– If the green line is on, the signal sails through
– If the red is on, the output is fixed to 1
– “Learning” – figuring out how to manipulate the coloured wires
• Done by an A-type machine
41

Connectionist paradigms: PDP (Parallel Distributed Processing)
• Requirements for a PDP system
(Rumelhart, Hinton, McClelland, ‘86; quoted from Medler, ‘98)
– A set of processing units
– A state of activation
– An output function for each unit
– A pattern of connectivity among units
– A propagation rule for propagating patterns of activities through the network of connectivities
– An activation rule for combining the inputs impinging on a unit with the current state of that unit to produce a new level of activation for the unit
– A learning rule whereby patterns of connectivity are modified by experience
– An environment within which the system must operate
42

Connectionist Systems
• Requirements for a connectionist system (Bechtel and Abrahamsen, 1991)
– The connectivity of units
– The activation function of units
– The nature of the learning procedure that modifies the connections between units, and
– How the network is interpreted semantically
43

Connectionist Machines
• Network of processing elements
– All world knowledge is stored in the connections between
the elements
• But what are the individual elements?
44

Modelling the brain
• What are the units?
• A neuron:
[Figure: a neuron, with dendrites, soma, and axon labelled]
• Signals come in through the dendrites into the Soma
• A signal goes out via the axon to other neurons
– Only one axon per neuron
• Factoid that may only interest me: Neurons do not undergo cell division
– Neurogenesis occurs from neuronal stem cells, and is minimal after birth
45

McCulloch and Pitts
• The Doctor and the Hobo..
– Warren McCulloch: Neurophysiologist
– Walter Pitts: Homeless wannabe logician who arrived at his door
46

The McCulloch and Pitts model
A single neuron
• A mathematical model of a neuron
– McCulloch, W.S. & Pitts, W.H. (1943). A Logical Calculus of the Ideas Immanent in Nervous Activity, Bulletin of Mathematical Biophysics, 5:115-137, 1943
• Pitts was only 20 years old at this time
47

Synaptic Model
• Excitatory synapse: Transmits weighted input to the neuron
• Inhibitory synapse: Any signal from an inhibitory synapse prevents
neuron from firing
– The activity of any inhibitory synapse absolutely prevents excitation of the neuron at that time.
• Regardless of other inputs
48

Simple “networks” of neurons can perform Boolean operations
Boolean Gates
49

Complex Percepts & Inhibition in action
Heat receptor
They can even create illusions of “perception”
Cold receptor
Heat sensation
Cold sensation
50

McCulloch and Pitts Model
• Could compute arbitrary Boolean propositions
– Since any Boolean function can be emulated, any Boolean function can be composed
• Models for memory
– Networks with loops can “remember”
• We’ll see more of this later
– Lawrence Kubie (1930): Closed loops in the central nervous system explain memory
51

Criticisms
• They claimed that their nets
– Should be able to compute a small class of functions
– Also, if tape is provided, their nets can compute a richer class of functions
• Additionally they will be equivalent to Turing machines
• Dubious claim that they’re Turing complete
– They didn’t prove any results themselves
• Didn’t provide a learning mechanism..
52

Donald Hebb
• “Organization of behavior”, 1949
• A learning mechanism:
– “When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.”
• As A repeatedly excites B, its ability to excite B improves
– Neurons that fire together wire together
53

Hebbian Learning
• If neuron X repeatedly triggers neuron Y, the synaptic knob connecting them gets larger
[Figure: an axonal connection from neuron X onto a dendrite of neuron Y, with the synaptic knob highlighted]
• In a mathematical model: $w_{xy} \leftarrow w_{xy} + \eta\, x\, y$
– $w_{xy}$: weight of the connection from input neuron $x$ to output neuron $y$; $\eta$ is a learning rate
• This simple formula is actually the basis of many learning algorithms in ML
54
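A minimal numpy sketch of the Hebbian update reconstructed above; the learning rate, initialization, and activity values are illustrative, not from the slides:

```python
import numpy as np

def hebbian_update(w, x, y, eta=0.1):
    """One Hebbian step: w_xy <- w_xy + eta * x * y.
    w has shape (n_outputs, n_inputs); x and y are activity vectors."""
    return w + eta * np.outer(y, x)

# Illustrative run: when the output is driven by the same weights
# (y = w @ x), co-activation keeps strengthening the active connections,
# so the weight magnitudes grow without bound -- the instability
# discussed on the next slide.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(1, 3))
x = np.array([1.0, 0.0, 1.0])
for _ in range(50):
    y = w @ x                      # output activity
    w = hebbian_update(w, x, y)    # strengthen co-active connections
print(w)  # weights on the active inputs have grown; the inactive one is unchanged
```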

Hebbian Learning
• Fundamentally unstable
– Stronger connections will enforce themselves
– No notion of “competition”
– No reduction in weights
– Learning is unbounded
• A number of later modifications allow for weight normalization, forgetting etc.
– E.g. Generalized Hebbian learning, aka Sanger’s rule:
$\Delta w_{ij} = \eta\, y_i \left( x_j - \sum_{k=1}^{i} w_{kj}\, y_k \right)$
– The contribution of an input is incrementally distributed over multiple outputs..
55
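A small sketch of Sanger's rule as reconstructed above; the data, initialization, and learning rate are illustrative:

```python
import numpy as np

def sanger_update(W, x, eta=0.005):
    """One step of Sanger's rule (generalized Hebbian learning).
    W has shape (n_outputs, n_inputs); row i only 'sees' the part of the
    input not already explained by outputs 1..i."""
    y = W @ x
    for i in range(W.shape[0]):
        residual = x - W[: i + 1].T @ y[: i + 1]   # x_j - sum_{k<=i} w_kj y_k
        W[i] += eta * y[i] * residual
    return W

# Illustrative usage on correlated 2-D data: the rows of W drift towards
# the leading principal directions of the data.
rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 2)) @ np.array([[2.0, 0.0], [0.5, 0.3]])
W = rng.normal(scale=0.1, size=(2, 2))
for x in data:
    W = sanger_update(W, x)
print(W)
```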

Poll 2
56

A better model
• Frank Rosenblatt
– Psychologist, Logician
– Inventor of the solution to everything, aka the Perceptron (1958)
57

Rosenblatt’s perceptron
• Original perceptron model
– Groups of sensors (S) on retina combine onto cells in association
area A1
– Groups of A1 cells combine into Association cells A2
– Signals from A2 cells combine into response cells R
– All connections may be excitatory or inhibitory
58

Rosenblatt’s perceptron
• Even included feedback between A and R cells
– Ensures mutually exclusive outputs
59

Perceptron: Simplified model
• Number of inputs combine linearly
– Threshold logic: Fire if combined input exceeds threshold
60

The Universal Model
• Originally assumed to be able to represent any Boolean circuit and perform any logic
– “the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence,” New York Times (8 July) 1958
– “Frankenstein Monster Designed by Navy That Thinks,” Tulsa, Oklahoma Times 1958
61

Also provided a learning algorithm
Sequential Learning:
$\mathbf{w} \leftarrow \mathbf{w} + \eta\,\big(d(\mathbf{x}) - y(\mathbf{x})\big)\,\mathbf{x}$
– $d(\mathbf{x})$ is the desired output in response to input $\mathbf{x}$
– $y(\mathbf{x})$ is the actual output in response to $\mathbf{x}$
• Boolean tasks
• Update the weights whenever the perceptron output is
wrong
– Update the weight by the product of the input and the
error between the desired and actual outputs
• Proved convergence for linearly separable classes
62
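A minimal sketch of the sequential perceptron learning rule described on this slide, applied to a linearly separable Boolean task (OR); the learning rate and epoch count are illustrative:

```python
import numpy as np

def perceptron_train(X, d, eta=1.0, epochs=20):
    """Sequential perceptron learning for a Boolean task.
    X: (n_samples, n_features) inputs; d: desired 0/1 outputs.
    A bias is handled by appending a constant 1 to each input."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for x, target in zip(Xb, d):
            y = 1 if w @ x >= 0 else 0      # threshold activation
            w += eta * (target - y) * x     # update only when the output is wrong
    return w

# Learns the (linearly separable) Boolean OR function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 1, 1, 1])
w = perceptron_train(X, d)
print([(1 if w @ np.append(x, 1.0) >= 0 else 0) for x in X])  # [0, 1, 1, 1]
```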

Perceptron
[Figure: small perceptron circuits implementing Boolean gates over inputs X and Y; values shown on edges are weights, numbers in the circles are thresholds]
• Easily shown to mimic any Boolean gate
• But…
63
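To make the weights-and-thresholds picture concrete, here is a small sketch of threshold units acting as Boolean gates; the specific weights and thresholds are the usual textbook choices and may differ from the exact numbers in the slide's figure:

```python
def threshold_unit(inputs, weights, T):
    """McCulloch-Pitts style unit: fires (1) if the weighted sum of
    inputs reaches the threshold T, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= T else 0

def AND(x, y):  return threshold_unit([x, y], [1, 1], T=2)
def OR(x, y):   return threshold_unit([x, y], [1, 1], T=1)
def NOT(x):     return threshold_unit([x], [-1], T=0)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, "AND:", AND(x, y), "OR:", OR(x, y), "NOT x:", NOT(x))
```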

Perceptron
• No solution for XOR! Not universal!
[Figure: a single perceptron over inputs X and Y, with its weights and threshold marked “?”]
• Minsky and Papert, 1968
64

A single neuron is not enough
• Individual elements are weak computational elements
– Marvin Minsky and Seymour Papert, 1969, Perceptrons: An Introduction to Computational Geometry
• Networked elements are required
65

Multi-layer Perceptron!
[Figure: a two-layer network of threshold units over inputs X and Y computing XOR; a hidden layer of two units feeds the output unit, with edge weights of ±1 and the thresholds shown in the lecture figure]
• XOR
– The first layer is a “hidden” layer
– Also originally suggested by Minsky and Papert 1968
66
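A sketch of the same idea in code: XOR from a hidden layer of threshold units. The particular construction below (OR and NAND feeding an AND) is one standard choice, not necessarily the exact weights of the figure:

```python
def threshold_unit(inputs, weights, T):
    """Fires (1) if the weighted sum of inputs reaches threshold T."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= T else 0

def XOR(x, y):
    # Hidden layer: one unit computes OR, the other computes NAND
    h1 = threshold_unit([x, y], [1, 1], T=1)     # x OR y
    h2 = threshold_unit([x, y], [-1, -1], T=-1)  # NOT (x AND y)
    # Output layer: AND of the two hidden units
    return threshold_unit([h1, h2], [1, 1], T=2)

print([XOR(x, y) for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```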

A more generic model
[Figure: a deeper network of threshold units over inputs X, Y, Z, A, with the weights and thresholds shown in the lecture figure]
• A “multi-layer” perceptron
• Can compose arbitrarily complicated Boolean functions!
– In cognitive terms: Can compute arbitrary Boolean functions over sensory input
– More on this in the next class
67

Story so far
• Neural networks began as computational models of the brain
• Neural network models are connectionist machines
– They comprise networks of neural units
• McCulloch and Pitts model: Neurons as Boolean threshold units
– Models the brain as performing propositional logic
– But no learning rule
• Hebb’s learning rule: Neurons that fire together wire together
– Unstable
• Rosenblatt’s perceptron: A variant of the McCulloch and Pitts neuron with a provably convergent learning rule
– But individual perceptrons are limited in their capacity (Minsky and Papert)
• Multi-layer perceptrons can model arbitrarily complex Boolean functions
68

But our brain is not Boolean
• We have real inputs
• We make non-Boolean inferences/predictions
69

The perceptron with real inputs
[Figure: a single unit with inputs x1, x2, x3, …, xN and weights w1…wN]
• x1…xN are real valued
• w1…wN are real valued
• Unit “fires” if weighted input matches (or exceeds) a threshold:
$y = 1$ if $\sum_i w_i x_i \geq T$, else $y = 0$
70

The perceptron with real inputs
[Figure: the same unit, now drawn with a bias b and a threshold “activation” applied to the weighted sum]
• Alternate view:
– A threshold “activation” operates on the weighted sum of inputs plus a bias:
$z = \sum_i w_i x_i + b$, and $y = 1$ if $z \geq 0$, else $y = 0$
• $z$ is an affine function of the inputs
• The activation outputs a 1 if z is non-negative, 0 otherwise
• Unit “fires” if weighted input matches or exceeds a threshold
71

The perceptron with real inputs and a real output
[Figure: the unit with inputs x1…xN, bias b, and a sigmoid activation]
$y = \mathrm{sigmoid}\left(\sum_i w_i x_i + b\right) = \frac{1}{1 + e^{-\left(\sum_i w_i x_i + b\right)}}$
• x1…xN are real valued
• w1…wN are real valued
• The output y can also be real valued
– Sometimes viewed as the “probability” of firing
72
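A small sketch of the real-input unit from the last three slides: an affine combination of the inputs followed by either a threshold or a sigmoid activation. The function names and example numbers here are illustrative, not from the slides:

```python
import numpy as np

def affine(x, w, b):
    """z = w . x + b : the affine combination of real-valued inputs."""
    return np.dot(w, x) + b

def threshold(z):
    """Fires (1) if z is non-negative, 0 otherwise."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    """Real-valued output, often read as a 'probability' of firing."""
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # real-valued inputs
w = np.array([0.8, 0.1, -0.4])   # real-valued weights
b = 0.2                          # bias
z = affine(x, w, b)
print(threshold(z), sigmoid(z))
```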

The “real” valued perceptron
[Figure: the unit with inputs x1…xN, bias b, and a general activation f applied to the weighted sum]
• Any real-valued “activation” function may operate on the affine function of the input
– We will see several later
– Output will be real valued
• The perceptron maps real-valued inputs to real-valued outputs
• It is useful to continue assuming Boolean outputs though, for interpretation
73

A Perceptron on Reals
[Figure: a perceptron over inputs x1…xN; in two dimensions the boundary w1x1 + w2x2 = T splits the (x1, x2) plane into a region with output 1 and a region with output 0]
$y = 1$ if $\sum_i w_i x_i \geq T$, else $y = 0$
• A perceptron operates on real-valued vectors
– This is a linear classifier
74

Boolean functions with a real perceptron
[Figure: three plots over the unit square with corners (0,0), (0,1), (1,0), (1,1) and axes x1, x2; the purple region in each shows where the perceptron outputs 1]
• Boolean perceptrons are also linear classifiers
– Purple regions have output 1 in the figures
– What are these functions?
– Why can we not compose an XOR?
75

Composing complicated “decision” boundaries
[Figure: an arbitrarily shaped decision region in the (x1, x2) plane]
Can now be composed into “networks” to compute arbitrary classification “boundaries”
• Build a network of units with a single output that fires if the input is in the coloured area
76

Booleans over the reals
[Figure: a coloured decision region in the (x1, x2) plane]
• The network must fire if the input is in the coloured area
77

Booleans over the reals
[Figure: five linear threshold units y1…y5, one per edge of a pentagon in the (x1, x2) plane, feed an AND unit; the AND can itself be a threshold unit that fires when $\sum_{i=1}^{5} y_i \geq 5$, i.e. only when the input lies inside the pentagon]
• The network must fire if the input is in the coloured area
82
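A sketch of this construction in code: several linear threshold units, one per edge, ANDed by a final threshold unit, so the network fires only inside their intersection (a convex region). The particular pentagon below is illustrative, not the one in the figure:

```python
import numpy as np

def linear_unit(x, w, T):
    """Linear threshold unit: fires (1) if w . x >= T, else 0."""
    return 1 if np.dot(w, x) >= T else 0

# Five units y1..y5, one per edge of a (roughly) regular pentagon around
# the origin: unit i fires when the input is on the inner side of edge i.
angles = 2 * np.pi * np.arange(5) / 5
units = [(-np.array([np.cos(a), np.sin(a)]), -1.0) for a in angles]

def network_fires(x):
    ys = [linear_unit(x, w, T) for w, T in units]
    # The AND is itself a threshold unit that needs all five inputs active
    return linear_unit(np.array(ys, dtype=float), np.ones(5), T=5.0)

print(network_fires(np.array([0.0, 0.0])))  # 1: inside the pentagon
print(network_fires(np.array([3.0, 0.0])))  # 0: outside
```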

More complex decision boundaries
[Figure: two AND sub-networks, each detecting one polygon in the (x1, x2) plane, feed an OR unit]
• Network to fire if the input is in the yellow area
– “OR” two polygons
– A third layer is required
83

Complex decision boundaries
• Can compose very complex decision boundaries
– How complex exactly? More on this in the next class
84

Complex decision boundaries
[Figure: MNIST digits as points in a 784-dimensional input space, with a decision boundary separating the classes]
• Classification problems: finding decision boundaries in high-dimensional space
– Can be performed by an MLP
• MLPs can classify real-valued inputs
85

Story so far
• MLPs are connectionist computational models
– Individual perceptrons are the computational equivalent of neurons
– The MLP is a layered composition of many perceptrons
• MLPs can model Boolean functions
– Individual perceptrons can act as Boolean gates
– Networks of perceptrons are Boolean functions
• MLPs are Boolean machines
– They represent Boolean functions over linear boundaries
– They can represent arbitrary decision boundaries
– They can be used to classify data
86

Poll 3
87

But what about continuous valued outputs?
• Inputs may be real-valued
• Can outputs be continuous-valued too?
88

MLP as a continuous-valued regression
[Figure: two threshold units with thresholds T1 and T2, weighted +1 and -1, feed a summing output unit; the result f(x) is a “square pulse” that is 1 between T1 and T2 and 0 elsewhere]
• A simple 3-unit MLP with a “summing” output unit can generate a “square pulse” over an input
– Output is 1 only if the input lies between T1 and T2
– T1 and T2 can be arbitrarily specified
89

MLP as a continuous-valued regression
[Figure: many square pulses, each scaled by a height h_i, are summed to approximate an arbitrary function f(x)]
• A simple 3-unit MLP can generate a “square pulse” over an input
• An MLP with many units can model an arbitrary function over an input
– To arbitrary precision
• Simply make the individual pulses narrower
• This generalizes to functions of any number of inputs (next class)
90
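A sketch of the construction on these two slides: each pair of threshold units forms one square pulse, and a weighted sum of many narrow pulses approximates a target function. The target function, interval, and pulse counts below are illustrative:

```python
import numpy as np

def step(z):
    """Threshold activation: 1 where z >= 0, else 0."""
    return (z >= 0).astype(float)

def pulse(x, T1, T2):
    """The 3-unit construction: two threshold units (at T1 and T2) summed
    with weights +1 and -1 give a square pulse that is 1 on [T1, T2)."""
    return step(x - T1) - step(x - T2)

def approximate(f, x, n_pulses=100, lo=0.0, hi=1.0):
    """Sum n_pulses scaled square pulses to approximate f on [lo, hi)."""
    edges = np.linspace(lo, hi, n_pulses + 1)
    y = np.zeros_like(x)
    for T1, T2 in zip(edges[:-1], edges[1:]):
        h = f(0.5 * (T1 + T2))        # pulse height: f at the interval midpoint
        y += h * pulse(x, T1, T2)
    return y

# Illustrative target: the approximation error shrinks as the pulses narrow
x = np.linspace(0.0, 1.0, 1000, endpoint=False)
f = lambda t: np.sin(2 * np.pi * t) + 0.5 * t
for n in (10, 100, 1000):
    print(n, np.max(np.abs(approximate(f, x, n_pulses=n) - f(x))))
```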

Poll 4
91

Story so far
• Multi-layer perceptrons are connectionist computational models
• MLPs are classification engines
– They can identify classes in the data
– Individual perceptrons are feature detectors
– The network will fire if the combination of the detected basic features matches an “acceptable” pattern for a desired class of signal
• MLP can also model continuous valued functions
92

Other things MLPs can do
• Model memory
– Loopy networks can “remember” patterns
• Proposed by Lawrence Kubie in 1930, as a model for memory in the CNS
• Represent probability distributions
– Over integer, real and complex-valued
domains
– MLPs can model both a posteriori and a priori distributions of data
• A posteriori conditioned on other variables
– MLPs can generate data from complicated,
or even unknown distributions
• They can rub their stomachs and pat their heads at the same time..
93

NNets in AI
• The network is a function
– Given an input, it computes the function layer-wise to predict an output
• More generally, given one or more inputs, predicts one or more outputs
94

These tasks are functions
[Figure: “N.Net” boxes mapping a voice signal → transcription, an image → text caption, and a game state → next move]
• Each of these boxes is actually a function
– E.g. f: Image → Caption
95

These tasks are functions
[Figure: the same tasks, each box now shown as a function mapping its input to its output]
• Each box is actually a function
– E.g. f: Image → Caption
– It can be approximated by a neural network
96

Story so far
• Multi-layer perceptrons are connectionist computational models
• MLPs are classification engines
• MLPs can also model continuous valued functions
• Interesting AI tasks are functions that can be modelled by the network
97

Next Up
• More on neural networks as universal approximators
– And the issue of depth in networks
98