CS计算机代考程序代写 scheme matlab algorithm Assessment Task 2

Assessment Task 2
Stochastic Optimization using Population Based Incremental Learning
A/Prof Stuart Perry
School of Electrical and Data Engineering, University of Technology Sydney
UTS CRICOS 00099F

What we will cover
1. Introduction
2. What you have to do
3. How you submit your work
41082 Introduction to Data Engineering

Where we are at in the program
41082 Introduction to Data Engineering

Motivation
Optimization
• AveryimportantelementofDataEngineeringistheabilitytooptimizethings
• VeryoftenthisisdonethroughMachineLearning
• ThemotivationforthistaskistointroduceMachineLearning/StochasticOptimization using;
• A simple to learn algorithm, on
• A simple well known benchmark problem, using
• A programming language that is already familiar (Matlab)
• Havingsaidthat,thealgorithmturnsouttobeaverypowerfulandusefulone-goodto have in your toolbox.
41082 Introduction to Data Engineering

The benchmark problem
• TheproblemchosenistheTerminalAssignmentProblem
• Thisisaclassicbenchmarkusedinmanynetworkplanningcourses
• Ithasapplicationinmanyareas,includingsuchareasasSupermarket/SupplyDepot allocation
41082 Introduction to Data Engineering

The terminal assignment problem
• Thisisthetypicalsituationwherewehaveanumberofterminalsconnectedto concentrators
• or, for example Supermarkets supplied by Warehouses
• Thereisacostassociatedwitheachofthepotentialconnections
• Inaddition,therearealwaysconstraintsthatmustbemet.Inthiscase
• Each terminal must be connected to 1, and only 1 concentrator, and
• There is a maximum number of terminals that each concentrator can service
41082 Introduction to Data Engineering

Terminal / Concentrator layout
• Thelayoutshowsanumberof connections
• Thereshouldbe12asthereare12 Terminals
• Inthiscase,Terminals7,10and11 are connected to Concentrator 0, which is quite legal.
• ThetotalCostofthisconfigurationis the some of all the 12 values associated with the red dots. (See the Cost Table next.)
41082 Introduction to Data Engineering
11

The Cost Table
• Tobeexplicit,ifTerminal5isconnectedtoConcentrator3thecostwillbe35
41082 Introduction to Data Engineering

Mathematical Formulation
Connectivity Variable
• To begin the mathematical description of the problem, we will define a variable xij that describes the connectivity
𝑥𝑥 = �1 terminal 𝑖𝑖 connected to concentrator 𝑗𝑗 𝑖𝑖𝑖𝑖 0 not connected
41082 Introduction to Data Engineering

Mathematical Formulation (cont.)
Cost variable
• Wecandefineanothervariablethatdefinesthecostofsuchconnections.Thecostcan be many things, or compounded factors. It may be the actual cost in $s, or the difficulty of making the connections, degradation in speed, etc.
cij = Cost of connecting terminal i to concentrator j
41082 Introduction to Data Engineering

Mathematical Formulation (cont.)
Objective Function
• ThesetwovariablesgetcombinedforaformulawhichcalculatestheCostofaparticular connection matrix
• WecallthistheObjectiveFunction.
𝑁𝑁
𝑍𝑍 = �𝑖𝑖=1 �𝑖𝑖∈𝐽𝐽 𝑐𝑐𝑖𝑖𝑖𝑖𝑥𝑥𝑖𝑖𝑖𝑖
• Assumewewishtominimizethecost.Sowearelookingformin(Z)
41082 Introduction to Data Engineering

Mathematical Formulation (cont.)
Constraints
• Asinallsuchoptimizationproblems,thereareconstraints.
• Forexample,wemightwishtoassertthateachterminalcanconnecttoonlyone concentrator. We can do that as follows:
�𝑖𝑖∈𝐽𝐽𝑥𝑥𝑖𝑖𝑖𝑖 =1𝑓𝑓𝑓𝑓𝑓𝑓𝑖𝑖=1,2,…,𝑁𝑁
• Wemayalsowishtolimitthenumberofterminalsthateachconcentratorcanhandle.We
can do this as follows:
𝑁𝑁
�𝑖𝑖=1𝑥𝑥𝑖𝑖𝑖𝑖 ≤𝑞𝑞𝑓𝑓𝑓𝑓𝑓𝑓𝑗𝑗∈𝐽𝐽
41082 Introduction to Data Engineering

Create a program that calculates min(Z)
• YouneedtouseMatlabtowriteaprogramthatperformsaGeneticAlgorithm.
• TheGeneticAlgorithmyouneedtouseiscalledPopulationBasedIncrementalLearning (PBIL).
• Onceyourprogramisoperational,youneedtobeabletochangetheprogramvariables to find the optimum parameters needed to find a solution that is close enough to optimum.
• Youneedtoreflectontheresults,drawingconclusionsastoyoursuccess.
41082 Introduction to Data Engineering

Simple explanation of the PBIL Algorithm I
PBIL is a simple stochastic/genetic algorithm for optimizing a configuration. It requires the following:
1. A description of the solution space by means of a binary string
2. A probability vector that is the same length as the binary string, where each location indicates the likelihood of that binary digit in the solution string being a 1.
3. A means of calculating a “Cost” from that binary string using a Cost Table
4. The algorithm proceeds as follows:
41082 Introduction to Data Engineering

Simple explanation of the PBIL Algorithm – Step 1
Step 1: Creating an epoch of sample solutions
1. Create a number of sample solutions based on the probability vector.
2. This is called an epoch and could have any number of solutions. Perhaps 100.
3. Each sample solution is another of the binary strings
1. Say the Nth element of the Probability Vector is 0.5, then the Nth element of the binary string would be equally likely to be a “1” or “0”
2. Say the Nth element of the Probability Vector is 0.667, then the Nth element of the binary string would be a “1” on 66% of the trials
3. Say the Nth element of the Probability Vector is 1, then the Nth element of the binary string would always be a “1”
4. Say the Nth element of the Probability Vector is 0.0, then the Nth element of the binary string would never be a “1“
4. And so on for all the elements of the binary string
5. Check that all the solutions found do not break the constraints
41082 Introduction to Data Engineering

Simple explanation of the PBIL Algorithm – Step 2
Step 2: Working out the costs
1. Now that you have an epoch of sample solutions, which are just binary strings, you have to calculate the Cost of each solution
2. Use your cost table, with each sample solution to calculate the Cost of that solution
41082 Introduction to Data Engineering

Simple explanation of the PBIL Algorithm – Step 3
Step 3: Ranking the epoch
1. Based on the costs calculated above, find the minimum one.
2. You could do this by ranking them.
3. Now take the minimum one and use it in the next step
41082 Introduction to Data Engineering

Simple explanation of the PBIL Algorithm – Step 4
Step 4: Updating the Probability Vector
1. You now have a binary string that represents the best solution of your epoch of possible solutions
2. Use it to modify the Probability Vector so that the next epoch you create will be more similar to this best solution found in Step 3.
3. So, if element N was a “1” then the corresponding Nth element of the Probability Vector needs to be adjusted by a small amount so that in the next epoch, the Nth elements of the sample solutions are more likely to be a “1”.
4. And so on for all the elements.
41082 Introduction to Data Engineering

Simple explanation of the PBIL Algorithm – Step 5
Step 5: Go back and do it again
1. Return to Step 1: and create another epoch of solutions.
2. Proceed through Steps 2, 3 and 4 as before.
3. Inspect your Probability Vector.
4. It should start to look more like a string of “1s” and “0s” each time you go around the process.
5. When the Probability Vector has finally settled as a string of “1s” and “0s”, you know you have a solution.
41082 Introduction to Data Engineering

Detailed working of the algorithm – Start
Tabular representation
Probability Vector
P0
P1
P2

PN
Trial 0
B0,0
B0,1
B0,2

B0,N
Cost of Trial 0
Trial 1
B1,0
B1,1
B1,2

B1,N
Cost of Trial 1
Trial 2
B2,0
B2,1
B2,2

B2,N
Cost of Trial 2







Trial M
BM,0
BM,1
BM,2

BM,N
Cost of Trial M
41082 Introduction to Data Engineering

Detailed working of the algorithm – First Epoch
Typical first Epoch
Probability Vector
0.5
0.5
0.5

0.5
Cost of Trial
Trial 0
0
1
1

0
42
Trial 1
1
1
0

1
96
Trial 2
0
1
0

1
223







Trial M
1
1
1

1
31
41082 Introduction to Data Engineering

Detailed working of the algorithm – Some time during the iterations
Say iteration 45 for example
Probability Vector
0.8
0.1
0.25

0.75
Cost of Trial
Trial 0
1
1
1

0
49
Trial 1
0
0
0

1
91
Trial 2
1
1
0

0
145







Trial M
0
1
1

0
200
41082 Introduction to Data Engineering

Detailed working of the algorithm – Final iteration
Final iteration – everything has now settled down
Probability Vector
1.0
0.0
0.0

1.0
Cost of Trial
Trial 0
1
0
0

1
25
Trial 1
1
0
0

1
25
Trial 2
1
0
0

1
25







Trial M
1
0
0

1
25
41082 Introduction to Data Engineering

Binary numbering scheme
Decimal
Binary
Decimal
Binary
0
0000
8
1000
1
0001
9
1001
2
0010
10
1010
3
0011
11
1011
4
0100
12
1100
5
0101
13
1101
6
0110
14
1110
7
0111
15
1111
41082 Introduction to Data Engineering

Tips on getting started
• Don’tdivestraightintogetcoding
• Ratherdoaproperscopingofwhattheapphastodo,whattherequirementsare,etc. • Itisoftensaidthatanhourofpaperworkcansavehundredsoflinesofcode
41082 Introduction to Data Engineering

Submitting via Assignment on Canvas
1. Your work must be submitted via the Assignment on Canvas
2. Your document must include at least the following parts
1. A description of your program
2. A description of your software development process
3. A description of the “logic” of the program. (Flowcharts etc.)
4. A collection and analysis of your results.
5. A listing of your code – well commented.
41082 Introduction to Data Engineering

Criteria
Marking Rubric – 30 marks
Understanding of the problem
Testing of different scenarios for learning rate etc.
Structure and quality of the description document
Problem description
Description of the program and the PBIL algorithm
Code listing and comments in the code
Exceeds Expectations
5 marks
5 marks
5 marks
5 marks
5 marks
5 marks
Meets Expectations
3 marks
3 marks
3 marks
3 marks
3 marks
3 marks
Does Not Meet Expectations
0 marks
0 marks
0 marks
0 marks
0 marks
0 marks
41082 Introduction to Data Engineering