
MA4601/MAT061 Stochastic Search and Optimisation
Assignment 5: Markov Decision Processes
Due 12:00 mid-day, 18th
In this assignment we will use Markov Decision Processes to develop an optimal strategy for harvesting salmon. The problem is based on the paper “Optimal harvest strategies for salmon in relation to environmental variability and uncertainty about production parameters” by C.J. Walters (1975). A copy of the paper is available on the course web site.
You will need to submit two files: a programme file titled YOUR NAME assign5.r or .py and a report as a pdf file titled YOUR NAME assign5.pdf.


The report should be typed in a 12 point font and should be no more than four pages long (excluding figures). You must express yourself in your own words and acknowledge your sources: see the university rules on academic misconduct.
Your code must be submitted as a single file, and must be properly documented, including clear instructions on how to run the code. Data files may be included separately. See the module website for restrictions on the external libraries you can use, and note that Jupyter notebooks are not accepted.
Note that your assessment is based on the report! You must submit your code, and I will look at it to check how well you have documented it, but I expect you to explain what you have done and present your results in the report. I will usually only run your code if it is not clear what is going on from the report, in which case you will probably be losing marks. Of course, should I choose to run your code, it must be capable of reproducing the results presented in the report.
Consider a salmon fishery. Each year we catch some number of fish and leave the rest to spawn. The state of the system in any season is the size of the population x, and the action taken is the quantity of fish harvested y (more precisely, the quantity of fish we would like to harvest). The immediate reward (without discount factor) is then the number of fish caught, namely min{x, y}. The population of fish remaining after harvesting is W0 = max{0, x − y} and at the beginning of the next season the salmon population will be W where, for α ∼ N(α, σ²) and population capacity m,
W ∼ ⌊W0 exp(α(1 − W0/m))⌋
Here ⌊x⌋ is the floor of x, that is, the largest integer less than or equal to x. Measure the population and catch size in units of 10,000 fish.
Note that because α is unbounded it is possible to have W > m. We will assume that W ≤ m + 20; values larger than m + 20 are truncated.
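As an illustration of the dynamics described above, one season can be simulated as in the following minimal sketch. The function name next_population and its argument names are hypothetical. Because the text writes α for both the random growth rate and its mean, the sketch calls the mean alpha_mean and the sampled rate a. The values of alpha_mean, sigma and m must come from Walters (1975); none are assumed here.

```python
import math
import random


def next_population(x, y, alpha_mean, sigma, m, rng=random):
    """Simulate one season: harvest, then spawn.

    x is the current population, y the desired harvest, both in units
    of 10,000 fish. Returns (reward, W), where W is next season's
    population truncated at m + 20.
    """
    reward = min(x, y)                 # fish actually caught
    w0 = max(0, x - y)                 # population left to spawn
    a = rng.gauss(alpha_mean, sigma)   # random growth rate for this season
    w = math.floor(w0 * math.exp(a * (1 - w0 / m)))
    return reward, min(w, m + 20)      # values above m + 20 are truncated
```

Setting sigma to 0 makes the step deterministic, which is a convenient check of the formula before any randomness is involved.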
1. (2 marks) From Walters (1975) determine values for α, σ and m. Make sure that they agree with the specified unit of measurement.
This is all that is required from the paper by Walters. In particular do not attempt to emulate his analysis. You are required to use the state variables and actions defined above.

2. (4 marks) For i, j ≥ 0 put
p_{i,j} = P(W = j | W0 = i).
Express p_{i,j} as the probability that α is in an interval. Using this expression write a function to calculate p_i := (p_{i,0}, p_{i,1}, ..., p_{i,m+20}).
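One possible way to organise the calculation in Question 2 is sketched below; it is not the only valid approach. For 0 < i < m, W = j exactly when ln(j/i)/(1 − i/m) ≤ a < ln((j+1)/i)/(1 − i/m), where a denotes the random growth rate. For i > m the factor 1 − i/m is negative and the inequalities reverse, and for i = 0 or i = m the outcome is deterministic. The names transition_row, normal_cdf and alpha_mean are hypothetical, and the sketch assumes an integer m and sigma > 0.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def transition_row(i, alpha_mean, sigma, m):
    """Return [P(W = j | W0 = i) for j = 0, ..., m + 20]."""
    top = m + 20
    p = [0.0] * (top + 1)
    c = 1.0 - i / m
    if i == 0 or c == 0.0:
        # No spawners give no fish; at W0 = m the exponent is zero, so W = m.
        p[i] = 1.0
        return p

    def prob_below(j):
        """P(i * exp(a * c) < j) for an integer j >= 0."""
        if j <= 0:
            return 0.0
        t = math.log(j / i) / c          # growth rate at which i * exp(a * c) = j
        f = normal_cdf((t - alpha_mean) / sigma)
        return f if c > 0 else 1.0 - f   # the inequality flips when i > m

    for j in range(top):
        p[j] = prob_below(j + 1) - prob_below(j)
    p[top] = 1.0 - prob_below(top)       # all mass at or above m + 20
    return p
```

Each row should sum to 1, which gives a quick sanity check once real parameter values are substituted.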
3. (6 marks) For discount factor γ = 1 find the optimal policy for time horizon N = 0,1,…,20.
Plot each policy as a graph of y (number of fish harvested) against x (population). Comment on any pattern(s) that you see.
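For the finite-horizon case in Question 3, backward induction is the standard technique. The sketch below assumes a transition matrix P with P[w][j] = P(W = j | W0 = w), supplied by the caller. Since asking for y > x catches the same number of fish as y = x, it searches only y = 0, ..., x. The name backward_induction is hypothetical, and ties are broken in favour of the smallest harvest, which is an arbitrary choice.

```python
def backward_induction(P, horizon, gamma=1.0):
    """Finite-horizon dynamic programming for the harvesting problem.

    Returns (policies, V). policies[n - 1][x] is a best harvest from
    population x with n seasons to go; V[x] is the optimal expected
    total catch with `horizon` seasons to go.
    """
    n_states = len(P)
    V = [0.0] * n_states                 # nothing can be earned with 0 seasons left
    policies = []
    for _ in range(horizon):
        # cont[w] = expected future value if w fish are left to spawn
        cont = [sum(p * v for p, v in zip(row, V)) for row in P]
        new_V, policy = [], []
        for x in range(n_states):
            best_y = max(range(x + 1), key=lambda y: y + gamma * cont[x - y])
            policy.append(best_y)
            new_V.append(best_y + gamma * cont[x - best_y])
        V = new_V
        policies.append(policy)
    return policies, V
```

On a toy three-state chain in which a population of 1 always grows to 2 (a made-up matrix, not the salmon model), the policy with one season to go harvests everything, while the policy with two seasons to go leaves one unit to spawn.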
4. (6 marks) For discount factors γ = 0.8, 0.9, 0.95, 0.99 find the optimal policy for an infinite time horizon.
Plot each policy as a graph of y (number of fish harvested) against x (population). How do these policies compare to the finite time policy from Question 3?
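For the infinite-horizon case in Question 4, value iteration is one option when γ < 1; policy iteration would serve equally well. The sketch below again assumes a caller-supplied transition matrix P with P[w][j] = P(W = j | W0 = w). The name value_iteration and the tol and max_iter arguments are hypothetical choices.

```python
def value_iteration(P, gamma, tol=1e-8, max_iter=10_000):
    """Approximate the infinite-horizon optimal policy for 0 <= gamma < 1.

    Returns (policy, V), where policy[x] is a greedy harvest for
    population x with respect to the final value estimate V.
    """
    n_states = len(P)
    V = [0.0] * n_states
    for _ in range(max_iter):
        # cont[w] = expected value next season if w fish are left to spawn
        cont = [sum(p * v for p, v in zip(row, V)) for row in P]
        new_V = [max(y + gamma * cont[x - y] for y in range(x + 1))
                 for x in range(n_states)]
        diff = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if diff < tol:                   # successive estimates agree: stop
            break
    # Extract a greedy policy from the final value estimate.
    cont = [sum(p * v for p, v in zip(row, V)) for row in P]
    policy = [max(range(x + 1), key=lambda y: y + gamma * cont[x - y])
              for x in range(n_states)]
    return policy, V
```

Because the update is a contraction with factor γ, the number of sweeps needed grows as γ approaches 1, so γ = 0.99 will take noticeably longer to converge than γ = 0.8.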
2 marks are reserved for presentation, clarity of expression, and documentation of code.
