MA4601/MAT061 Stochastic Search and Optimisation
Assignment 5: Markov Decision Processes
Due 12:00 midday, Thursday 7th May
In this assignment we will use Markov Decision Processes to develop an optimal strategy for harvesting salmon. The problem is based on the paper “Optimal harvest strategies for salmon in relation to environmental variability and uncertainty about production parameters” by C.J. Walters (1975). A copy of the paper is available on the course website.
You will need to submit two files: a programme file titled YOUR NAME programme.r (or .py, .jl, etc.) and a report as a pdf file titled YOUR NAME report.pdf. Submit both files by email to joneso18@cardiff.ac.uk. The report should be a stand-alone document that can be understood without reading your code, and it should be no more than four pages long.
Consider a salmon fishery. Each year we catch some number of fish and leave the rest to spawn. The state of the system in any season is the size of the population x, and the action taken is the size of the population left for spawning y. The immediate reward (without discount factor) is then the number of fish caught, namely x − y. At the beginning of the next season the salmon population will be W where, for constants a and b and Z ∼ N(0, 1),
W ∼ ⌊a y e^{−by} a^{Z}⌋.
Here ⌊x⌋ is the floor of x, that is, the largest integer less than or equal to x.
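As a rough illustration of how the transition law might be turned into probabilities, the sketch below computes P(W = w | y) in closed form. It assumes the transition is read as W = ⌊a·y·e^{−by}·a^Z⌋ with a > 1, so that a^Z is lognormal with log-standard-deviation ln a; if you read the model differently, only the inner `cdf` function changes. The function names, the cap `max_pop` on the population (with all probability above the cap lumped onto it), and the values a = 2.0, b = 0.05 are hypothetical placeholders, not values from Walters (1975).

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def transition_row(y, a, b, max_pop):
    """Return [P(W = w | y) for w = 0..max_pop].

    Assumed model: W = floor(a * y * exp(-b * y) * a**Z), Z ~ N(0, 1), a > 1.
    Probability mass above max_pop is lumped onto max_pop; this truncation
    is an assumption of the sketch, not part of the model.
    """
    if a <= 1.0:
        raise ValueError("this sketch assumes a > 1")
    row = [0.0] * (max_pop + 1)
    if y == 0:
        row[0] = 1.0  # no spawners, so no fish next season
        return row

    median = a * y * math.exp(-b * y)  # value of the unfloored quantity at Z = 0
    sigma = math.log(a)                # log-standard-deviation of a**Z

    def cdf(w):
        # P(median * a**Z < w); the lognormal variable is never <= 0
        if w <= 0:
            return 0.0
        return normal_cdf(math.log(w / median) / sigma)

    # floor(X) = w exactly when w <= X < w + 1
    for w in range(max_pop):
        row[w] = cdf(w + 1) - cdf(w)
    row[max_pop] = 1.0 - cdf(max_pop)
    return row


# Placeholder parameters, chosen only to exercise the code.
row = transition_row(y=5, a=2.0, b=0.05, max_pop=30)
print(round(sum(row), 12))
```

Stacking `transition_row(y, a, b, max_pop)` for y = 0, …, max_pop gives a transition matrix indexed by (action, next state). The cap should be large enough that the lumped tail probability is negligible for the actions of interest.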
For the assignment, do the following:
1. From Walters (1975) determine values for a and b. Note that they may depend on the chosen unit of measurement.
2. For discount factor γ = 1 find the optimal policy for time horizon N = 0, 1, 20. Comment on any pattern(s) that you see.
3. For a variety of discount factors γ ∈ (0,1) find the optimal policy for an infinite time horizon. How do these policies compare to the finite time policies?
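One possible way to organise the computations in tasks 2 and 3 is sketched below: a single Bellman backup, reused for backward induction (finite horizon) and for value iteration (infinite horizon, γ < 1). Several things here are assumptions rather than requirements: the transition probabilities are supplied as a list of lists `P[y][w]`, the terminal value is zero, "horizon N" is taken to mean that N further seasons follow the current one, and ties between equally good actions are broken towards the smaller y. The toy matrix at the end (a population that doubles deterministically, capped at 3) is purely hypothetical and is not the salmon model.

```python
def bellman_backup(V, P, gamma):
    """One Bellman step for the harvest problem.

    P[y][w] is P(next population = w | y fish left to spawn).
    Returns (new_V, policy), where policy[x] is a y in 0..x that maximises
    (x - y) + gamma * E[V(W) | y].
    """
    n = len(P)
    # The expected continuation value depends on the action y only.
    cont = [gamma * sum(p * v for p, v in zip(P[y], V)) for y in range(n)]
    new_V, policy = [], []
    for x in range(n):
        # max() returns the first maximiser, so ties go to the smaller y.
        best_y = max(range(x + 1), key=lambda y: (x - y) + cont[y])
        policy.append(best_y)
        new_V.append((x - best_y) + cont[best_y])
    return new_V, policy


def finite_horizon(P, N, gamma=1.0):
    """Backward induction; returns the value and policy for the first season.

    Assumed convention: N further seasons follow the current one, so
    N + 1 decisions are made in total, with zero terminal value.
    """
    V = [0.0] * len(P)
    policy = [0] * len(P)
    for _ in range(N + 1):
        V, policy = bellman_backup(V, P, gamma)
    return V, policy


def value_iteration(P, gamma, tol=1e-9, max_iter=100_000):
    """Iterate the Bellman backup until successive values differ by < tol."""
    V = [0.0] * len(P)
    for _ in range(max_iter):
        new_V, policy = bellman_backup(V, P, gamma)
        if max(abs(u - v) for u, v in zip(new_V, V)) < tol:
            return new_V, policy
        V = new_V
    raise RuntimeError("value iteration did not converge")


# Hypothetical toy dynamics: leaving y fish yields min(2 * y, 3) next season.
toy_P = [[1.0 if w == min(2 * y, 3) else 0.0 for w in range(4)] for y in range(4)]

for N in (0, 1):
    print(N, finite_horizon(toy_P, N)[1])
print(value_iteration(toy_P, gamma=0.9)[1])
```

For the salmon problem, `P` would be replaced by the transition matrix of the model on a truncated state space. The stopping rule in `value_iteration` relies on γ < 1; with γ = 1 the values need not converge, which is why the finite-horizon routine is kept separate.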
Marks will be allocated on the following basis:
50% Code correctness (how well does it work).
25% Quality of analysis (what have we learnt about harvesting salmon).
25% Clarity of report.