Coursework Description
The coursework involves the specification, design and implementation of a simple agent.
Coursework Requirements
The problem consists of a 2D environment, in which a single agent must collect and dispose of waste (CO2) from stations, e.g., carbon capture and storage. Stations periodically generate tasks – requests to dispose of a specified amount of waste. The environment also contains a number of wells where waste can be deposited. The goal of the agent is to dispose of as much waste as possible in a fixed period of time.
Task Environment
The standard task environment is defined as:
• the environment is an infinite 2D grid that contains randomly distributed stations, wells and refuelling points
• stations periodically generate tasks – requests to dispose of a specified amount of waste
• tasks persist until they are achieved (a station has at most one task at any time)
• the maximum amount of waste that must be disposed of in a single task is 1,000 litres
• wells can accept an infinite amount of waste
• refuelling points contain an infinite amount of fuel
• in each run, there is always a refuelling point in the centre of the
environment
• a run lasts 10,000 timesteps
• an agent can sense only its current position (which may be a station, well or
refuelling point)
• the agent can take waste from a station and dispose of it in a well
• moving around the environment requires fuel, which the agent must replenish
at a refuelling point
• the agent can carry a maximum of 100 litres of fuel and 1,000 litres of waste
• the agent starts out in the centre of the environment (at the refuelling point) with
100 litres of fuel and no waste
• the agent moves at 1 cell / timestep and consumes 1 litre of fuel / cell
• filling the fuel and waste tanks and delivering waste to a well takes one
timestep
• if the agent runs out of fuel, it can do nothing for the rest of the run
• the success (score) of an agent in the task environment is determined by the
amount of waste delivered
The task environment should not be modified or extended. All other decisions regarding software design and implementation strategy are up to you.
You must implement an agent that completes the task in the specified task environment.
Task environment in detail 1
• the environment is discrete and consists of a grid of cells
• the environment contains randomly distributed stations, wells and refuelling points
• stations periodically generate tasks – requests to dispose of a specified amount of waste (max 5,000 litres)
• tasks persist until they are achieved (a station has at most one task at any time)
• wells can accept an infinite amount of waste
• refuelling points contain an infinite amount of fuel
Task environment in detail 2
• the agent can see any stations, wells and refuelling points within 25 cells of its current position
• if a station is visible, the agent can see if it has a task, and if so, how much waste is to
be disposed of
• the agent can carry a maximum of 100 litres of fuel and 1,000 litres of waste
• the agent moves at 1 cell / timestep and consumes 1 litre of fuel / cell
• filling the fuel and waste tanks and disposing of waste in a well takes one timestep
• if the agent runs out of fuel, it can do nothing for the rest of the run
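Because running out of fuel ends the run, one simple safeguard is to check, before committing to a target, that the agent can reach it and then still reach a known refuelling point. The sketch below is illustrative only: the class FuelCheck and its methods are hypothetical names, not part of the provided package. It assumes that a diagonal move costs one cell (Chebyshev distance); if the environment only allows four-way movement, Manhattan distance should be used instead.

```java
// Hypothetical helper for illustration; not part of the provided agent package.
final class FuelCheck {

    // Assumption: a diagonal step counts as one cell (Chebyshev distance).
    // Replace with |dx| + |dy| if only four-way movement is possible.
    static int distance(int x1, int y1, int x2, int y2) {
        return Math.max(Math.abs(x1 - x2), Math.abs(y1 - y2));
    }

    // True if the agent at (ax, ay) can move to the target (tx, ty) and
    // afterwards still reach the refuelling point at (fx, fy).
    // Whether arriving with exactly zero fuel still allows refuelling depends
    // on the environment implementation; use a strict '<' to be cautious.
    static boolean canVisitAndRefuel(int fuel, int ax, int ay,
                                     int tx, int ty, int fx, int fy) {
        int toTarget = distance(ax, ay, tx, ty);
        int targetToFuel = distance(tx, ty, fx, fy);
        return toTarget + targetToFuel <= fuel;
    }
}
```

Under these assumptions, an agent with a full 100 litre tank standing on a refuelling point can visit a target at most 50 cells away if it has to return to the same refuelling point.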
Task environment in detail 3
• in each run, there is always a refuelling point in the centre of the environment
• the agent starts out in the centre of the environment (at the refuelling point) with 100 litres
of fuel and no waste
• a run lasts 10,000 timesteps
• the success (score) of the agent in the task environment is determined by the total amount of waste delivered
Resources
A Java demo agent package is provided as a starting point for your project work. This provides an implementation of the standard task environment and a very basic agent that chooses actions at random.
• Java agent package as a starting point for your project work
– implementation of the task environment which generates a random set of stations, wells and refuelling points for each run, and periodically generates tasks
– an abstract agent class which provides methods for sensing and acting
– a concrete ‘demo agent’, that chooses actions at random
– all you have to do is (re)write the action selection function …
Evaluation
• state the performance of your agent, i.e., what score does it achieve (on average)
• explain why your solution is (or is not) appropriate for the task environment, e.g.:
– explain which features of the task environment are critical for your solution to work well
– explain how you would expect your agent to perform in different task environments
• once you have made the high-level decisions, think about how each aspect of the agent could/should be implemented
• e.g., how will the agent search the environment, or decide what to do next?
• will your agent always use the same action selection function, or will the action selection function vary with time, etc.?
• you can use algorithms from agent case studies in the lectures, from previous AI courses
or AI textbooks, or invent your own solution
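One way an action selection function can vary with time is to pick a high-level mode from the agent's current state and then choose the action within that mode. The following is a minimal sketch of that idea, not the expected solution: ModeSelector, the mode names, the priority order and the safetyMargin parameter are all assumptions made for illustration.

```java
// Hypothetical sketch; names and priorities are illustrative assumptions.
final class ModeSelector {

    enum Mode { REFUEL, DELIVER, COLLECT, EXPLORE }

    // Chooses a mode from the agent's state.
    // fuel:          litres of fuel currently in the tank
    // distToFuel:    cells to the nearest known refuelling point
    // wasteCarried:  litres of waste currently carried
    // wasteCapacity: maximum litres of waste the agent can carry
    // taskKnown:     true if the agent knows a station with a pending task
    // safetyMargin:  spare litres to keep in reserve (a tuning assumption)
    static Mode choose(int fuel, int distToFuel, int wasteCarried,
                       int wasteCapacity, boolean taskKnown, int safetyMargin) {
        if (fuel <= distToFuel + safetyMargin) {
            return Mode.REFUEL;   // fuel comes first: running out ends the run
        }
        if (wasteCarried >= wasteCapacity) {
            return Mode.DELIVER;  // tank is full, nothing more can be collected
        }
        if (taskKnown) {
            return Mode.COLLECT;  // room left and a task is available
        }
        if (wasteCarried > 0) {
            return Mode.DELIVER;  // no task known, so bank what is carried
        }
        return Mode.EXPLORE;      // nothing to do: look for stations
    }
}
```

A fixed priority order like this is easy to reason about, but it is only one option; the questions above about re-evaluating choices over time still apply.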
Efficient exploitation
• which task to do next
– arbitrary choice (first one, random choice, etc.)?
– evaluate alternatives (closest, largest amount of waste, etc.)?
• how to collect waste for a task
– opportunistically, or when required?
– which makes best use of time/fuel?
• when a new task is discovered
– should the agent do it now?
– add it to the list of tasks?
– re-evaluate which task to do next?
• which task is “best”?
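To make "evaluate alternatives" concrete, the sketch below scores each known task by the waste the agent could collect per timestep spent getting there and loading. It is one possible heuristic under stated assumptions, not a recommended answer: TaskChooser and Task are hypothetical types, and distance is again assumed to be Chebyshev distance.

```java
import java.util.List;

// Hypothetical sketch of one task-scoring heuristic; not part of the provided package.
final class TaskChooser {

    // Minimal stand-in for a task: station position and litres of waste requested.
    static final class Task {
        final int x, y, waste;
        Task(int x, int y, int waste) { this.x = x; this.y = y; this.waste = waste; }
    }

    // Assumption: a diagonal step counts as one cell (Chebyshev distance).
    static int distance(int x1, int y1, int x2, int y2) {
        return Math.max(Math.abs(x1 - x2), Math.abs(y1 - y2));
    }

    // Litres collectable per timestep: travel time plus one timestep to load.
    static double value(Task t, int ax, int ay, int freeCapacity) {
        int collectable = Math.min(t.waste, freeCapacity);
        return (double) collectable / (distance(ax, ay, t.x, t.y) + 1);
    }

    // Returns the task with the highest value, or null if the list is empty.
    static Task best(List<Task> tasks, int ax, int ay, int freeCapacity) {
        Task best = null;
        double bestValue = -1.0;
        for (Task t : tasks) {
            double v = value(t, ax, ay, freeCapacity);
            if (v > bestValue) {
                bestValue = v;
                best = t;
            }
        }
        return best;
    }
}
```

This score ignores the trip to a well and the fuel needed afterwards, so a nearby small task can beat a distant large one; whether that is desirable is part of the design decision.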
Choosing tours
• we can think of a trip which completes one or more tasks as a tour
• we can then reformulate the problem as which tour is “best”?
– where should the tour begin/end?
– in which order should the agent visit stations/wells/fuel pump?
– how long should the tour be?
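Before comparing tours, it helps to be able to compute how long a candidate tour takes and whether the agent has enough fuel to complete it. The sketch below simulates a tour stop by stop. TourCheck, Stop and Kind are hypothetical names, distance is assumed to be Chebyshev distance, and each stop is assumed to cost one timestep for loading, disposing or refuelling, following the environment description.

```java
import java.util.List;

// Hypothetical sketch of a tour simulation; not part of the provided package.
final class TourCheck {

    enum Kind { STATION, WELL, FUEL }

    static final class Stop {
        final int x, y;
        final Kind kind;
        Stop(int x, int y, Kind kind) { this.x = x; this.y = y; this.kind = kind; }
    }

    // Assumption: a diagonal step counts as one cell (Chebyshev distance).
    static int distance(Stop a, Stop b) {
        return Math.max(Math.abs(a.x - b.x), Math.abs(a.y - b.y));
    }

    // Returns the number of timesteps needed to follow the tour from 'start',
    // or -1 if the agent would run out of fuel before reaching some stop.
    static int timesteps(Stop start, List<Stop> tour, int startFuel, int tankSize) {
        int fuel = startFuel;
        int steps = 0;
        Stop here = start;
        for (Stop next : tour) {
            int d = distance(here, next);
            if (d > fuel) {
                return -1;           // cannot reach the next stop
            }
            fuel -= d;               // 1 litre of fuel per cell
            steps += d + 1;          // travel plus one timestep for the action
            if (next.kind == Kind.FUEL) {
                fuel = tankSize;     // refuelling fills the tank
            }
            here = next;
        }
        return steps;
    }
}
```

Note that this only checks that each stop can be reached. A tour that does not end at a refuelling point would also need a check that one is still reachable from the final stop.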
Improving a tour
• one way to plan is to start with a single-task tour and ask: “can this be improved?”
• what do we mean by “improved”?
• one possible definition: a tour can be improved if the agent can get a
better outcome for a little extra effort
– what do we mean by “better outcome”?
– what do we mean by “effort”?
– how much is “a little”?
• can we quantify the improvement?
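One possible way to quantify an improvement, offered here only as an assumption to reason with, is to treat "better outcome" as extra waste delivered and "effort" as extra timesteps, and to accept an extra stop when its waste per extra timestep exceeds the rate of the tour so far. In the sketch below, TourImprovement and its methods are hypothetical names, and distance is assumed to be Chebyshev distance.

```java
// Hypothetical sketch of one way to quantify a tour improvement.
final class TourImprovement {

    // Assumption: a diagonal step counts as one cell (Chebyshev distance).
    static int distance(int x1, int y1, int x2, int y2) {
        return Math.max(Math.abs(x1 - x2), Math.abs(y1 - y2));
    }

    // Extra cells travelled when a stop X is inserted between
    // consecutive tour stops A and B.
    static int detour(int ax, int ay, int bx, int by, int xx, int xy) {
        return distance(ax, ay, xx, xy) + distance(xx, xy, bx, by)
                - distance(ax, ay, bx, by);
    }

    // True if extraWaste / extraSteps is greater than tourWaste / tourSteps.
    // Cross-multiplied to avoid dividing by zero when extraSteps is 0.
    static boolean worthAdding(int tourWaste, int tourSteps,
                               int extraWaste, int extraSteps) {
        return (long) extraWaste * tourSteps > (long) tourWaste * extraSteps;
    }
}
```

For example, inserting a station with 300 litres at a cost of 8 extra timesteps into a tour that delivers 1,000 litres in 83 timesteps passes this test, while 50 litres for 10 extra timesteps does not. The extra timesteps should include the detour plus the timestep needed to load the waste.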