CS作业代写 COMP20003 Algorithms and Data Structures @ Semester 2, 2022

Assignment 2: PR Quadtrees

Copyright By PowCoder代写 加微信 powcoder

You must read fully and carefully the assignment speci�cation and instructions.

Course: COMP20003 Algorithms and Data Structures @ Semester 2, 2022

Deadline Submission: Friday 9th September 2022 @ 11:59 pm (end of Week 7)

Course Weight: 15%

Assignment type: individual
ILOs covered: 2, 3, 4

Submission method: via ED

The purpose of this assignment is for you to:

Improve your pro�ciency in C programming and your dexterity with dynamic memory
allocation.

Demonstrate understanding of a concrete data structure (quadtree) and implement a set of
algorithms.

Practice multi-�le programming and improve your pro�ciency in using UNIX utilities.

Walkthrough

An error occurred.

Try watching this video on www.youtube.com, or enable JavaScript if it is disabled in your browser.

Introduction to PR Quadtrees

Introduction

A quadtree is a data structure that stores -dimensional points and enables e�cient search for the
stored points. We will only consider the case . One particular quadtree which can be used to
store -dimensional points is the point-region quadtree, simply referred as a PR quadtree. A binary
tree can be de�ned as a �nite set of nodes that are either empty or have a root and two binary trees

and (the left and right subtree). A quadtree has a similar recursive de�nition: instead of two
subtrees we have four subtrees, hence the name quad. This means that each node either has four
children or is a leaf node. The four leaf nodes are often referred to as NW, NE, SW, SE (see the �gure

Building a PR Quadtree (Inserting Datapoints)

In a PR quadtree, each node represents a rectangle that covers part of the area that we wish to index.
The root node covers the entire area. A PR quadtree is built recursively: we �rst start with an initial
rectangle that should contain all the points we wish to store (how could you �nd a rectangle for –
dimensional points so that it contains all given datapoints?). To then construct a PR quadtree of a set
of points, we insert the points one by one and recursively subdivide a rectangle into four equally
sized sub-rectangles whenever a rectangle would contain more than a single point. After the insertion
process is complete, every leaf node is either coloured in black (contains a single datapoint) or white
(indicating that the node is empty), and internal nodes are coloured in grey (see the �gure above).

Alternatively, you could also use the following strategy: we insert all points into the rectangle and
recursively partition each rectangle into 4 quadrants (sub-rectangles) until each rectangle contains at
most one point.

The construction of a PR quadtree also leads to one important di�erence between them and binary
trees: a binary tree depends on the order of value insertion whereas a PR quadtree is independent of
the order of datapoint insertion. The reason is that the nodes are independent from the inserted
datapoints (the rectangles are determined by the initial rectangle of the root node).

To enable e�cient search, the rule is that every leaf node in a PR quadtree either contains a single point
or no point at all. Every node in a PR quadtree is either a leaf node or an internal node, i.e., has four
children representing a subdivision of the current rectangle into four equally sized quadrants. The
region of an internal node always contains more than one point.

If we search for a single datapoint using its coordinates, we �rst check if the point lies in the
quadtree. If it does, we recursively compute on each level the sub-rectangle that contains the
coordinates of the datapoint to be searched until we reach a leaf node. The leaf node either contains
the datapoint or it does not.

Here is a running example where we start with an empty PR quadtree, i.e., the we have a single white
node, indicating we one leaf that is empty:

We insert the �rst point . We can simply colour the root node as black as we have only a single

We insert a second point . Note that the colour changes from black to grey as the root node
becomes an internal node (a leaf node can only hold a single datapoint but now holds two):

Since the root is an internal node, we have to subdivide the rectangle into four sub-rectangles and
have to add four children in the corresponding quadtree. However, since both points are still located
in the same sub-rectangle NW, we colour this node as grey as well in the corresponding quadtree
which has now a root and four leaves:

Hence, we subdivide the NW rectangle further into four sub-rectangles. We add four more children
nodes to the grey node that we coloured as grey in the previous step:

Since this subdivision again does not lead to and being in separate rectangles, we have to add
another layer to the quadtree, i.e., subdivide the rectangle that includes both points once more into
four rectangles. Since and are now in separate rectangles, the corresponding nodes become
leaves in the quadtree coloured in black and the two other children are simply coloured white. We
have now completed the update of inserting :

Finally, we insert a third point . This is a simple case, because the leaf on the �rst level is empty,
i.e., we simply colour the SE node on the �rst level of the quadtree as black:

Further References

Here is a good introduction to PR quadtrees that has some animated visualisations: link.

You might also �nd this demonstration useful in deepening your understanding: link.

Your implementation of a PR quadtree has to support a number of functions to store data points and
to search for them. You will need to implement functions to support inserting data points, searching
for individual datapoints and searching (returning) for all datapoints within a query rectangle.

To insert a data point into a PR quadtree we start with the root node. If the PR quadtree is empty, i.e.,
the root node is coloured as white (which we represent as a pointer to NULL) and the point lies within
the area that should be covered with the PR quadtree, the data point becomes the root. If the root is
an internal node, we compute in which of the four children the data points lies (of course, if the data
point lies outside of root rectangle, it is not inserted and an error message returned). Again, if the
corresponding node is an internal node, we recursively repeat this process until we �nd a leaf node. If
the leaf node is empty, we simply insert the point. If the leaf node is black, i.e., has already one node,
we need to split the leaf node into four children, i.e., the existing black leaf node becomes a grey
internal node. We repeat splitting the node until we �nd an internal node so that two of its children
contain the existing and the newly inserted datapoint, i.e., are black leaf nodes. In other words:
splitting a black leaf can also led to a recursive process.

Find (Search)

To �nd a datapoint (and often its associated information), we simply traverse the PR quadtree by
selecting the internal child node whose rectangle contains the location of the datapoint. The search
stops once we have reached a leaf node. If the leaf node has stored the datapoint, we return its
associated information; otherwise the search simply reports that the datapoint is not stored in the PR

Range (Window) Query

An important query is to report all datapoints whose locations are contained in a given query
rectangle. This query is typically called a range or window query. To run range query on a quadtree,
we have the following recursive process: we start the root and check if the query rectangle overlaps
with the root. If it does, we check which children nodes also overlap with query rectangle (this could
be all of them for a centred query rectangle). Then we repeat this procedure for all internal children
overlapping with the query rectangle. The recursion stop once we have reached the leaf level. We
return the datapoints from all black leaf nodes whose datapoints are located within the query
rectangle.

Implementation Details

To implement your own PR quadtree, you will need to set up some data structures and functions, you
will likely need a number of structures and helper functions, these are not required and you do not
need to follow the naming style or function as described, but you will likely �nd them useful in your
implementation. As a minimum you will likely have the following data structures:

point2D that stores the location of datapoint;

rectangle2D that speci�es a rectangle given an bottom-left 2D point and a upper-right 2D

dataPoint , which is a structure that stores the location and associated information (i.e., the
footpath information);

quadtreeNode , which stores a 2D rectangle and 4 pointers referring to the four children SW,
NW, NE and SE.

You will also likely need the following functions:

inRectangle : tests whether a given 2D point lies within the rectangle and returns 1 (TRUE) if it
does. The function takes the point and the rectangle; note that the convention is that points are
within a rectangle if they are on lower and right boundary but not on the top or left boundary.
This enables us to construct a well-de�ned partition of the space if points lie on boundaries of
the quadtree node rectangle.

rectangleOverlap : tests whether two given rectangles overlap and returns 1 (TRUE) if they do.

determineQuadrant : given a rectangle and point, returns the quadrant of the rectangle that the
point lies in. The convention is 0=SW , 1=NW , 2=NE , 3=SE for the quadrant. You may like to
implement helper functions which also select relevant pointers from the quadtreeNode given a
quadrant. You may �nd writing an enum quadrant useful for implementing and using this

addPoint : this function will add a datapoint given with its 2D coordinates to the quadtree. You
will need to setup the logic discussed before.

searchPoint : tests whether a datapoint given by its 2D coordinates lies within the rectangle
and returns the datapoint (and its stored information) if it does and NULL otherwise.

rangeQuery : takes a 2D rectangle as argument and returns all datapoints in the PR quadtree
whose coordinates lie within the query rectangle.

Tip: if you have begun implementing and are stuck, check this page, you will likely need one of these functions.

Stage 3 – Supporting Point Region Queries

In Stage 3, you will implement the basic functionality for a quadtree allowing the lookup of data by

longitude (x) and latitude (y) pairs.

Your Makefile should produce an executable program called dict3 . This program should take
seven command line arguments.

1. The �rst argument will be the stage, for this part, the value will always be 3.

2. The second argument will be the �lename of the data �le.

3. The third argument will be the �lename of the output �le.

4. The fourth and �fth argument specify the x , y co-ordinate pair of the bottom-left corner of the
root node area, with the �rst value referring to the longitude ( x ) and the second value referring
to the latitude ( y ).

5. The sixth and seventh argument specify the top-right corner of the root node area, following
the same convention.

Your program should:

Just as in Assignment 1, read data from the data �le speci�ed in the second command line
argument. This may be stored unchanged from dict1 or dict2 in earlier implementations.

Construct a quadtree from the stored data.

Interpret and store the fourth, �fth, sixth and seventh arguments as long double values. The
strtold function should be used to achieve this.

Accept co-ordinate pairs from stdin, search the constructed quadtree for the point region
containing the co-ordinate pair and print all matching records to the output �le. You may
assume that all queries will be terminated by a new line. You should interpret these values as
double values.

In addition to outputting the record(s) to the output �le, the list of quadrant directions followed
from the root until the correct point region is found should be output to stdout .

Your approach should insert each footpath’s ( start_lon , start_lat ) and ( end_lon , end_lat )
pairs into the quadtree, allowing the footpath to be found from its start or end point.

Where multiple footpaths are present in the found point region, footpaths should be printed in
order of footpath_id .

The quadrants in relation to the dataset are:

(≤ longitude, < latitude) (≤ longitude, ≥ latitude) (> longitude, ≥ latitude)
(> longitude, < latitude) Your approach for the PR Quadtree must use long double values for representing regions. This 16-byte variant of float is more precise than the 8-byte double variant. No changes to your record implementation are needed, that is, you do not need to change your record struct's types to long double . (You should be using double though!) For testing, it may be convenient to create a �le of queries to be searched, one per line, and redirect the input from this �le. Use the UNIX operator < to redirect input from a �le. Example Execution An example execution of the program might be: make -B dict3 # ./dict3 stage datafile outputfile start_longitude start_latitude end_longitude end_latitude ./dict3 3 dataset_2.csv output.txt 144.969 -37.7975 144.971 -37.7955 followed by typing in the queries, line by line, and then ending input once all keys are entered or: make -B dict3 # ./dict3 stage datafile outputfile start_longitude start_latitude end_longitude end_latitude ./dict3 3 dataset_2.csv output.txt 144.969 -37.7975 144.971 -37.7955 < queryfile Example Output This is an example of what might be output to the output �le after three queries: 144.97056424489568 -37.796155887263744 --> footpath_id: 27665 || address: Palmerston Street between Rathdowne Street and || clue_sa: Carlton
144.96941668057087 -37.79606116572821
–> footpath_id: 27665 || address: Palmerston Street between Rathdowne Street and || clue_sa: Carlton
144.95538810397605 -37.80355555400948
–> footpath_id: 19458 || address: Queensberry Street between and || clue_sa: North Melbou

With the following output to stdout:

144.97056424489568 -37.796155887263744 –> NE SW NE NE
144.96941668057087 -37.79606116572821 –> NE SW NE NW
144.95538810397605 -37.80355555400948 –> SW NW SE

Stage 4 – Supporting Range Queries

In Stage 4, you will add the additional functionality to your quadtree to support range queries given
by (x, y) co-ordinate pairs.

Your Makefile should now also produce an executable program called dict4 . This program should
take seven command arguments, similar to Stage 3, with only the expected value of the �rst
argument changing:

1. The �rst argument will be the stage, for this part, the value will always be 4.

2. The second argument will be the �lename of the data �le.

3. The third argument will be the �lename of the output �le.

4. The fourth and �fth argument specify the x , y co-ordinate pair of the bottom-left corner of the

root node area, with the �rst value referring to the longitude ( x ) and the second value referring
to the latitude ( y ).

5. The sixth and seventh argument specify the top-right corner of the root node area, following
the same convention.

Your dict4 program should:

Just as Stage 3, read in the dataset, store it in a dictionary and construct a quadtree on the
stored data.

For Stage 4, you should accept sets of pairs of co-ordinate long double type values from stdin,
and e�ciently use the quadtree to �nd all footpaths which are within the bounds of the query.
You may assume that no blank lines will be present in the queries.

Output to stdout should include all directions searched in order, with each branch potentially
containing points within the query bounds fully explored before proceeding to the next
possible branch. Where multiple directions are possible, quadrants must be searched in the
order SW , NW , NE , SE . Each direction must be separated by a space. The de�nitions of each
quadrant are given in the table below.

Similar to Stage 3, where multiple footpaths are returned by the range query, these should be
sorted by footpath_id . Output each footpath only once for each query, even if both its “start”
and “end” points both occur in the searched region.

Example Execution

An example execution of the program might be:

make -B dict4
# ./dict4 stage datafile outputfile start_longitude start_latitude end_longitude end_latitude
./dict4 4 dataset_2.csv output.txt 144.968 -37.797 144.977 -37.79

followed by typing in the queries, line by line, and then ending input once all keys are entered or:

make -B dict4
# ./dict4 stage datafile outputfile start_longitude start_latitude end_longitude end_latitude
./dict4 4 dataset_2.csv output.txt 144.968 -37.797 144.977 -37.79 < queryfile Example Output This is an example of what might be output to the output �le after three queries: 144.968 -37.797 144.977 -37.79 --> footpath_id: 27665 || address: Palmerston Street between Rathdowne Street and || clue_sa: Carlton
–> footpath_id: 29996 || address: || clue_sa: Carlton || asset_type: Road Footway || deltaz: 0.46 || distance: 54.5
144.9678 -37.79741 144.97202 -37.79382
–> footpath_id: 27665 || address: Palmerston Street between Rathdowne Street and || clue_sa: Carlton
144.973 -37.795 144.976 -37.792
–> footpath_id: 29996 || address: || clue_sa: Carlton || asset_type: Road Footway || deltaz: 0.46 || distance: 54.5

With the following output to stdout:

144.968 -37.797 144.977 -37.79 –> SW SW SE NE SE
144.9678 -37.79741 144.97202 -37.79382 –> SW SW SE
144.973 -37.795 144.976 -37.792 –> NE SE

The dataset comes from the City of Melbourne Open Data website, which provides a variety of data
about Melbourne that you can explore and visualise online:

https://data.melbourne.vic.gov.au/

The dataset used in this project is a subset of the Footpath Steepness dataset combined with data
from the Small Areas for Census of Land Use and Employment (CLUE) dataset. This dataset has been
processed to simplify the geometric polygon data into additional attributes clue_sa , start lon ,
start lat , end lon , end lat , which can be used to approximate the footpath as a line on the

The processed dataset can be found in the Dataset Download slide. You aren’t expected to perform this
processing yourself.

The provided dataset has the following 19 �elds:

footpath_id – The row number for this footpath in the original full dataset. (integer)
address – A name describing the location of the footpath. (string)
clue_sa – The CLUE small area the footpath is in. (string)
asset_type – The name of the type of footpath. (string)
deltaz – The change in vertical distance along the footpath. (double)
distance – The length of the footpath (full geometry) in metres. (double)
grade1in – The percentage gradient of the footpath (full geometry). (double)
mcc_id – The id number identifying the footpath. (integer)
mccid_int – A second number identifying the road or intersection the footpath borders. (integer)
rlmax – The highest elevation on the footpath. (double)
rlmin – The lowest elevation on the footpath. (double)
segside – The side of the road which the footpath is on. (string)
statusid – The status of the footpath. (integer)
streetid – The ID of the first street in the Address. (integer)
street_group – The footpath_id of one of the footpaths connected to this footpath without a gap. (integer)
start_lat – The latitude (y) of the starting point representing the line approximation of the footpath. (double)
start_lon – The longitude (x) of the starting point representing the line approximation of the footpath. (doub

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com