COMP2300/6300
Computer Organisation and Program Execution
Data structures
Dr Charles 1, 2022
Copyright By PowCoder代写 加微信 powcoder
Week 7: Data Structures
why do we need data structures? arrays
memory allocation
Admin Time
Assignment 1 results
Mid-semester Exam
Psyche up for second half of the course
Why do we need data structures? to structure our data (um, ok…)
to minimise bookkeeping to make our lives easier to play nicely with others
Arrays are for collections of homogeneous data Wikipedia has more (as usual)
You might need an array if youʼve got
a string of ASCII characters (remember Shoutface?)
an audio file (e.g. an array of 16-bit signed values to “send” to an audio output)
a collection of red-green-blue (3 x 8-bit unsigned) values which represent a bitmap image
Essential information for an array
An array is just a bunch of things, but here are the key things to figure out:
what are the things?
where are the things?
how big are the things? how many things are there?
Can you fit an array into a register?
probably not
need to store it in memory
Addressing
Array addressing is about reading and writing specific elements of the array also called indexing (Latin: indicō “point out, show”)
// the index
// ↓
int x = the_things[0]; // the 1st element
int y = the_things[4]; // the 5th element
In assembly code
itʼs just loads and stores (like youʼve been doing all along)
the_things:
.word 2, 3, 0, 0, 6, 3, 0, 0
get_elements:
ldr r0, =the_things
ldr r1, [r0] @ the 1st element
ldr r2, [r0, 16] @ the 5th element
whatʼs the big deal? why are unaligned loads/stores a problem?
hint: check out the Cortex M4 Technical Reference Manual section on load store timings
you can put anything in an array, not just words
abstract data type vs
data structure
Working with arrays
they donʼt fit in registers, so we have to operate on them one element at a time we need loops!
Array sum example
given an array of (word-size) integers, find the sum of the elements from from_index to to_index
int sum(int array[], int from_index, int to_index){
int acc = 0; // the “accumulator” variable
for(int i = from_index; i <= to_index; i++){
acc += array[i];
return acc; }
The array in memory
.word 2, 3, 0, 0, 6, 3, 0, 0
back-of-the-envelope maths: the sum is 14 (0xE)
Array sum setup
For the following examples, assume: the base address of the array is in r0
the from (start) index is in r1 (0 in the setup code below) the to (end) index is in r2 (7 in the setup code below)
ldr r0, =array @ base address
mov r1, 0 @ from_index
mov r2, 7 @ to_index
Array sum #1
mov r3, 0 @ "accumulator" register
mov r4, 4 @ element size
array_sum:
mul r5, r1, r4 @ calculate offset
ldr r6, [r0, r5] @ load from offset
add r3, r6
cmp r1, r2
ble array_sum
mov r0, r3
@ update accumulator
@ increment index
@ keep looping?
2 instructions in setup, 6 in loop
Array sum #2
mov r3, 0 @ acc
array_sum:
@ load with shifted index register
ldr r6, [r0, r1, lsl 2]
add r3, r6
cmp r1, r2
ble array_sum
end_array_sum:
mov r0, r3
@ update running total
@ increment index
@ keep looping?
1 instruction in setup, 5 in loop, no need to explicitly calculate the o set (but size must be power of 2)
Array sum #3
mov r3, 0 @ acc
lsl r1, r1, 2 @ change index -> offset
lsl r2, r2, 2 @ change index -> offset
array_sum:
ldr r6, [r0, r1]
add r3, r6
cmp r1, r2
ble array_sum
mov r0, r3
@ update running total
@ increment index
3 instruction in setup, 5 in loop
uses byte o set instead of element index
any more ideas for how we could speed this up?
Array sum #4
mov r3, 0 @ acc
lsl r1, r1, 2 @ change index -> offset
add r1, r0, r1 @ address of from element
lsl r2, r2, 2 @ change index -> offset
add r2, r0, r2 @ address of to element
array_sum:
ldr r6, [r1], 4 @ load & post-index r1
add r3, r6 @ update running total
cmp r1, r2
ble array_sum
mov r0, r3
5 instructions in setup, only 4 in loop, note the load with post-index
On optimisation…
loops are o en the “hot” part of a program, therefore worth optimising
optimising compilers will do some weird things to get the most optimised code in general, write simple code first, and optimise later (if necessary)
donʼt forget about endianness!
remember, ldr / str still just loads words, not elements (itʼs convenient if your elements are word-sized, but watch out if theyʼre not)
no bounds checking so far!
we werenʼt careful about the AAPCS in our array sum examples earlier
Memory allocation for arrays
static: memory is set aside at compile-time (e.g. in the .data section) pro: allocation is already done when the program starts
con: need to know the size of the array in advance
dynamic: memory is made available (e.g. on the stack) at run-time
pro: can pick the best size at runtime, can re-size con: takes time (while program is running)
weʼll return to this later…
Better(?) array data structures
The arrays weʼve looked at so far are pretty bare-bones; containing just the raw data (at runtime, anyway)
there are several improvements
null-termination
store array size alongside the data can we make a resizeable array?
Where are the array(s)?
Records are for collections of heterogeneous data Wikipedia says:
A record is a collection of fields, possibly of di erent data types, typically in fixed number and sequence. The fields of a record may also be called members, … or elements, though these risk confusion with the elements of a collection.
Overloaded names!
records might also be referred to as structs, tuples, objects
(although those terms can also mean di erent things) context matters
Simple examples
Suppose we are making character data for a computer game:
mana (word)
name (16-byte, null-terminated ASCII array)
Or a basic synthesizer
frequency (word) phase (word) amplitude (word) type (halfword)
Field ordering
whatʼs the di erence here?
the address of the first element is all you need
Records by request
Composite data structures
Things really get interesting and useful when we combine things
records inside records (nested records)
records of arrays arrays of records
Are these things OO objects? No. They have the variables but not the methods.
Memory Allocation
what does it mean to allocate memory for an array?
(silly memory allocate)
Letʼs make a silly memory allocator
Describing the smalloc function
At a minimum, our memory allocator function must
take (as an input parameter) the number of bytes to allocate
return a memory address which points to that many bytes of free memory
there are many other things it could do, but this will do for now
Silly Malloc
@ r0 param is number of bytes to allocate returns a memory addr
.type smalloc, %function
push {r0} @ just save r0 for a sec
ldr r3, =smalloc_value
ldr r0, [r3] @ get latest free memory address, this is the re
pop {r1} @ get back parameter
add r1, r1, r0 @ new “smalloc_value”, just move it along by r
str r1, [r3] @ save new smalloc_value
.size smalloc, . – smalloc
smalloc_value:
.word 0x20000004 @ 4 because it starts AFTER this word!
Real malloc
real operating systems provide a malloc function for dynamically allocating memory
which sortof works like this, but is much better:
allows you to “release” memory a er youʼre done with it keeps track of the memory allocations using metadata
(tries to) gracefully fail when it canʼt give you enough memory
although itʼs implementation dependent
Data structures and argument passing
how would you pass an array or a record as an argument to a function? how about as a return value?
it’s alive
Records as function parameters
imagine an is_alive function which
takes one parameter: a hearthpebble character data structure
returns a (word-sized) 0 if the character is dead, and 1 if the character is alive
zoltan_the_magnificent:
@ initialise hp and mana
.word 100, 200 @ more mana because wizard
Itʼs alive #1: pass character in registers
movs r0, r0 @ cheeky trick!
ble end_is_alive
end_is_alive:
ldr r2, =zoltan_the_magnificent
ldr r0, [r2] @ hp
ldr r1, [r2, 4] @ mana
bl is_alive
Itʼs alive #2: pass character on stack
pops {r0,r1} @ read args off stack
ble end_is_alive
end_is_alive:
@ call (including copy record onto the stack)
@ for larger records this might require a loop
ldr r2, =zoltan_the_magnificent
ldr r0, [r2]
ldr r1, [r2, 4]
push {r0,r1}
bl is_alive
Itʼs alive #3: pass character by reference
@ load hp using the address argument
@ no need to load mana
ldr r1, [r0]
movs r1, r1
ble end_is_alive
end_is_alive:
@ call (pass only address of “zoltan” record)
ldr r0, =zardok
bl is_alive
the records in these examples are all still small enough that they could be passed in registers, but large ones canʼt be
be aware of stack discipline
pass by reference: no copying, but the caller can mess with the “source” data
Further reading
Inside memory management: The choices, tradeo s, and implementations of dynamic allocation
A look at how malloc works on mac Plantz: Chapter 15 – Data Structures
Questions?
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com