RefLang: a language about references/pointers
March 11, 2021
Side Effect
Pure functional programs have no side effects: given the same input a functional program would produce the same output.
Side effect: change the state of the program besides its output, i.e.,it can potentially effect other functions and programs.
Examples:
Reading or writing memory locations
Printing on console
File read and file write
Throwing exceptions
Sending packets on network,
Acquiring mutual exclusion locks
Memory management
Types of memory where a user program stores their data during execution of the program:
data section/static: allocated when the execution starts stack: local variables, function invocation
heap: allocation and deallocation by user programs
memory allocation
memory deallocation
memory access: dereference (get values from memory location via
pointers), reference (associate pointer with memory location)
memory operation
Heap and references
Heap: an abstraction representing area in the memory reserved for dynamic memory allocation
References: locations in the heap
Decisions for Language Designers – Heap
Heap size is finite, programming languages adopt strategies to remove unused portions of memory so that new memory can be allocated.
manual memory management: the language provides a feature (e.g. free in C/C++) to deallocate memory and the programmer is responsible for inserting memory deallocation at appropriate locations in their programs.
automatic memory management: the language does not provide explicit feature for deallocation. Rather, the language implementation is responsible for reclaiming ununsed memory (Java, C#) – garbage collection
Decisions for Language Designers – Heap
How individual memory locations in the heap are treated:
untyped heap: the type of value stored at a memory location is not fixed, can be changed during program execution
typed heap: each memory location has an associated type and it can only contain values of that type, the type of value stored at a memory location doesn’t change during the program’s execution
Decisions for Language Designers – Reference (pointers)
1. Explicit references: references are program objects available to the programmer
2. Implicit references: references only available to implementation of the language
3. Reference arithmetic: references are integers and thus we can apply arithmetic operations
4. Deref and assignment only: get the value stored at that location in the heap, assignment can change the value stored at that location in the heap
Decisions for Language Designers – Examples
C: manual memory management, explicit reference, untyped heap, reference arithmetic
Java: automatic memory management, deref and assignment only, untyped heap, implicit reference
Reflang: manual memory management, deref and assignment, untyped heap, explicit references
RefLang
ref, free, deref, set!
Allocate a memory location
Free previously allocated memory location
Dereference a location reference
Assign a new value to an existing memory location
Examples: $(ref 1) loc: 0
Return: the location at which memory was allocated (next available memory location)
Side effect: assign value 1 to the allocated memory location
Reflang Expressions
ref: This expression evaluates its subexpression to a value, allocates a new memory location to hold this value, and returns a reference value that encapsulates information about the newly allocated memory location.
$ (define loc1 (ref 12))
// stores value 12 at some location in memory, creates a reference value to encapsulate (and remember) that location, and stores that reference value in variable loc1
$ (define loc2 (ref 45))
$ loc1 // check the reference value stored in variable loc1 loc:0
$ loc2 loc:1
Reflang Expressions
deref: This expression evaluates its subexpression to a value. If that evaluation evaluates to a reference value, and that reference value encapsulates a location l, then it retrieves the value stored in Heap at location l.
$ (deref loc1) // gives the value stored at loc1 12
$ (deref loc2) // gives the value stored at loc2 45
$ (+ (deref loc1) (deref loc2)) //access both values and adds them 57
$ (deref 12) // throws Dynamic error
Reflang Expressions
assign: This expression is used to change the value stored on some location in Heap. It will return the newly assigned value.
$ (set! loc1 23) //previous value 12 is overwritten by 23 23
$ (set! loc2 24) //previous value 45 is overwritten by 24 24
$ loc1 // loc1 still has address 0 but value has changed now loc:0
$ loc2 // loc2 still has address 1 but value has changed now loc:1
$ (+ (deref loc1) (deref loc2)) // different value different summation value
47
Reflang Expressions
free: This expression is used to deallocate the reference stored in Heap. $ (free loc1) // deallocates the memory address 0
$ loc1 // variable loc1 still points to same location loc:0
$ (deref loc1) // dereference loc1
Error:null // invalid because memory location has been freed $ (free loc2) // deallocates the memory address stored in loc2
$ (deref loc2) // dereference loc2
Error:null // invalid because memory location has been freed
RefLang: More Examples
$ (free (ref 1)) // delocate the memory location where 1 is stored $ (deref (ref 1)) // deref a memory location defined by ref 1
$ (let ((loc (ref 1))) (deref loc)) $ (let ((loc (ref 1))) (set! loc 2))
$(let ((loc (ref 10))) (let ((loc2 loc)) (+ (set! loc 1) (deref loc2)))) 2
$ (let ((loc (ref 10))) (let ((loc2 loc)) (+ (deref loc2) (set! loc 1)))) 11
Reflang: Grammar
RefLang programming exercises
Write some RefLang programs
RefLang and FuncLang Programming
RefLang and FuncLang Programming
Example scripts:
$ (getFst head) 1
$ (getSnd head) loc: 0
RefLang and FuncLang Programming
RefLang and FuncLang Programming
Reflang Interpreter
Semantics:
values
abstractions added? env, heap … operational semantic rules
Reflang: Extending Values
RefVal ̸= NumVal
prevent from accessing arbitrary memory location no arithmetics
extra metadata
RefLang: Heap Abstraction
Heap : RefVal → Value
In the RefLang interpreter code, heap implementation helps you update the heap
And, evaluator implementation (operational semantics) is about how to evaluate the expressions making use of the heap
Reflang Operational Semantics
valuepenvh=valueeenvh
In an environment env and a heap h, the value of a program is the value of its component expression e in the same environment env and the same heap h.
Expressions that do not affect heap directly or indirectly:
Constant expression: value e env h = (NumVal n) h, where n is a
Number, env is an environment, h is a heap
Variable expression – look up names for values: value (VarExp var) env h = get(env, var) h
Expressions that indirectly affect heap through their subexpressions
Expressions that directly affect heap
Reflang: Expressions that indirectly affect heap
the order in which side effects from one sub-expression are visible to the next sub-expression
Add/subtraction/multiplication/division expression:
a left-to-right order is used in the relation above for side-effect visibility
Reflang: Expressions that directly affect heap
ref, set!, free
deref: read from memory only
Reflang: RefExp
The rule says, to compute the value of RefRxp e under env and current heap location and update the heap to h2
If the value of e under same env and same h returns value v0 in h1
and heap is computed using the updated heap h1 union RefVal l with
the mapping to value v0
N.B. heap is mapping between the reference value to actual value that is stored in that location space.
Reflang: AssignExp
The rule says, to compute the value of AssignExp e0 ei under env under current heap location, (it directly affects heap) the result is the value v0 and updated heap is h3
Below order of subexpression evaluation is important:
If value of e1 is evaluated under env and h and you get a value v0 and updated heap h1
and then value of e0 is evaluated under env and h1 and it evaluates to a RefVal l and modify the heap to h2
To compute h3: add the pair (l → v0) i.e., store v0 in l and delete previously stored value (the underscore) from l in h2.
Reflang: FreeExp
The rule says, to compute the value of FreeExp e under env and current heap location h, the result is unit value and updated heap is h2
If value of e under env and h is evaluated and result is a RefVal l with updated h1
Note, if l is not under the domain of h1, you can throw dynamic error
To compute h2: h2 becomes h1 and mapping from l to some value
(represented by underscore) is deleted
Reflang: DerefExp
The rule says, to compute DerefExp e under env and heap h (it directly affects heap), the result will return v in updated h1
As evaluation of e under env and heap h may modify the value of heap, hence it is updated to h1 and l is a RefVal
Here, mapping of l to v belongs to the heap h1
RefLang Implementation: Heap and Evaluator
See RefLang interpreter Code