RefLang: a language about references/pointers
RefLang: a language about references/pointers
March 11, 2021
Side Effect
I Pure functional programs have no side effects: given the same input
a functional program would produce the same output.
I Side effect: change the state of the program besides its output,
i.e.,it can potentially effect other functions and programs.
I Examples:
I Reading or writing memory locations
I Printing on console
I File read and file write
I Throwing exceptions
I Sending packets on network,
I Acquiring mutual exclusion locks
Memory management
Types of memory where a user program stores their data during
execution of the program:
I data section/static: allocated when the execution starts
I stack: local variables, function invocation
I heap: allocation and deallocation by user programs
I memory allocation
I memory deallocation
I memory access: dereference (get values from memory location via
pointers), reference (associate pointer with memory location)
I memory operation
Heap and references
I Heap: an abstraction representing area in the memory reserved for
dynamic memory allocation
I References: locations in the heap
Decisions for Language Designers – Heap
Heap size is finite, programming languages adopt strategies to remove
unused portions of memory so that new memory can be allocated.
I manual memory management: the language provides a feature (e.g.
free in C/C++) to deallocate memory and the programmer is
responsible for inserting memory deallocation at appropriate
locations in their programs.
I automatic memory management: the language does not provide
explicit feature for deallocation. Rather, the language
implementation is responsible for reclaiming ununsed memory (Java,
C#) – garbage collection
Decisions for Language Designers – Heap
How individual memory locations in the heap are treated:
I untyped heap: the type of value stored at a memory location is not
fixed, can be changed during program execution
I typed heap: each memory location has an associated type and it can
only contain values of that type, the type of value stored at a
memory location doesn’t change during the program’s execution
Decisions for Language Designers – Reference (pointers)
1. Explicit references: references are program objects available to the
programmer
2. Implicit references: references only available to implementation of
the language
3. Reference arithmetic: references are integers and thus we can apply
arithmetic operations
4. Deref and assignment only: get the value stored at that location in
the heap, assignment can change the value stored at that location in
the heap
Decisions for Language Designers – Examples
I C: manual memory management, explicit reference, untyped heap,
reference arithmetic
I Java: automatic memory management, deref and assignment only,
untyped heap, implicit reference
I Reflang: manual memory management, deref and assignment,
untyped heap, explicit references
RefLang
ref, free, deref, set!
I Allocate a memory location
I Free previously allocated memory location
I Dereference a location reference
I Assign a new value to an existing memory location
Examples:
$(ref 1)
loc: 0
I Return: the location at which memory was allocated (next available
memory location)
I Side effect: assign value 1 to the allocated memory location
Reflang Expressions
ref: This expression evaluates its subexpression to a value, allocates a new
memory location to hold this value, and returns a reference value that
encapsulates information about the newly allocated memory location.
$ (define loc1 (ref 12))
// stores value 12 at some location in memory, creates a reference value
to encapsulate (and remember) that location, and stores that reference
value in variable loc1
$ (define loc2 (ref 45))
$ loc1 // check the reference value stored in variable loc1
loc:0
$ loc2
loc:1
Reflang Expressions
deref: This expression evaluates its subexpression to a value. If that
evaluation evaluates to a reference value, and that reference value
encapsulates a location l, then it retrieves the value stored in Heap at
location l.
$ (deref loc1) // gives the value stored at loc1
12
$ (deref loc2) // gives the value stored at loc2
45
$ (+ (deref loc1) (deref loc2)) //access both values and adds them
57
$ (deref 12) // throws Dynamic error
Reflang Expressions
assign: This expression is used to change the value stored on some
location in Heap. It will return the newly assigned value.
$ (set! loc1 23) //previous value 12 is overwritten by 23
23
$ (set! loc2 24) //previous value 45 is overwritten by 24
24
$ loc1 // loc1 still has address 0 but value has changed now
loc:0
$ loc2 // loc2 still has address 1 but value has changed now
loc:1
$ (+ (deref loc1) (deref loc2)) // different value different summation
value
47
Reflang Expressions
free: This expression is used to deallocate the reference stored in Heap.
$ (free loc1) // deallocates the memory address 0
$ loc1 // variable loc1 still points to same location loc:0
$ (deref loc1) // dereference loc1
Error:null // invalid because memory location has been freed
$ (free loc2) // deallocates the memory address stored in loc2
$ (deref loc2) // dereference loc2
Error:null // invalid because memory location has been freed
RefLang: More Examples
$ (free (ref 1)) // delocate the memory location where 1 is stored
$ (deref (ref 1)) // deref a memory location defined by ref 1
$ (let ((loc (ref 1))) (deref loc))
$ (let ((loc (ref 1))) (set! loc 2))
$(let ((loc (ref 10))) (let ((loc2 loc)) (+ (set! loc 1) (deref loc2))))
2
$ (let ((loc (ref 10))) (let ((loc2 loc)) (+ (deref loc2) (set! loc 1))))
11
Reflang: Grammar
RefLang programming exercises
Write some RefLang programs
RefLang and FuncLang Programming
RefLang and FuncLang Programming
Example scripts:
$ (getFst head)
1
$ (getSnd head)
loc: 0
RefLang and FuncLang Programming
RefLang and FuncLang Programming
Reflang Interpreter
Semantics:
I values
I abstractions added? env, heap …
I operational semantic rules
Reflang: Extending Values
I RefVal 6= NumVal
I prevent from accessing arbitrary memory location
I no arithmetics
I extra metadata
RefLang: Heap Abstraction
Heap : RefVal → Value
I In the RefLang interpreter code, heap implementation helps you
update the heap
I And, evaluator implementation (operational semantics) is about how
to evaluate the expressions making use of the heap
Reflang Operational Semantics
I value p env h = value e env h
In an environment env and a heap h, the value of a program is the
value of its component expression e in the same environment env
and the same heap h.
I Expressions that do not affect heap directly or indirectly:
I Constant expression: value e env h = (NumVal n) h, where n is a
Number, env is an environment, h is a heap
I Variable expression – look up names for values:
value (VarExp var) env h = get(env, var) h
I Expressions that indirectly affect heap through their subexpressions
I Expressions that directly affect heap
Reflang: Expressions that indirectly affect heap
I the order in which side effects from one sub-expression are visible to
the next sub-expression
I Add/subtraction/multiplication/division expression:
a left-to-right order is used in the relation above for side-effect
visibility
Reflang: Expressions that directly affect heap
I ref, set!, free
I deref: read from memory only
Reflang: RefExp
I The rule says, to compute the value of RefRxp e under env and
current heap location and update the heap to h2
I If the value of e under same env and same h returns value v0 in h1
I and heap is computed using the updated heap h1 union RefVal l with
the mapping to value v0
N.B. heap is mapping between the reference value to actual value
that is stored in that location space.
Reflang: AssignExp
I The rule says, to compute the value of AssignExp e0 ei under env
under current heap location, (it directly affects heap) the result is
the value v0 and updated heap is h3
Below order of subexpression evaluation is important:
I If value of e1 is evaluated under env and h and you get a value v0
and updated heap h1
I and then value of e0 is evaluated under env and h1 and it evaluates
to a RefVal l and modify the heap to h2
I To compute h3: add the pair (l → v0) i.e., store v0 in l and delete
previously stored value (the underscore) from l in h2.
Reflang: FreeExp
I The rule says, to compute the value of FreeExp e under env and
current heap location h, the result is unit value and updated heap is
h2
I If value of e under env and h is evaluated and result is a RefVal l
with updated h1
I Note, if l is not under the domain of h1, you can throw dynamic error
I To compute h2: h2 becomes h1 and mapping from l to some value
(represented by underscore) is deleted
Reflang: DerefExp
I The rule says, to compute DerefExp e under env and heap h (it
directly affects heap), the result will return v in updated h1
I As evaluation of e under env and heap h may modify the value of
heap, hence it is updated to h1 and l is a RefVal
I Here, mapping of l to v belongs to the heap h1
RefLang Implementation: Heap and Evaluator
See RefLang interpreter Code