Types
Chapters 7, 8
Data Types
§ Use of types:
§ Provide implicit context for operations
§ C: a + b – integer/floating point addition
§ Pascal: new p – allocate right size
§ C: new my_type()- right size, call right constructor
§ Type checking
§ Type equivalence: when two types are the same
§ Type compatibility: when a type can be used in a context § Type inference: deduce the type from components
§ Type clash: violation of type rules § Strongly typed language
§ prohibits any application of an operation to an object that is not intended to support that operation
2
Type Systems
§ Statically typed language
§ strongly typed
§ at compile time – good performance
§ C, C++, Java
§ C: more strongly typed with each new version
§ Dynamically typed language
§ at run time – ease of programming
§ Lisp, Smalltalk – strongly typed
§ Scripting: Python, Ruby – strongly typed
3
Type Systems
§ Polymorphism (polymorphous = multiform) § code works with multiple types
§ must have common characteristics
§ parametric polymorphism: take a type as parameter
§ explicit parametric polymorphism (generics, C++: templates) appears in statically typed languages
§ implemented at compile time
§ subtype polymorphism: code works with subtypes
§ object-oriented languages
§ combination (subtype + param. polym.) – container classes § List
4
Type Checking
§ Type equivalence
§ Structural equivalence
§ same components put together in the same way § C, Algol-68, Modula-3, ML
§ Name equivalence
§ lexical occurrence
§ each definition is a new type § Java, C#, Pascal
5
Type Checking
§ Structural equivalence: § format should not matter
type a, end;
R1 = record
b : integer
type R1 = record
a : integer;
b : integer;
end;
§ What about order? (most languages consider it equivalent)
type R3 = record
b : integer;
a : integer
end;
6
Type Checking
§ Structural equivalence: problem
type student = record
name, address : string
age : integer
type school = record
name, address : string
age : integer
x : student;
y : school;
…
x := y; — is this an error?
§ compiler says it’s okay
§ programmer most likely says it’s an error
7
Type Checking
§ Name equivalence
§ Distinct definitions mean distinct types
§ If the programmer takes the time to write two type definitions, then they are different types
§ Aliases
type new_type = old_type (* Algol syntax *)
typedef old_type new_type /* C syntax */
§ Are aliases the same or different types? § Different: strict name equivalence
§ Same: loose name equivalence
8
Type Checking
§ Type conversion (cast): explicit conversion
r = (float) n;
§ Type coercion: implicit conversion § very useful
§ weakens type security
§ dynamically typed languages: very important § statically typed languages: wide variety
§ Ada: very little coercion allowed § C++: the most extreme
9
Type Checking
§ Type compatibility
§ more important than equivalence
§ most of the time we need compatibility
§ assignment: RHS compatible with LHS
§ operation: operands types compatible with a common type that supports the operation
§ subroutine call: arguments types compatible with formal parameters types
10
Type Checking
§ Type inference
§ infer expression type from components § int + int => int
§ float + float => float
§ subranges cause complications
type Atype = 0..20; Btype = 10..20; var a : Atype; b : Btype; §Whatisthetypeof a+b ?
11
Type Checking
§ Type inference
§ declarations: type inferred from the context
§ C#: var
var i = 123; // equiv. to: int i = 123;
var map = new Dictionary
/* equiv. to: Dictionary
Dictionary
§ C++: auto
auto reduce =
[](list
for (auto e : L) { s = f(e, s); }
return s;
};
12
Type Checking
§ C++: decltype
§ match the type of an existing expression
template
…
A a; B b;
decltype(a + b) sum;
13
Parametric polymorphism
§ Implicit:
§ Scheme:
(define min (lambda (a b) (if (< a b) a b)))
§ applied to arguments of any type to which it can be applied § disadvantage: checked dynamically
§ Explicit: generics § C++: templates
§ checked statically
§ Generics in object-oriented programming § parametrize entire class
§ containers
§ C++ example
14
Parametric polymorphism
template
class queue {
item items[max_items];
int next_free, next_full, num_items;
public:
queue() : next_free(0),next_full(0),num_items(0){
}
bool enqueue(const item& it) { … }
bool dequeue(item* it) { … }
};
…
queue
queue
15
Arrays
§ Arrays
§ the most important composite data type
§ Homogenous data
§ index: discrete type
§ non-discrete type: associative array, dictionary, map § implemented using hash tables or search trees
§ Semantically:
§ mapping: index type → element type
§ dense – most positions non-zero
§ sparse arrays – stored using linked structures
16
Arrays
§ Slices
17
Arrays
§ Dimensions, Bounds, Allocation
§ static allocation:
§ array with lifetime the entire program § shape known at compile time
§ stack allocation:
§ array with lifetime inside subroutine § shape known at compile time
§ heap / stack allocation
§ dynamically allocated arrays
§ dope vector: holds shape information at run time
§ compiled languages need the number of dimensions
§ shape known at elaboration time: can be allocated on stack § shape changes during execution: allocated on heap
18
Arrays
§ Example: C dynamic local array
void square(int n, double M[n][n]) {
double T[n][n];
for (int i = 0; i < n; i++) { // copy prod. to T
for (int j = 0; j < n; j++) {
double s = 0;
for (int k = 0; k < n; k++)
s += M[i][k] * M[k][j];
T[i][j] = s;
} }
for (int i = 0; i < n; i++) {// copy T back to M
for (int j = 0; j < n; j++)
M[i][j] = T[i][j];
} }
19
Arrays
§ Shape known at elaboration time
§ can be allocated on stack § in the variable-size part
§ Example:
// C99:
void foo (int size) {
double M[size][size];
... }
20
Arrays
§ Memory layout
§ column major order – Fortran
§ row major order – everybody else
21
Arrays
§ Memory layout
§ Contiguous allocation
§ consecutive locations in memory: A[2,4], A[2,5] § consecutive rows adjacent in memory
§ Row pointers
§ consecutive rows anywhere in memory
§ extra memory for pointers
§ rows can have different lengths (ragged array)
§ can construct an array from existing rows without copying
§ C, C++, C# - allow both § Java – only row-pointer
22
Arrays
§ Example: C – array of strings
§ true two- dimensional array
23
Arrays
§ Example: C – array of strings
§ array of pointers
24
Arrays
§ Address calculation
A : array [L1..U1] of array [L2..U2] of array
[L3..U3] of elem_type;
S3 = size of elem_type S2 =(U3 −L3+1)×S3
S1 =(U2 −L2 +1)×S2
address of A[i,j,k] = addressofA
+ (i – L1) × S1 + (j – L2) × S2 + (k – L3) × S3
25
Arrays
§ Faster address calculation address of A[i,j,k]
= addressofA+(i–L1)×S1 +(j–L2)×S2 +(k–L3)×S3
§ Fewer operations
§C = [(L1 ×S1)+(L2 ×S2)+(L3 ×S3)] § C – known at compile time
address of A[i,j,k]
=addressofA + (i×S1)+(j×S2)+(k×S3) − C
26
Sets
§ Set
§ unordered collection of an arbitrary number of distinct
§ Implementation
§ characteristic array – one bit for each value (small base type)
§ efficient operations – bitwise op
§ general implementation: hash tables, trees, etc. § Python, Swift – built-in sets
§ Others use dictionaries, hashes, maps
X = set(['a', 'b', 'c', 'd']) # set constructor
values of a common type
Y = {'c', 'd', 'e', ‘f’} U = X| Y
I = X& Y
D = X- Y
O = X^ Y 'c' in I
# set literal
# union
# intersection
# difference
# sim. diff.
# membership
27
Pointers and Recursive Types
§ Pointer
§ a variable whose value is a reference to some object
§ not needed with a reference model of variables § needed with a value model of variables
§ efficient access to complicated objects
§ Recursive type
§ objects contain references to other objects § can create dynamic data structures
§ Pointer ≠ address
§ pointer = high-level concept; reference to object § address = low-level concept; location in memory § usually, pointers are implemented as addresses
28
Pointers and Recursive Types
§ Reference model example: Tree in Scheme – two types of objects: (1) cons cells (2) atoms
'(#\R (#\X ()()) (#\Y (#\Z ()()) (#\W ()())))
29
Pointers and Recursive Types
§ Value model example: Tree in C struct chr_tree {
struct chr_tree *left, *right; char val;
};
my_ptr = malloc(sizeof(struct chr_tree));
§ C++, Java, C# -- type safe my_ptr = new chr_tree(arg_list);
30
Pointers and Recursive Types
§ Dangling reference
§ live pointer that no longer points to a valid object
§ Example: caused by local variable after subroutine return:
int i = 3;
int *p = &i;
...
void foo(){ int n = 5; p = &n; } ...
cout << *p; // prints 3 foo();
...
cout << *p; // undefined behavior: n is no
longer live
31
Pointers and Recursive Types
§ Dangling reference
§ Example: caused by manual deallocation:
int *p = new int;
*p = 3;
...
cout << *p; // prints 3 delete p;
...
cout << *p; // undefined behavior: *p has been
reclaimed
§ Problem: a dangling reference can write to memory that is part of a different object; it may even interfere with bookkeeping, corrupting the stack or heap
32
Pointers and Recursive Types
§ Garbage collection
§ automatic reclamation of memory
§ slower than manual (delete)
§ difficult to implement
§ eliminates the need to check for dangling references § very convenient for programmers
§ essential for functional languages
§ increasingly popular in imperative languages
§ Java, C#
33
Pointers and Recursive Types
§ Garbage collection
§ Reference counts
§ object no longer useful when no pointers to it exist § store reference count for each object
§ initially set to 1
§ update when assigning pointers
§ update on subroutine return
§ when 0, reclaim object
§ recursive reclamation – initialize pointers to null to avoid following garbage addresses
34
Pointers and Recursive Types
§ Garbage collection
§ Reference counts
§ count ≠ 0 does not necessarily mean useful (circular lists)
35
Pointers and Recursive Types
§ Garbage collection
§ Smart pointers – C++
§ unique_ptr – one object only
§ shared_ptr – implements reference counts
§ weak_ptr – does not affect counts; for, e.g., circular structures
36
Pointers and Recursive Types
§ Garbage collection § Tracing collection
§ object useful if reachable via chain of valid pointers from outside the heap
§ Mark-and-sweep
(1) mark entire heap as “useless”
(2) staring from outside heap, recursively mark as “useful” (3) move “useless” block from heap to free list
§ Step (2) requires a potentially very large stack § Without stack: pointer reversal (next slide)
§ Stop-and-copy: defragmentation
§ use half of heap; copy useful data compactly to the other one
37
38
Lists
§ List: empty or (head + tail)
§ essential in functional and logic programming (recursive) § used also in imperative languages
§ Homogeneous (same type): ML
§ Heterogeneous: Scheme
a
c
b
C
C
C
A
a
A
b
A
c
39
Lists
§ Scheme:
§ '(...) prevents evaluation; also (quote (...))
(+12) ⟹ 3 '(+12) ⟹ '(+12)
(cons 'a '(b))
(car '(a b))
(car '())
(cdr '(a b c))
(cdr '(a))
(cdr '())
(append'(ab)'(cd)) ⟹ '(abcd)
⟹ '(a b) ⟹ a
⟹ error ⟹ '(b c) ⟹ '()
⟹ error
40
Lists
§ List comprehension
§ adapted from traditional math set notation:
{i × i | i ∈ {1, . . . , 10} ∧ i mod 2 = 1} § Example: Python
[i*i for i in range(1, 10) if i % 2 == 1] ⟹ [1, 9, 25, 49, 81]
41