PowerPoint Presentation
© M. Winter
COSC 2P05 – Programming languages
4.*
An enumeration type is one in which all of the possible values are listed in the definition. The enumeration constants are typically implicitly assigned integer values.
enum days {Mon, Tue, Wed, Thu, Fri, Sat, Sun};
Design issues:
Is an enumeration constant allowed to appear in more that one type definition?
Are enumeration values coerced to integer?
Are any other type coerced to an enumeration type?
Examples in C++:
myDay = Mon; myDay = 4; Not allowed
myDay++;
Enumeration Types
allowed
© M. Winter
COSC 2P05 – Programming languages
4.*
C++
A constant can only appear in one enumeration type.
Enumeration values are coerced to int.
No type is coerced to an enumeration type.
Ada
Constants are allowed to appear in more than one enumeration type.
No type is coercion either way.
Java
Enumeration types are classes, i.e., they can have instance data fields, constructors, and methods.
No type is coercion either way.
C#
Like C++ but no coercion either way.
Enumeration types increase readability and reliability.
© M. Winter
COSC 2P05 – Programming languages
4.*
A subrange type is a contiguous subsequence of an ordinal type.
Ada
Subtypes are not new types; rather, they are new names for possibly restricted versions of existing types.
type Days is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
subtype Weekdays is Days range Mon…Fri;
Day1 : Days;
Day2 : Weekdays;
…
Day2 := Day1; potential run-time error
Subrange types increase readability.
Subrange Types
© M. Winter
COSC 2P05 – Programming languages
4.*
An array is a homogeneous aggregate of data elements in which an individual element is identified by its position in the aggregate.
In many languages all elements of an array are required to be of the same type.
Design issues:
What types are legal for subscripts?
Are subscripting expressions in element references range checked?
When are subscript ranges bound?
When does array allocation take place?
Are jagged or rectangular multidimensional arrays allowed?
Can arrays be initialized when they have storage allocated?
What kind of slices are allowed?
Array Types
© M. Winter
COSC 2P05 – Programming languages
4.*
Arrays are sometimes called finite mappings because they map an index to a value.
Accessing elements a[i] vs a(i)
Most languages other than Fortran and Ada use square brackets.
Type of subscripts
Often a subrange of integers, sometimes of the form 0…n
Ada allows any ordinal type as subscripts
Most contemporary languages do range checks at run-time. Perl returns undef but no error is reported.
Perl allows negative subscripts, in which case the subscript value is an offset from the end of the array.
Arrays and Indices
© M. Winter
COSC 2P05 – Programming languages
4.*
There are 5 categories of arrays, based on the binding to subscript ranges, the binding to storage, and from where the storage is allocated.
A static array is one in which the subscript ranges are statically bound and storage allocation is static.
A fixed stack-dynamic array is one in which the subscript ranges are statically bound, but the allocation is done at declaration elaboration time during execution.
A stack-dynamic array is one in which both the subscript ranges and the storage allocation are dynamically bound at elaboration time.
A fixed heap-dynamic array is similar to a fixed stack-dynamic array, in that the subscript ranges and the storage binding are both fixed after storage is allocated. However, both bindings are done during usingthe heap.
A heap-dynamic array is one in which the binding of subscript ranges and storage allocation is dynamic and can change any number of times during the array’s lifetime.
Array Categories
© M. Winter
COSC 2P05 – Programming languages
4.*
Arrays in C and C++ declared using the static modifier are static.
Arrays in C and C++ declared without the static modifier are fixed stack-dynamic.
Arrays in Ada can be stack dynamic.
In C and C++ arrays can be treated as pointers. Using malloc and free creates fixed heap-dynamic arrays.
In Java all arrays are fixed heap-dynamic.
Perl has heap-dynamic arrays.
© M. Winter
COSC 2P05 – Programming languages
4.*
Examples:
Fortran
Integer, Dimension(3) :: List = (/0, 5, 5/)
C, C++, Java, and C#
int list[] = {4, 5, 7, 83};
Ada
Bunch : array (1..5) of Integer :=
(1 => 17, 3 => 34, others => 0);
Array Initialization
© M. Winter
COSC 2P05 – Programming languages
4.*
C based languages do not provide any array operations.
except library methods
Ada allow array assignment and catenation.
Fortran includes a number of elemental array operations. These operations perform a certain operation on the elements pairwise.
APL is the most powerful array-processing language ever.
+ overloaded for numbers, vectors, and matrices
V reverses the elements of V
M reverses the columns of M
M reverses the rows of M
M transposes M (its rows become its columns and vice versa)
÷M inverts M
Many more…
Array Operations
© M. Winter
COSC 2P05 – Programming languages
4.*
A rectangular array is a multidimensional array in which the rows have the same length, and the columns have the same length.
A jagged array is one in which the length of the rows need not to be the same.
C, C++, and Java support jagged array but not rectangular arrays.
Typical access: myArray[3][7]
Fortran, Ada, and C# support rectangular arrays.
Typical access: myArray[3,7]
Rectangular and Jagged Arrays
© M. Winter
COSC 2P05 – Programming languages
4.*
A slice of an array is some substructure of that array.
Examples:
Python
vector[3:6] (subrange)
mat[1] (one row)
Fortran (like Python) plus
Mat(:,2) (one column)
Slices
© M. Winter
COSC 2P05 – Programming languages
4.*
Implementation of Array Types
address(list[k]) = base_address +
(k – lower_bound) * element_size
base_address
lower_bound
element_size
© M. Winter
COSC 2P05 – Programming languages
4.*
address(a[i, j])
= base_address
+ (((i-row_lb) * n) + (j-col_lb)) * element_size
= base_address
– (((row_lb * n) + col_lb) * element_size)
+ ((i * n) + j) * element_size
row_lb, n
base_address
col_lb
element_size
© M. Winter
COSC 2P05 – Programming languages
4.*
A record is an aggregate of data elements in which individual elements are identified by names and access through offsets from the beginning of the structure.
Records have been part of all of the most popular programming languages, except pre-90 versions of Fortran, since the early 1960s, when they were introduced by COBOL.
In C, C++, and C#, records are supported with the struct data type.
Design issues:
What is the syntactic form of references to fields?
Are elliptical references allowed?
Record Types
© M. Winter
COSC 2P05 – Programming languages
4.*
COBOL
01 EMPLOYEE-RECORD.
02 EMPLOYEE-NAME.
05 FIRST PICTURE IS X(20).
05 Middle PICTURE IS X(10).
05 LAST PICTURE IS X(20).
02 HOURLY-RATE PICTURE IS 99V99.
Java, C++
In Java and C#, records can be defined as data classes, with nested records defined as nested classes. Data members of such classes serve as the record fields.
Definition of Records
© M. Winter
COSC 2P05 – Programming languages
4.*
Ada
type Employee_Name_Type is record
First : String(1..20);
Middle : String(1..10);
Last : String(1..20);
end record;
type Employee_Record_Type is record
Employee_Name : Employee_Name_Type;
Hourly_Rate : Float;
end record;
© M. Winter
COSC 2P05 – Programming languages
4.*
COBOL field references have the form
field_name OF record_name_1 OF . . . OF record_name_n
MIDDLE OF EMPLOYEE-NAME OF EMPLOYEE-RECORD
COBOL allows elliptical references to record fields. In an elliptical reference, the field is named, but any or all of the enclosing record names can be omitted, as long as the resulting reference is unambiguous in the referencing environment.
FIRST, FIRST OF EMPLOYEE-NAME, FIRST OF EMPLOYEE-RECORD
Ada, C, C++
Employee_Record.Employee_Name.Middle
References to Record Fields
© M. Winter
COSC 2P05 – Programming languages
4.*
Implementation of Record Types
© M. Winter
COSC 2P05 – Programming languages
4.*
A union is a type whose variables may store different type values at different times during program execution.
Design issues:
Are union types typed checked?
In C and C++, the union construct is used to specify union structures. The unions in these languages are called free unions, because programmers are allowed complete freedom from type checking in their use.
No language support for type checking.
Union Types
© M. Winter
COSC 2P05 – Programming languages
4.*
union flexType {
int intEl;
float floatEl;
};
union flexType el1;
float x;
…
el1.intEl = 27;
x = el1.floatEl; not typed checked
Type checking of unions requires that each union construct include a type
indicator. Such an indicator is called a tag, or discriminant, and a union with a discriminant is called a discriminated union.
© M. Winter
COSC 2P05 – Programming languages
4.*
F#
type intReal =
| IntValue of int
| RealValue of float;;
let ir1 = IntValue 17;;
let ir2 = RealValue 3.4;;
let printType value =
match value with
| IntValue value -> printfn “It is an integer”
| RealValue value -> printfn “It is a float”;;
© M. Winter
COSC 2P05 – Programming languages
4.*
Ada
Type Shape is (Circle, Triangle, Rectangle);
Type Colors is (Red, Green, Blue);
Type Figure (Form : Shape) is
record
Filled : Boolean;
Color : Colors;
case Form is
when Circle =>
Diameter : Float;
when Triangle =>
Left_Side : Integer;
Right_Side : Integer;
Angle : Float;
when Rectangle =>
Side_1 : Integer
Side_2 : Integer;
end case;
end record;
© M. Winter
COSC 2P05 – Programming languages
4.*
A pointer type is one in which the variables have a range of values that consists of memory addresses and a special value, nil. The value nil is not a valid address and is used to indicate that a pointer cannot currently be used to reference a memory cell.
A value of reference type is a reference to another value.
Design issues:
What are the scope and lifetime of a pointer variable?
What is the lifetime of a heap-dynamic variable (the value a pointer
references)?
Are pointers restricted as to the type of value to which they can point?
Are pointers used for dynamic storage management, indirect addressing, or both?
Should the language support pointer types, reference types, or both?
Pointer and Reference Types
© M. Winter
COSC 2P05 – Programming languages
4.*
Languages that provide a pointer type usually include two fundamental pointer operations: assignment and dereferencing.
Dereferencing of pointers can be either explicit or implicit.
In C++, it is explicitly specified with the asterisk (*) as a prefix unary operator.
j = *ptr
Pointer Operations
© M. Winter
COSC 2P05 – Programming languages
4.*
Dangling pointers
int * arrayPtr1;
int * arrayPtr2 = new int[100];
arrayPtr1 = arrayPtr2;
delete [] arrayPtr2;
// Now, arrayPtr1 is dangling, because the heap storage
// to which it was pointing has been deallocated.
Lost Heap-Dynamic Variables
A lost heap-dynamic variable is an allocated heap-dynamic variable that is no longer accessible to the user program (garbage).
Pointer Problems
© M. Winter
COSC 2P05 – Programming languages
4.*
A reference type variable is similar to a pointer, with one important and fundamental difference: A pointer refers to an address in memory, while a reference refers to an object or a value in memory.
It is not sensible to do arithmetic on references.
C++
int result = 0;
int &ref_result = result;
. . .
ref_result = 100;
Java reference variables can be assigned to refer to different class instances; they are not constants. All Java class instances are referenced by reference variables.
Reference Types
© M. Winter
COSC 2P05 – Programming languages
4.*
Problem of dangling pointers.
Problem of garbage.
Problems of heap management.
Allocation
Garbage detection
Deallocation
Pointers have been compared with the goto.
“Their introduction into high-level languages has been a step backward from which we may never recover.”
Hoare (1973)
Pointers are essential in some kinds of programming applications.
The references of Java and C# provide some of the flexibility and the capabilities of pointers, without the hazards
Evaluation
© M. Winter
COSC 2P05 – Programming languages
4.*
Type checking is the activity of ensuring that the operands of an operator
are of compatible types.
Any automatic conversion is called a coercion.
A type error is the application of an operator to an operand of an inappropriate type.
If all bindings of variables to types are static in a language, then type checking can nearly always be done statically.
Dynamic type binding requires type checking at run time, which is called dynamic type checking.
Type Checking
© M. Winter
COSC 2P05 – Programming languages
4.*
A programming language is strongly typed if type errors are always
detected. This requires that the types of all operands can be determined, either at compile time or at run time.
C and C++ are not strongly typed languages because both include union types, which are not type checked.
ML and F# are strongly typed.
Ada is nearly strongly typed, i.e., it allows programmers to breach the type-checking rules by specifically requesting that type checking is suspended for a particular type conversion.
Java and C# are nearly strongly typed.
Strong Typing
© M. Winter
COSC 2P05 – Programming languages
4.*
Type equivalence is a strict form of type compatibility; compatibility
without coercion.
Name type equivalence
Two variables have equivalent types if they are defined either in the same declaration or in declarations that use the same type name.
Structure type equivalence
Two variables have equivalent types if their types have identical structures.
type Celsius = Float; vs type Celsius is new Float;
type Fahrenheit = Float; type Fahrenheit is new Float;
Type Equivalence
© M. Winter
COSC 2P05 – Programming languages
4.*
Expressions are the fundamental means of specifying computations in a programming language.
Design issues:
What are the operator precedence rules?
What are the operator associativity rules?
What is the order of operand evaluation?
Are there restrictions on operand evaluation side effects?
Does the language allow user-defined operator overloading?
What type mixing is allowed in expressions?
Expressions
© M. Winter
COSC 2P05 – Programming languages
4.*
The operator precedence and associativity rules of a language dictate the order of evaluation of its operators.
Precedence
a + b * c
The operator precedence rules for expression evaluation partially
define the order in which the operators of different precedence levels
are evaluated.
Ruby C-Based Languages Ada
Highest ** postfix ++, — **, abs
unary +, – prefix ++, –, unary +, – *,/, mod, rem
*, /, % *, /, % unary +, –
Lowest binary +, – binary +, – binary +, –
Operator Evaluation Order
© M. Winter
COSC 2P05 – Programming languages
4.*
Associativity
a – b + c – d
When an expression contains two adjacent 2 occurrences of operators
with the same level of precedence, the question of which operator is
evaluated first is answered by the associativity rules of the language.
Left or right associativity
Language Associativity Rule
Ruby Left: *, /, +, –
Right: **
C-based languages Left: *, /, %, binary +, binary –
Right: ++, –, unary -, unary +
Ada Left: all except **
Nonassociative: **
© M. Winter
COSC 2P05 – Programming languages
4.*
if-then-else statement vs. conditional expression
if-then-else statement
if (count == 0)
average = 0;
else
average = sum / count;
Conditional expression
average = (count == 0) ? 0 : sum / count;
Conditional Expression
© M. Winter
COSC 2P05 – Programming languages
4.*
Order of evaluation of operands
If neither of the operands of an operator has side effects, then operand evaluation order is irrelevant. Therefore, the only interesting case arises when the evaluation of an operand does have side effects.
A side effect of a function, naturally called a functional side effect, occurs when the function changes either one of its parameters or a global variable.
Example:
int a = 5;
int fun1() { void main() {
a = 17; a = a + fun1();
return 3; } /* end of main */
} /* end of fun1 */
Operand Evaluation Order
© M. Winter
COSC 2P05 – Programming languages
4.*
A program has the property of referential transparency if any two expressions in the program that have the same value can be substituted for one another anywhere in the program, without affecting the action of the program.
The value of a referentially transparent function depends entirely on its parameters.
result1 = (fun(a) + b) / (fun(a) – c);
temp = fun(a);
result2 = (temp + b) / (temp – c);
There are several advantages to referentially transparent programs. The most important of these is that the semantics of such programs is much easier to understand than the semantics of programs that are not referentially transparent. Being referentially transparent makes a function equivalent to a mathematical function, in terms of ease of understanding.
Referential Transparency
© M. Winter
COSC 2P05 – Programming languages
4.*
Arithmetic operators are often used for more than one purpose.
For example, + usually is used to specify integer addition and floating-point addition.
Some languages (Java) also use it for string catenation.
This multiple use of an operator is called operator overloading and is generally thought to be acceptable, as long as neither readability nor reliability suffers.
Problems:
Using the same symbol for two completely unrelated operations is detrimental to readability (& in C++).
Input error may lead to “correct” program due to overloading.
Some languages allow the programmer to further overload operator
Symbols (C++, C#, F#).
Overloaded Operators
© M. Winter
COSC 2P05 – Programming languages
4.*
Type conversions are either narrowing or widening.
A narrowing conversion converts a value to a type that cannot store even approximations of all of the values of the original type.
Narrowing conversions are not always safe; sometimes the magnitude of the converted value is changed in the process.
A widening conversion converts a value to a type that can include at least approximations of all of the values of the original type.
Widening conversions are nearly always safe, meaning that the approximate magnitude of the converted value is maintained.
Can result in reduced accuracy.
Type conversions can be implicit or explicit.
Type Conversions
© M. Winter
COSC 2P05 – Programming languages
4.*
Coercion = implicit type conversion
Java (mixed mode expression)
int a;
float b, c;
…
c = b * a;
Because error detection is reduced when mixed-mode expressions are
allowed, F# and ML do not allow them.
Coercion in Expressions
© M. Winter
COSC 2P05 – Programming languages
4.*
Most languages provide some capability for doing explicit conversions,
both widening and narrowing.
C-based languages (casts)
(int) angle
Potential errors due to failed conversion or coercions of operands in expressions.
Floating-point overflow, underflow, and division by zero are examples of
run-time errors, which are sometimes called exceptions.
Explicit Type Conversion
© M. Winter
COSC 2P05 – Programming languages
4.*
The syntax of the relational operators for equality and inequality differs among some programming languages.
C-based languages use !=
Lua uses ~=
Fortran 95+ uses .NE. or <>
ML and F# use <>
JavaScript == vs ===:
“7” == 7 “7” === 7
yields True yields false
Relational and Boolean Expressions
© M. Winter
COSC 2P05 – Programming languages
4.*
Boolean expressions consist of Boolean variables, Boolean constants, relational expressions, and Boolean operators.
The precedence of the arithmetic, relational, and Boolean operators in the
C-based languages is as follows:
Highest postfix ++, —
unary +, unary -, prefix ++, –, !
*, /, %
binary +, binary –
<, >, <=, >=
=, !=
&&
Lowest ||
Boolean Expressions
© M. Winter
COSC 2P05 – Programming languages
4.*
A short-circuit evaluation of an expression is one in which the result is determined without evaluating all of the operands and/or operators.
(13 * a) * (b / 13 – 1)
(a >= 0) && (b < 10)
Problem with non-short-circuit evaluation of Boolean expressions.
index = 0;
while ((index < listlen) && (list[index] != key))
index = index + 1;
Side effects and short-circuit evaluation.
(a > b) || ((b++) / 3)
Short-Circuit Evaluation