程序代写代做 data structure Haskell jvm Java c++ Types & Type Systems

Types & Type Systems
CSE 216 – Programming Abstractions Department of Computer Science, Stony Brook University Dr. Ritwik Banerjee
© 2019 Ritwik Banerjee

Undefined programs
• Recall that using 𝜆-calculus, we were able to construct programs like 𝜆𝑥. 𝑥 𝑦 that have no defined semantics, and as such, are irreducible.
• Wehavenowaytorestrictthefunctionseither,andcouldendupwith meaningless programs because of this. For example, consider
𝜆𝑓.𝜆𝑥.𝑓 𝑥 𝑥
– This is a well defined program if the argument given to it is a function that returns another function. However, if we give it a function that returns a non-function value
𝜆𝑓.𝜆𝑥.𝑓 𝑥 𝑥 𝜆𝑦.𝑦
then the program reaches an erroneous ‘stuck’ state.
– Currently, the semantics don’t allow us to say something like 𝑓 can only be of the form 𝑓 ∷= 𝜆𝑥.𝜆𝑦.𝑥 𝑦.
© 2019 Ritwik Banerjee

• Inprogramming,asin𝜆-calculus,aninvariantissimplya condition that always remains true. For example,
– a loop-invariant is a condition that is true at the beginning as well as the end of each iteration of the loop; a for-loop may have an inequality as its invariant, such as a condition x ≤ 10.
– in the 𝜆-calculus program 𝜆𝑛. +*, the following condition is an invariant: 𝑛 must be a non-zero number.
• Designconsiderations:
1. How to write an invariant? That is, what is the ‘language’
of invariants?
2. Can (or should) invariants be inferred from a program, or must they be provided by the programmer?
3. Can (or should) an invariant be checked statically (i.e., before the program is run) or dynamically (i.e., at runtime)?
Invariants

Invariants
Consider this Java code:
• •
It takes a boolean expression and raises an error if the expression evaluates to false.
The ‘language’ of invariants is the same as the host language (in this case, Java). [design consideration 1]
Programmer must provide this invariant. [design consideration 2] Invariant checked at runtime. [design consideration 3]
Again, the ‘language’ is the same as the host language. Programmer must provide this invariant.
Invariant checked at compile-time.
But keep in mind that a single language may perform some type checks statically and other type checks dynamically.
We are using a built-in keyword “assert”.
• •
• • • •
There are other invariants as well: the type of the parameter must be double.
© 2019 Ritwik Banerjee

A data structure is a description of a set of data and its internal organization such that the set is considered as a single entity.
An abstract data type is a mathematical model of a data structure and its operations.
• The organization is meant to facilitate certain operations to be performed efficiently.
• E.g., a binary search tree allows logarithmic search
• We can think of an abstract data type as an interface, and a data structure as its concrete implementation.
Data type | Data structure
© 2019 Ritwik Banerjee

Data types
There are multiple ways in which we can formally think of data types:
• We can implicitly picture values from a domain. This is the denotational semantics approach.
• We can think of a type to be defined in terms of the internal structure of the data, where complex structures are described in terms of simpler constituents. This is the structural semantics approach.
• Along with the data, we also define a collection of operations that can be applied to objects of that type.
– In Java, for example, these are defined (but not necessarily implemented) in an abstract class or interface.
© 2019 Ritwik Banerjee

Data type as an abstraction
• At the lowest level, computers are, of course, un-typed systems. Everything at the machine- level is binary.
• With each layer of abstraction, it becomes very helpful to think of a ‘value’ having a type:
– integer: -3, -200, 5, 972
– float: -5.38, 2.7182, 3.224e3 – string: “CSE 216”
• In light of these simple examples, we can think of data types as
– a set of concrete objects with some specific properties, or
– a set of values that can be concretely constructed and represented, or
– a set of values together with a collection of operations that can be performed on the elements of that set.
A Java programmer’s way of thinking:
• these are ‘instances’ of the corresponding ‘classes’
… and these define the ‘interfaces’.
© 2019 Ritwik Banerjee

Data type as an abstraction
• Whenviewingthetypeofdataasasetofpropertiesoroperations,an interesting idea is that of duck typing. It is based on an idea from abductive reasoning:
If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.
• Ducktypingisessentiallyanapproachthatchecksthetypeofdataat runtime by looking at the runtime ‘behavior’ of the data.
© 2019 Ritwik Banerjee

Duck Typing
Any object can be used in any context, as long as its runtime behavior allows the operations.
class Duck:
def fly(self):
print(“Duck flying”)
class Airplane: def fly(self):
print(“Airplane flying”)
class Whale:
def swim(self):
print(“Whale swimming”) def lift_off(entity):
entity.fly()
duck = Duck()
airplane = Airplane()
whale = Whale()
lift_off(duck) # Duck flying
lift_off(airplane) # Airplane flying
lift_off(whale) # Error: ‘Whale’ object has no attribute ‘fly’
Python example from https://en.wikipedia.org/wiki/Duck_typing
© 2019 Ritwik Banerjee

Data type as “expressiveness”
• Thepurposeofcomputerprogramsistoperformcomputationsondata(orin other words, to process information).
• Information,ofcourse,comesinvariousforms.
• Therefore,thesetofdatatypesavailableinaprogramminglanguageisa good measure of how sophisticated the language is, in terms of its ease of information processing.
• Amore“expressive”languagecandealwithcomplextypesofinformation more easily.
• Alanguagethatallowstheprogrammertodefinenewdatatypeseasilycan become dynamically expressive as the need arises. Therefore, the ability to easily define new data types is an important component of expressiveness in a programming language.
© 2019 Ritwik Banerjee

Data Types as Descriptors
What can have a type?
• Thetypeofaconstantliteral(i.e.,a concrete value) is defined by which basic set it belongs to.
– E.g., ‘a’ is a character; 4.22 is float or double; 1 is an integer
• Thetypeofanexpressionisthetypeof the value it evaluates to.
• Thetypeofavariableisdefinedbythe type of the value/expression it refers to.
• Insomelanguages,subroutines,classes, and modules have their own types.
Everything “has a” type
• Wecanthinkconversely:thetypeofa construct (constant, variable, expression, etc.) is its attribute.
• Thisway,weviewthetypeofan object as a descriptor of that object.
• Thatis,theobject“hasa”datatype.
• Dependingonwhetheralanguageis statically and/or dynamically typed, the ‘type’ attribute is assigned* at compile-time and/or runtime.
*We will sometimes use the verb ‘assign’ when talking about data types. To make sense of such statements, you should keep in mind this idea of the data type as a descriptor.
© 2019 Ritwik Banerjee

Typed and untyped languages
• Programminglanguagesareoftenclassifiedaccordingtosomeofthemajor
programming paradigms – procedural, functional, and object oriented.
• Withineachparadigm,somelanguagesaretypedandothersareuntyped.
• Withincomputerscience,“untyped”reallymeansdynamicallytyped.
– That is, a variable or an expression is assigned the type of the corresponding data
(i.e., the “value” denoted by the variable or the expression) at runtime.
• Similarly, “typed” typically means statically typed.
• Withinobjectorientedlanguages,C++istypedandSmalltalkisuntyped.
© 2019 Ritwik Banerjee

Type system
• Atypesystemofaprogramminglanguageisasetofrulesthatassignsa data type to the constructs of a program in that language.
• Itprovidesasetofrulesfor(a)typeequivalence,(b)typecompatibility, and (c) type inference.*
• Itprovidesthe(sometimesimplicit)contextfortheoperationsonadatatype.
– Example 1: the binary operator “+” means concatenation if the arguments are
strings, but arithmetic addition if the arguments are numeric.
– Example 2: the unary operator “<<” in C++ is a 1-bit left-shift if the argument is an integer, but if the argument is an output stream, it means writing to the stream. – This is called operator overloading. *Before understanding these terms, though, we will need to first understand how types are checked in various languages. Thus, the next few slides divert to type checking and type errors. © 2019 Ritwik Banerjee Advantages of a type system • Documentation/legibility–typedlanguagesareeasiertoreadand understand since the code itself provides partial documentation of what a variable actually means. • Safety–typedlanguagesprovideearly(compile-time)detectionofsome programming errors, since a type system provides checks for type- incompatible operations. • Efficiency–typedlanguagescanpreciselydescribethememorylayoutof all variables, since every ‘instance’ of an ‘object’ of a certain type will occupy the same amount of space. – Except for dynamically resizing objects like a list. But even then, we at least know how much memory each ‘cell’ of the list will occupy. • Abstraction–typedlanguagesforceustobemoredisciplined programmers. This is especially helpful in the context of large-scale software development. © 2019 Ritwik Banerjee Type errors and type safety • Atypeerrorisaprogramerrorthatresultsfromtheincompatibleuseof differing data types in a program’s construct int n = 2.55; • Toprevent(oratleastdiscourage)typeerrors,aprogramminglanguage puts in rules for type safety. – Type safety contributes to a program’s correctness. – But keep in mind that it does not guarantee complete correctness. – Even if all operations in a program are type safe, there may still be bugs. – E.g., division of one number by another is type safe, but division by zero is unsafe unless the programmer explicitly handles that situation in some other manner. © 2019 Ritwik Banerjee Type checking • Typecheckingistheprocessofverifyingandenforcingtherulesoftype safety in a program. • Thismaybedoneatcompile-time,calledstatictyping(andthelanguageis called a statically typed language). • Or,itmaybedoneatruntime,whichisknownasdynamictyping(andthe language is called a dynamically typed language). • Anotherwaytodistinguishbetweenthetypecheckinginalanguageisbased on how strongly it enforces the conversion of one data type to another. – If a language generally only allows automatic type conversions that do not lose information, it is called a strongly typed language. – Otherwise, it is called weakly typed. © 2019 Ritwik Banerjee Type checking • Javaisstronglytyped,andusesamixofstaticanddynamictypechecking: String aString = 1; // static type check by javac int anInt = 10.0; // static type check by javac Phone phone = (Phone) o; // dynamic type check by JVM • Pythonisstronglytyped,andusesdynamictypechecking: astring = "2" anInt = 10 result = anInt + astring ← runtime type check and type error • Perlisweaklytyped,andusesdynamictypechecking: $a = "2" $b = 10 $a + $b // no error; implicit type conversion © 2019 Ritwik Banerjee Type checking Each approach to type checking has its dis/advantages: • Strongstatictypechecking + type errors are caught early at compile time − verbose code • Strongdynamictypechecking + quick prototyping with lesser ‘amount’ of code − type errors are caught only at runtime • Weakdynamictypechecking + least verbose code writing − type errors are often not caught even at runtime − unintended program behavior may occur due to implicit type conversion at runtime © 2019 Ritwik Banerjee Type equivalence • Aswementionedearlier,atypesystemprovidesasetofrulesfor(a)type equivalence, (b) type compatibility, and (c) type inference. • Atypecheckingsystemusuallyhasrulesoftheform: if two expressions have equivalent type then return that type else return TYPE_ERROR Type equivalence asks the key question needed for the above rule: When do two given expressions have equivalent types? There are two possible approaches: 1. Name equivalence: two types are equal if and only if they have the same constructor expression (i.e., they are bound to the same name) 2. Structural equivalence: two types are equivalent if and only if they have the same “structure”. © 2019 Ritwik Banerjee Type equivalence type student = record name, address : string age : integer type school = record name, address : string age : integer x : student; y : school; x = y; Name equivalence – If this hypothetical language uses name equivalence, the last line will lead to a type error. – Most modern languages (e.g., Java, C#) use name equivalence because they assume that – if a programmer has gone through the trouble of repeatedly defining the same structure under different names, – then s/he probably wants these names to represent different types. © 2019 Ritwik Banerjee struct { int a, b; } struct { int a; int b; } Type equivalence Structural equivalence – The exact definition of structural equivalence is a bit murky, and varies from one language to another. – Peoplehavedifferedoverquestionslike“whatreallyisa structure”, and “when should two structures be considered equivalent”. – The format, or course, doesn’t matter. The two code bodies on the top are equivalent types. – But what about the order of the components in a structure? – This depends on the language. ML, for example, treats them as equivalent types. struct { int a, b; } struct { int b, a; } © 2019 Ritwik Banerjee Type equivalence type student = record name, address : string age : integer type school = record name, address : string age : integer x : student; y : school; x = y; • Structuralequivalenceisasimplein theory, but things get complicated when we get recursive or pointer-based types. – It is, in some sense, a low-level (i.e., un- abstract) implementation-oriented way of distinguishing types. • Inourhypotheticallanguageexample, both student and school have the same fields, and the fields have the same types. – But the programmer clearly would like to treat them as two different types, even though they are structurally identical. © 2019 Ritwik Banerjee Name equivalence • Itissometimesagoodideatointroducesynonymousnames,e.g.,forbetter readability of programs: TYPE new_type = old_type; (* Modula-2 *) TYPE human = person; TYPE item_count = integer; – This is known as type aliasing, and the new type is called an alias of the old type. • Nameequivalencecanbe 1. strict (aliased types are considered to be equivalent) or 2. loose (aliased types are considered to be distinct). © 2019 Ritwik Banerjee Name equivalence • In this Modula-2 code, stack is meant to be an abstraction that allows for the creation of a stack of any desired type (in this case, integer). • If alias types were not considered equivalent, a programmer would have to replace every occurrence of stack_element with integer. • The language Modula-2 uses loose name equivalence to avoid this problem. – But this is probably better designed using generic types (e.g., in Java and C#). • Many modern languages use a ‘middle-ground’ approach between loose and strict name equivalence to indicate whether an alias represents a derived type. TYPE stack_element = integer; MODULE stack; IMPORT stack_element; EXPORT push, pop; ... PROCEDURE push(elem : stack_element); ... PROCEDURE pop() : stack_element; © 2019 Ritwik Banerjee Name equivalence • In other scenarios, aliased types should not be treated the same. • Using derived types (a.k.a. subtypes) resolves these issues. This brings forth the concept of a type hierarchy. • A subtype is type compatible with its parent type. This way, we have a one-sided “is-a” relation instead of complete type equivalence: – subtype “is-a” parent type (every celsius_temp is-a real number) – but parent is not a subtype (but not every real number is a celsius_temp) TYPE celsius_temp = REAL; fahrenheit_temp = REAL; VAR c: celsius_temp; VAR f: fahrenheit_temp; ... f := c; (* this should probably not be allowed *) © 2019 Ritwik Banerjee Type conversion In statically typed languages, the values expected in a context must be of a certain type, e.g., x := expression, both sides of the assignment must have the same type. If this expectation is violated, either the programmer needs perform an explicit type conversion (also called a type cast), or it will cause a type error. A. The types are structurally equivalent, but the language uses name equivalence. – In this case, the two structurally equivalent types have the same low-level representation, and the conversion happens implicitly, with no additional code. – We are simply re-interpreting the bits without changing the underlying implementation. This is an implicit non-converting typecast. /* Java example */ /* implicit non-converting cast when Student is a subtype of Person */ Person p = new Student(); © 2019 Ritwik Banerjee Type conversion In statically typed languages, the values expected in a context must be of a certain type, e.g., x := expression, both sides of the assignment must have the same type. If this expectation is violated, either the programmer needs perform an explicit type conversion (also called a type cast), or it will cause a type error. B. The types have different sets of values, but the intersecting values have a common representation. – Additional code must be executed at runtime to make sure that the value does, indeed, lie at the intersection. – This is an explicit non-converting typecast. /* Java example */ /* If the low-level implementation of ‘p’ is identical to the type ‘Student’ */ Student s = (Student) p; © 2019 Ritwik Banerjee Type conversion In statically typed languages, the values expected in a context must be of a certain type, e.g., x := expression, both sides of the assignment must have the same type. If this expectation is violated, either the programmer needs perform an explicit type conversion (also called a type cast), or it will cause a type error. C. The types have different low-level representations. Nonetheless, it is possible to define some obvious correspondence. For example: i. a 32-bit integer can be converted to a a double with no ambiguity. ii. a double can be converted to an integer (by rounding or truncating). – Most languages provide low-level machine instructions for such a conversion. Unlike the previous type conversions, every such conversion is a converting cast. /* Java example */ int k = (int) 3.14; // truncating a floating-point number double d = 3; // converting cast with no loss of precision © 2019 Ritwik Banerjee Type compatibility • Mostlanguagesrequireonlythattypesbe“compatible”,insteadof completely equivalent, e.g., – in an assignment statement, the type of the right-hand side must be compatible with the type of the left-hand side. – in an operation (say, +), the operands must be compatible with some common type that supports the ‘+’ operation (e.g., integers, floats, doubles, strings, or maybe even sets). – in a subroutine call, the types of 1. the arguments passed to the subroutine must be compatible with the types of that subroutine’s formal parameters, and 2. any formal parameters passed back to the caller must be compatible with the types of the corresponding arguments. • Theexactdefinitionoftypecompatibilityvariesalotbetweenlanguages. But the core is the that it poses the following question: When can a value of type A be used when type B is expected? © 2019 Ritwik Banerjee Type coercion • Wheneveralanguageallowsavalueoftype A to be be used when another type B is expected, the language implementation performs an automatic implicit conversion to the expected type B. – There are languages that don’t allow this at all (e.g., Haskell). – If the language allows this, the “how” is language-dependent, and which coercions are performed/allowed is also language-dependent. /* Type coercion in Java */ /* converting cast as coercion */ double d = 3; /* non-converting cast as coercion */ Object o = “type coercion example”; if (o instanceOf String) { } String s = (String) o; © 2019 Ritwik Banerjee