程序代写代做代考 ocaml compiler file system data structure Java flex prolog Haskell cache chain Advanced Programming 2018 – Haskell, Continued

Advanced Programming 2018 – Haskell, Continued

Advanced Programming 2018
Haskell, Continued

Andrzej Filinski
andrzej@di.ku.dk

Department of Computer Science
University of Copenhagen

September 6, 2018

1 / 28

Today’s topics

I Introduction to some more advanced, Haskell-specific features

I Modules
I Type classes
I Laziness

I Equational reasoning

I Functional I/O principles
I List comprehensions

I Useful to know about in own right.

I Provide important background for monads, next time.

2 / 28

Haskell’s module system

I Relatively simple, compared to, e.g., Standard ML or OCaml

I But quite sufficient for most practical purposes
I Especially in conjunction with type classes

I Cover many (but not all) uses of ML’s parameterized modules

I Two main purposes:
I Namespace management

I Using same name for unrelated purposes at different points in big
program

I Abstraction management
I Preventing unwanted exposure of implementation details

I Fundamental concepts: imports and exports.

3 / 28

Standard modules

I All Haskell code is type-checked and executed in context of
some existing definitions of types and values.

I Most common definitions always visible: “standard prelude”.
I Saw several examples last time: pi, (+), map, [], Maybe, …

I Large standard library of further functionality available:
I Utility functions and data structures:

I E.g., formatting, parsing, finite-set operations, …
I Could in principle by reimplemented by ordinary programmer.
I But probably not as competently: don’t re-invent the wheel!

I System interface and control functions:
I E.g., directory listing, exception handling, …
I Implementation relies on special support from compiler and/or

runtime system.
I No way to re-implement from scratch in pure Haskell code.

I Grouped into modules.

4 / 28

Importing from modules

I To use all or parts of a module, must explicitly import from it.

I “import …” declaration(s) must be at very beginning of file.
I Bulk import:

I import System.Directory

I Makes everything from module available.
I Names may clash with own definitions, or other imports.

I Only get error on attempted use of ambiguous name.

I Normally used for “framework” modules, such as parser
combinators

I Selective import
I import System.Directory

(getCurrentDirectory, doesFileExist)

I Only makes explicitly listed names available.
I Remember to enclose any operator names in parentheses.
I Normally preferred if only need a few, unrelated functions from

module in question.

5 / 28

Importing from modules, continued

I Qualified import
I import qualified Data.Set as S

I Like a bulk import, but prefixes all imported names with S.
I S.map :: Ord b => (a -> b) -> S.Set a -> S.Set b

I Avoids clash with (list-based) map from standard prelude

I Warning: top-level interactive loop is a bit special.
I Can refer directly to names from other modules:

> System.Directory.getCurrentDirectory
“/home/andrzej/teaching/ap2018/”

I Will not work in file; need explicit import first.
I Prompt in top-level loop (initially “Prelude>”) indicates which

modules have been imported.
I Can add or remove with :mod [+/-] ModName

6 / 28

Creating your own modules

I Start file containing related definitions with
module ModName (exports) where defs

I ModName is the name of the module.
I Should be the same as source filename (without trailing .hs).

I exports is comma-separated list of names (types and/or values)
to be made available to users (clients) of the module.

I Often more readable to list one name per line.
I Use TypeName(..) to export a datatype together with all its

constructors.
I Omit export list entirely (including parens) to bulk-export

everything defined in module.

I defs should start with any needed import declarations, as usual.

7 / 28

What to export from a module?

I Not specific to Haskell; general principles for API design.

I Export orthogonal set of functions useful to clients, not any
internal “helper” functions you used to define them.

I If you cannot concisely summarize what a function does, it
shouldn’t be exported.

I Arguably, it probably shouldn’t even have been defined in the
first place…

I Unclear and/or complex specifications for internal functions are a
magnet for bugs.

I Do try to formulate specification (including meanings of all
parameters!) in a comment; forces you to consider what the
function should be doing.

I Surprisingly feasible in Haskell, because type of function says
everything about its possible dependencies and interactions with
rest of system.

8 / 28

Example of API considerations

I Suppose we are defining a module for integer-set operations,
with exports:

empty :: IntSet
singleton :: Int -> IntSet
union :: IntSet -> IntSet -> IntSet
member :: Int -> IntSet -> Bool

I For implementing union, may also have defined:

insert :: Int -> IntSet -> IntSet

Should it be exported? Maybe not.

I Client could themselves define nominally equivalent function:

myInsert x s = singleton x `union` s
Should be almost as efficient, uses only core operations.

I If myInsert significantly slower than insert, maybe should
improve performance of union in general.

I E.g., always add elts of smaller set to larger, not vice versa.

9 / 28

Preventing leakage of implementation details

I Suppose we implement IntSet as unsorted, duplicate-free lists.
I Could just make definition in module:

type IntSet = [Int]

But that exposes to clients that an IntSet is actually a list.
I In particular, this could evaluate to False:

singleton 3 `union` singleton 4 ==
singleton 4 `union` singleton 3

I Solution: in implementation, define a new type, equivalent to [Int].

newtype IntSet = IS {unIS :: [Int]}

I Almost same as data with a single constructor.
I Note: did not include deriving Eq in definition!

I Export type IntSet, but not constructor IS, nor projection unIS
I Only use internally in module, to define empty, union, etc.

I Clients can neither create new IntSet values, nor inspect existing
ones, except through exported API functions.

I But then, API should probably also include an equality test.

10 / 28

Overloading in Haskell

I Have already seen (sometimes implicit) examples of restricted
polymorphic functions:

(==) :: Eq a => a -> a -> Bool
(+) :: Num a => a -> a -> a
show :: Show a => a -> String

I Haskell’s type inferencer automatically keeps track of restrictions:

> let twice x = x + x
> :t twice
twice :: Num a => a -> a

I In general, may have multiple constraints:

foo :: (Num a, Show a) => a -> String
foo x = show (x + x)

I Capture a uniform notion of overloading, where computation to be
performed depends materially on types of operands and/or result.

11 / 28

Type classes

I A Haskell type class is an (open-ended) collection of types
supporting a fixed set of operations.

I Not entirely unlike interfaces in Java.

I Declared with class ClassName typevar where decls
I As usual, the decls should align vertically

I Several predefined classes, including (slightly simplified):

class Show a where
show :: a -> String

class Eq a where
(==), (/=) :: a -> a -> Bool

class Num a where
(+), (-), (*) :: a -> a -> a
fromInteger :: Integer -> a

I Use :info ClassName in GHCi to see full list of operations.

12 / 28

Declaring class membership

I To include a (new or previously defined) type in a class, must
add an instance declaration.

I Simply need to supply all the required operations of the class.

I Example (of course, better version exists in standard library):

data Complex = Complex {re, im :: Double}

instance Num Complex where
(Complex r1 i1) + (Complex r2 i2) = Complex (r1+r2) (i1+i2)

fromInteger n = Complex (fromInteger n) 0.0

I Note: The fromInteger n on the RHS is not a recursive call, but
an invocation of fromInteger :: Integer -> Double !

I Likewise,

instance Show Complex where
show c = show (re c) ++ “+” ++ show (im c) ++ “i”

13 / 28

Numeric types in Haskell

I Actually, whole hierarchy of numeric type classes
I Num a, for types a that have operations (+), (-), (*)

I Mathematically: ∼ rings
I Fractional a, for types a that also have (/)

I Mathematically: ∼ fields
I Integral a, for types a that have div, mod

I instances: Int, Integer, …

I …

I Main oddity: even literals are overloaded!
I Plain 42 actually behaves like fromInteger (42::Integer),

I Therefore:
I OK: pi + 1 — 1 can have type Double
I Not OK: pi + length “x” — length s has only type Int
I OK: pi + fromIntegral $ length “x” — explicit coercion

I Aside: $ often useful to avoid deeply nested parentheses
I Just a right-associative infix application operator.

14 / 28

More type-class constructions

I Class inheritance
I Can also constrain type variable in class declaration

class Bar a => Foo a where …
I Can only declare a type to be instance of Foo, if it’s already an

instance of Bar.
I Ex: class Eq a => Ord a where (<) :: a -> a -> Bool; …

I Default implementations
I Can include a default definition of a class operation:

class Eq a where
(==), (/=) :: a -> a -> Bool
x /= y = not (x == y)

I In instance declaration, if we omit definition for (/=), the default
one is used

I Note: default implementation may use operations of superclass.

I Both features a bit esoteric, but recent API change for Monad
class makes them unavoidable…

15 / 28

Automatically deriving instances

I Haskell can automatically construct certain instance
declarations for newly defined types.

I Only for a few built-in classes (need compiler support)

I data MyType = … deriving (Eq, Show, Read, …)
I Derived Show:

I Displays values in a format parseable as source code.
I E.g., “Complex {re = 3.0, im = 4.2}”
I Whereas our custom show would return “3.0+4.2i”

I Derived Eq:
I Structural equality (assuming all constituent types have Eq

instances.
I Usually fine, but sometimes want a coarser notion of equality.
I E.g., in our module implementing IntSet:

instance Eq IntSet where
(IS xs) == (IS ys) = all (\x -> x `elem` ys) xs &&

all (\y -> y `elem` xs) ys

16 / 28

Monoids

I Another common class: types with notion of “accumulation”
class Monoid a where
mempty :: a
mappend :: a -> a -> a

instance Monoid String where
mempty = “” ; mappend = (++)

instance Monoid Int where
mempty = 0 ; mappend = (+)

instance (Monoid a, Monoid b) => Monoid (a,b) where
mempty = (mempty {-of type a-}, mempty {-of type b-})
mappend (a1,b1) (a2,b2) = (mappend a1 a2, mappend b1 b2)

I All Monoid instances a should satisfy, for all x, y, z :: a

mappend mempty x ‘ x, mappend x mempty ‘ x,
mappend x (mappend y z) ‘ mappend (mappend x y) z

17 / 28

Constructor classes

I Can also classify type constructors (parameterized types).

I Example: functors, for “container-like” type constructors

class Functor f where
fmap :: (a -> b) -> f a -> f b

instance Functor [] where — type [a] stands for [] a
fmap = map

data Tree a = Leaf a | Node (Tree a) (Tree a)
instance Functor Tree where

fmap f (Leaf a) = Leaf (f a)
fmap f (Node tl tr) = Node (fmap f tl) (fmap f tr)

Then, fmap odd $ Node (Leaf 2) (Node (Leaf 3) (Leaf 5))
evaluates to Node (Leaf False) (Node (Leaf True) (Leaf True))

I All Functor instances should satisfy (where . is function composition):
fmap id ‘ id, fmap (g . f) ‘ fmap g . fmap f

18 / 28

Laziness

I Unlike most languages, Haskell has a lazy (∼ non-strict)
semantics.

I Subexpressions not evaluated until their values actually needed.
I To illustrate behavior, undefined is a predefined expression that

causes a runtime error when evaluated.
I Sample interaction:

> let x = undefined in x + 1

*** Exception: Prelude.undefined
> let x = undefined in 3
3

I Even if everything terminates (eventually), lazy evaluation may avoid
wasting work: let x = bigExp in 0

I But in let x = bigExp in x+x, Haskell will memoize (≈ cache) value
of x after first use, to avoid recomputation.

I Only safe because bigExp cannot have side effects!
I Same behavior for function arguments (“call-by-need”)

let f x = 42 in f undefined — immediately returns 42
19 / 28

Lazy evaluation, continued

I Even when result of subexpression is used, it will only be
evaluated enough to allow computation to proceed:

let p = (undefined, 3) in snd p — returns 3

case Just undefined of
Nothing -> False ; Just x -> True — returns True

I In general, evaluation of all constructor arguments (including tuples
and list nodes, but not newtype) is delayed.

I Can inadvertently construct “booby-trapped” values that only
explode when accessed.

I Commonly: only when being printed as results.

> let l = [10,20,undefined,40] in (length l, show l)
(4,”[10,20,*** Exception: Prelude.undefined

I The top-level printer is forcing evaluation.
I Apocryphal lecture by Simon Peyton Jones (GHC developer):

“This is a talk about lazy evaluation. Are there any questions?”

20 / 28

Streams

I In most practical situations, lazy vs. eager evaluation of
functional program makes no difference.

I Rare to write a subexpression, then never use its result (dead
code)

I But lazy evaluation makes it particularly simple and natural to
work with infinite lists (streams).

I Just like functions can be recursively defined, so can list values:

ones, nats :: [Int]
ones = 1 : ones
nats = 0 : map (\x -> x+1) nats

I > take 5 nats prints [0,1,2,3,4]

I > drop 5 nats prints [5,6,7,8,9,10,11,… until interrupted.

I Again, the top-level printer drives the actual computation.

21 / 28

Equational reasoning

I Most formal and semi-formal reasoning about Haskell programs
is about equivalence of expressions.

I When e1 and e2 are of same type, will write e1 ‘ e2 when they
are equivalent.

I Equivalent expressions mean the same thing.
I x + y ‘ y + x (for x,y :: Int),
I [x, 3] ++ xs ‘ x : 3 : xs (for x :: Int, xs :: [Int])
I map g . map f ‘ map (g . f) (for f :: a -> b, g :: b -> c)

I Special case: evaluation of complete expressions to values
I E.g. 2+2 ‘ 4, reverse [1,2,3] ‘ [3,2,1]

I Fundamental principle: can replace equivalent subexpressions
for each other, without affecting meaning of program

I E.g., let y = 2+2 in map (\x -> x + y) ‘
let y = 4 in map (\x -> y + x)

22 / 28

Equational reasoning, continued

I How to argue that two expressions are equivalent?

I Small collection of general principles, including:
I ‘ is reflexive, transitive, and symmetric..

I Can compactly write reasoning chains e1 ‘ e2 ‘ · · · ‘ en
I When definition x = e (local or global) is in scope, then x ‘ e.

I E.g., let x = 2+3 in x*x ‘ (2+3)*(2+3) — note parens
I E.g., let x = undefined in 0 ‘ 0
I Also works for patterns p on LHS (with caveat for _)

I If C e1 e2 ‘ C e ′1 e

2, then e1 ‘ e


1 and e2 ‘ e


2.

I E.g. if ([x], y) ‘ ([3], undefined), then [x] ‘ [3] (hence x
‘ 3) and y ‘ undefined.

I (\p -> e1) e2 ‘ let p = e2 in e1
I (\x -> x+1) (y*2) ‘ let x = y*2 in x+1 ‘ y*2+1

I Usual arithmetic equations (but beware of potentially undefined
subexpressions)

I x + x ‘ 2 * x, x * 0 6’ 0 (consider x = undefined)

23 / 28

Introduction to Haskell I/O

I Haskell is a completely pure language, no side effects allowed.

I So how can we possibly write Haskell programs that interact
with the real world?

I File system, terminal, network, other OS services,….

I Answer: top-level printer itself doesn’t need to be pure!

I Can have pure program compute a lazy list (stream) of I/O
requests (actions) for top-level printer to perform.

I Producing the list itself is effect-free; obeying it is not.
I The list is inspected incrementally, as and when the program

produces it.

I Actually need a datatype slightly more complicated than a list,
to allow pure program to also receive input from the outside
world.

24 / 28

A SimpleIO type constructor

I Simplified version of actual Haskell IO type constructor.
I Three-way choice:

data SimpleIO a = Done a
| PutChar Char (SimpleIO a)
| GetChar (Char -> SimpleIO a)

I Sample value of type IO Int, ready to be performed:
PutChar ‘?’ (GetChar (\c -> PutChar (toUpper c) (PutChar ‘!’ (Done (ord c)))))

I Top-level loop has following conceptual structure:
I If top-level expression has an “ordinary” (non-SimpleIO) type,

just evaluate it and print the result (incrementally).
I If expression has type SimpleIO a, evaluate it enough to expose

top constructor:
1. If of the form Done x, evaluate and print x, like in previous case
2. If of the form PutChar c s, output c, and continue evaluating s.
3. If of the form GetChar f , input a c, and continue evaluating f c.

I But how do we write a big program of type, say, SimpleIO ()?
I Seems awkward to generate all IO requests in functional style.
I Next time: monads to the rescue!

25 / 28

List comprehensions

I Cute Haskell feature, allows many list-processing functions to
be written clearly and naturally.

I Inspired by mathematical notation for set comprehensions:
I subset: {x | x ∈ {2,3,5,7} ∧ x > 4} = {5,7}
I direct image: {x + 1 | x ∈ {2,3,5,7}} = {3,4,6,8}
I Cartesian product: {(x, y) | x ∈ {2,3} ∧ y ∈ {>,⊥}} =

{(2,>), (2,⊥), (3,>), (3,⊥)}
I general union: {x | s ∈ {{2,3}, ∅, {5}} ∧ x ∈ s} = {2,3,5}

I Can write Haskell expressions with almost same notation:
I [x | x <- [2,3,5,7], x > 4] == [5,7]
I [x + 1 | x <- [2,3,5,7]] == [3,4,6,8] I [(x,y) | x <- [2,3], y <- [True,False]] == [(2,True), (2,False), (3,True), (3,False)] I [x | s <- [[2,3],[],[5]], x <- s] == [2,3,5] 26 / 28 List comprehensions, continued I Can even use all idioms on previous page together. > [100 * x + y | x <- [1..4], x /= 3, y <- [1..x]] [101,201,202,401,402,403,404] I General shape: [exp | qual1, ...,qualn], where each quali is: I a generator, x <- lexpi , where lexpi is a list-typed expression; or I a guard, bexpi , which must be a Bool-typed expression. I Qualifiers considered in sequence, from left to right: I For each generator, bind variable to successive list elements, and process next qualifiers (∼ foreach-loop in imperative language) I For each guard, check that it evaluates to True; otherwise, return to previous generator (∼ conditional continue in imperative). I When all qualifiers successfully considered, evaluate exp and add its value to result list. I Aka. depth-first search, backtracking, generate-and-test I Will see again in Prolog, parsing I Also an instance of programming with monads! 27 / 28 What now? I Talk to a fellow student about forming a group (two is max) I Attend labs after lunch (rooms A101–A105 at HCØ), from 13:05 I Don’t need to come on time: no scheduled activities I Section and room assignments should be on Absalon; otherwise do ad-hoc load balancing. I Work on Exercise Set 0 I Solve Assignment 0, due 20:00 on Wednesday, 12 September I Use Absalon forum for questions after the lab hours I Next lecture: monads! I Recommended reading materials posted I Pre-lecture quiz will be up soon (Sunday at latest) 28 / 28