Functional Programming with OCaml
CSE 216 : Programming Abstractions Department of Computer Science Stony Brook University Dr. Ritwik Banerjee
OCaml
• The main implementation of the Caml programming language, formerly known as Objective Caml.
Relatively recent language, developed in 1996.
Extends the original Caml language with object-oriented
features.
A strong, statically typed language.
A powerful language that has inspired other languages like Scala and F#.
A functional language.
Official website: https://ocaml.org/ has links to documentation, cheat sheets, tutorials, code examples, etc.
© 2019 Ritwik Banerjee
Installing and Running OCaml
• Installbyfollowinginstructionsavailablehere: http://www.ocaml.org/docs/install.html ← may take a while!
• Online OCaml interactive shell: https://try.ocamlpro.com/ • Filenameshavethe.mlextension.
• Runningtheread-eval-printloop,orREPL:
$ ocaml
OCaml version 4.02.3
# print_endline “Hello World!”;; (* ;; ends an expression *)
Hello World!
– : unit = ()
# #use “hello.ml”;; (* import a file, and explain in a comment *)
Hello World!
– : unit = ()
# (* #quit terminates the interactive shell, so does ‘exit 0;;’
or ctrl+D *)
#quit;;
© 2019 Ritwik Banerjee
Installing and Running OCaml From a .ml file*:
• Create“hello.ml”:print_endline“Hello World!”
• Runitwiththeinterpreter: $ ocaml hello.ml Hello World!
• Compileanativeexecutableandrun: $ ocamlopt –o hello hello.ml
$ ./hello
Hello World!
• Useocamlbuild:
$ ocamlbuild hello.native $ ./hello.native
Hello World!
* You may come across mentions of files with .mli extension. These are interface files, which we may cover later in the syllabus if time permits.
You can also feed the file into the interactive shell (this is called a batch interactive session, where the input commands are taken from the file:
$ ocaml < hello.ml OCaml version 4.02.3
# hello world
- : unit = ()
© 2019 Ritwik Banerjee
Code compilation
• OCaml offers two ways of compiling and running code:
1) bytecodecompilationusing ocamlc and run by the interpreter ocamlrun
2) nativecodecompilationusingthe compiler ocamlopt
• In this course, we will use the native code compilation because it
is more ‘standard’
provides standalone executables
is faster than bytecode compilation
• The ‘-c’ option is used to compile individual modules. It produces compiled files (with the .cmx extension) but no executable.
ocamlopt -c myfile.ml
• The‘-o’optionisusedtospecifythe name of the output file produced by the linking of the compiled codes.
ocamlopt -o program myfile.cmx
which will produce the executable named program or program.exe.
• Inthiscourse,wewillnotgotoofar into OCaml, so you should not worry about the linking of multiple modules, interfaces, etc.
© 2019 Ritwik Banerjee
Basic types and expressions
• floating-pointoperators (+.) must be explicit
• typeconversionsmustbe explicit (int_of_float)
• theunittypeissimilar to void in Java.
© 2019 Ritwik Banerjee
Basic operations and functions
+ - * / mod
+. -. *. /. **
ceil floor sqrt exp log log10 cos sin tan acos asin atan
not && || = <>
== !=
< > <= >=
integer arithmetic floating-point arithmetic
floating-point functions
boolean operators
structural comparison (polymorphic) physical comparison (polymorphic) comparisons (polymorphic)
© 2019 Ritwik Banerjee
Equality and comparisons
Physical equality compares pointers Structural equality compares values
TLDR: do NOT use the physical equality == in your programs.
© 2019 Ritwik Banerjee
Conditional expressions
if expr1 then expr2 else expr3
• The“else”partismandatory.
• Thetypeofexpr1mustbeboolean.
• Thetypesofexpr2andexpr3mustmatch.
© 2019 Ritwik Banerjee
Binding names using “let” • Wecansayletname=expr1in
expr2, which binds the name to the 1st expression within the scope of the
2nd expression only.
• Wecanalsosayletname=expr, which binds the name to the expression for the rest of the execution.
© 2019 Ritwik Banerjee
Binding names using “let”
• The‘let’keywordcanbeusedto bind in succession, since each binding is local.
• Butalwaysrememberthatthisis not an assignment! The name disappears as soon as the scope ends.
© 2019 Ritwik Banerjee
Binding names using “let”
• Again,‘let’isnotanassignment!
• OCamlpicksupbindingsthatwere in effect at the time of function definition. This is very different from using a name as a variable bound to a value.
• Inthisexample,itseesa=2here, and then again here.
• Definingthefunctionagainafter letting a = 20 changes the behavior.
© 2019 Ritwik Banerjee
Functions
• Functionsarefirst-classobjects in OCaml.
• Wedefineafunctionjustlike any other value can be defined.
• Wecanapplyafunctiontoits argument(s) immediately.
• Functionswithmultiple parameters are modeled as nested functions, each with a single parameter.
© 2019 Ritwik Banerjee
“let” is like a function application • Theexpressionletname=expr1inexpr2issemanticallyequivalentto
the expression (fun name -> expr2) in expr1.
• Bothmean“inexpr2,replacenamewiththeexpr1”. • Butletisusuallyeasiertoread:
# let a = 3 in a + 2;;
– : int = 5
# (fun a -> a + 2) 3;;
– : int = 5
© 2019 Ritwik Banerjee
Currying
• Didyounoticethetypeofthemultiplicationfunction?
fun x y -> x*y;;
• Ordinarily,wewouldexpectthetypetobe(int,int)->int
• Butrecallfromlambda-calculusthatafunctioncanonlytakeasingle argument. So we have the following expansion:
fun takes the argument x and returns a function (let’s call it fun_x).
fun_x then takes the argument y and multiplies y with x, yielding the final result.
• Thus,thetypeoffunisgivenbyint->int->int=
• Thistypeofevaluationofafunctionthattakesmultipleargumentsas evaluating a sequence of functions, each with a single argument, is called Currying (named after the mathematician Haskell Curry).
© 2019 Ritwik Banerjee
Recursive functions
OCaml
(* let allows for recursion *)
let rec gcd a b =
if a = b then a
else if a > b then
gcd (a-b) b
else
gcd a (b-a)
Java
int gcd(int a, int b) {
while (a != b) {
}
if (a > b)
a -= b;
else
b -= a; }
return a;
© 2019 Ritwik Banerjee
Recursive functions
• Bydefault,anameisnot visible in the expression it is getting bound to (i.e., its own definition).
• Thereckeywordallows for such self-referential visibility.
• Theandkeywordallows for mutual recursion.
© 2019 Ritwik Banerjee
Name them and pass them around as arguments.
First-class functions
© 2019 Ritwik Banerjee
Functions can return functions.
Higher-order functions
© 2019 Ritwik Banerjee
Data Types
© 2019 Ritwik Banerjee
The unit type and its () value
• The unit value () of type unit conveys no information: it is the unique value of its type.
• It serves the purpose of void (e.g., in in Java and C).
• No operations are possible on this type or its only value.
© 2019 Ritwik Banerjee
Tuples
# (42, “John”);;
– : int * string = (42, “John”)
# (42, “John”, “Doe”);;
– : int * string * string = (42, “John”, “Doe”)
# let p = (42, “John”);;
val p : int * string = (42, “John”)
# fst p;;
-: int = 42
# snd p;;
– : string = “John”
# let profile = (42, “John”, “Doe”, 177, “male”);;
val profile : int * string * string * int * string = (42, “John”, “Doe”, 177, “male”)
# let (_, _, lastname, _, sex) = profile in (lastname, sex);;
– : string * string = (“Doe”, “male”)
© 2019 Ritwik Banerjee
Lists
# [];; (* empty list, also called ‘nil’ *)
-: ’a list = []
# [1];; (* singleton list *)
-: int list = [1]
# [ [[]]; [[1;2;3]]];; (* nested list *)
– : int list list list = [[[]]; [[1; 2; 3]]]
# 7 :: [6; 5];; (* cons, ::, puts an element at the start *)
-: int list [7; 6; 5]
# [1; 2] :: [3; 4];;
Error: This expression has type int but an expression was expected of type int list
# [1; 2] @ [3; 4];; (* @ concatenated two lists *)
-: int list = [1; 2; 3; 4]
(* get the first item of a list *) # List.hd [1; 2; 3];;
-: int 1
(* get everything after the first item of a list *) # List.tl [1; 2; 3];;
-: int list [2; 3]
© 2019 Ritwik Banerjee
A note on the syntax for list & tuple
• Thesquarebracketsareconvenientbutnotrequiredforlists.
The list [1; 2; 3] is equivalent to 1::2::3::[].
• Alistcannotcontainamixtureofelementsofdifferenttypes:
# [[1; 2]; [[1; 2]]];;
Error: This expression has type ‘a list
but an expression was expected of type int
# [[1; 2]; [2; 3]];;
– : int list list = [[1; 2]; [2; 3]]
• Separatingbythesemi-coloncreatesalist.Separatingbycommacreatesa tuple. The parentheses are optional, but convenient.
# [“1”, 1, [1; 2]];;
– : (string * int * int list) list = [(“1”, 1, [1; 2])]
© 2019 Ritwik Banerjee
(Some) List functions
In Ocaml, lists are immutable. We cannot change an element of a list from one value to another. So, OCaml programmers create new lists out of old lists.
Using map and reduce operations, we can largely replace traditional loops:
• Applyafunctionftoeachelementofalisttotransformit:
List.map f [a1; …; an] = [f a1; …; f an]
• Applyafunctionfrecursivelytothecurrentresulttogetherwithanelement of the list, to finally produce a single element:
List.fold_left f a [b1; …; bn] = f(…(f(f a b1) b2)…) bn)
• Applyafunctionftoeachelementofalist,toproduceunitastheresult: List.iter f [a1; …; an] = begin f a1; … ; f an; () end
• Reversealist:
List.rev [a1; …; an] = [an; …; a1]
© 2019 Ritwik Banerjee
(Some) List functions
© 2019 Ritwik Banerjee
Enumerating list items
• Totransformalistandpassinformationfromoneelementtoanother,we
can use fold_left with a tuple:
# let (l, _) = List.fold_left (fun (l, n) e -> ((e, n) :: l, n+1))
([], 0) [1; 2; 3; 4] in List.rev l;;
-: (int * int) list = [(1, 0); (2, 1); (3, 2); (4, 3)]
• Here,thefunctionfun
takes a tuple and an element, and
returns a tuple,
where the first element is a list of pairs, and the second element is an integer.
• Ifyouapplysuchafunctiontoalist,anduseaninitialvalueof0inthe accumulator (i.e., the starting value of n),
the list gets transformed into pairs where each list element is paired with its index. but the accumulation using cons ends up inverting the order, so we use List.rev.
© 2019 Ritwik Banerjee
Pattern Matching
• Toaccessthevariouscomponentsofadatastructure,OCamlusesa powerful feature called pattern matching, which divides the process into various branches or cases.
• Forexample,tocomputethesumofintegersinalist: # let rec sum lst =
match lst with
| [] -> 0
| h::t -> h + sum t;;
val sum : int list -> int =
• Weareusinghandtfor‘head’and‘tail’,respectively.
• AnothercommonidiominOCamlistousexandxs(tothinkofthefirst
element as a variable ‘x’, and the rest of the list as multiple such ‘xs’):
| x::xs -> x + sum xs;;
© 2019 Ritwik Banerjee
Semantics of pattern matching • Patternmatchinginvolvestwotasks.Thesearetodetermine
1. whether or not a pattern matches a value, and
2. what parts of the value are to be bound to which variable names in the pattern.
• The 2nd task (i.e., bindings) needs a more careful analysis. The list is:
1) The pattern x matches a value v, and binds x -> v.
2) The pattern _ matches a value v, and does not bind to anything.
3) The nil pattern [] only matches the nil value [], and does not bind to anything.
4) If a pattern p1 matches a value v1, and produces a set of bindings b1, and a pattern p2 matches a value v2, and produces a set of bindings b2, then p1::p2 matches v1::v2. The set of bindings is the union b1 ∪ b2.
5) Similarly for lists, generalizing the above notation, the pattern [p1; …; pk] matches [v1; …; vk], and produces the bindings ⋃𝒌𝒊=𝟏 𝐛𝒊.
• Based on the above list, we can evaluate general pattern matching expressions of the type match e with p1 -> e1 | … | pn -> en;;
© 2019 Ritwik Banerjee
Example of bindings in a function definition that uses pattern matching
# let xor p =
match p with
| (false, false) -> false
| (false, true) -> true
| ( true, false) -> true
| ( true, true) -> false;;
val xor : bool * bool -> bool =
# xor (true, true);;
-: bool = false
• Notethat
in the first two cases, the output is the
same as the second item in the pair, and
in the last cases, the output is the negation of the second item in the pair.
• Wecanusevariablebindinghere:
# let xor p =
match p with
| (false, x) -> x
| ( true, x) -> not x;;
© 2019 Ritwik Banerjee
Wildcards
• Theunderscoresymbolisawildcardthatmatchesanything.
• Thisisoftenusedwhenthereisapartofthepatternwemustaccountfor, but we don’t care about that part. For example:
# let xor p =
match p with
| (true, false) -> true
| (false, true) -> true
| _ -> false;;
val xor : bool * bool -> bool =
© 2019 Ritwik Banerjee
Wildcards
• Theunderscoresymbolisawildcardthatmatchesanything.
• Thisisoftenusedwhenthereisapartofthepatternwemustaccountfor, but we don’t care about that part. For example:
let length = function
[] | [_]
-> “empty”
-> “singleton”
-> “pair”
| [_; _]
| [_; _; _] -> “triple”
| hd :: tl -> “many”;;
val length : ‘a list -> string =
© 2019 Ritwik Banerjee
A note on function syntax
• Bynow,youhaveprobablynoticedthat there are two different ways we can define a function in OCaml:
1. let {function_name} {args} = match … with {cases}
2. let {function_name} = function {cases}
• Thetwosyntaxesaresemantically identical (they produce the same assembly code after compilation).
For example, the two definitions shown here are identical.
let rec fib = function
| 0 -> 0
| 1 -> 1
| i -> fib (i-1) + fib (i-2);;
let rec fib n = match n with
| 0 -> 0
| 1 -> 1
| i -> fib (i-1) + fib (i-2);;
© 2019 Ritwik Banerjee
# let xor = function
| (false, x) -> x
| (x, true) -> not x;;
Warning: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched: (true, false)
val xor : bool * bool -> bool =
# let xor = function
| (false, x) -> x
| (true, x) -> not x
| (false, false) -> false;;
Warning: this match case is unused. val xor : bool * bool -> bool =
Semantics of pattern matching: static and dynamic
• Thesemanticswehavedescribedsofarare dynamic semantics. That is, every name is associated with a value at runtime.
• ButOCamlalsoperformssomestaticsemantic checks:
1. Type inference: the type of pattern matching expression must be valid.
2. Exhaustiveness: the compiler checks whether or not the cases guarantee that at least one of the cases will match any valid input expression.
3. Unused cases: the compiler checks if there is any redundancy (i.e., a case is not needed due to the previous cases already being exhaustive).
© 2019 Ritwik Banerjee
(* not tail recursive *)
let rec sum lst = match lst with
| [ ] -> 0
| x :: xs -> x + sum xs;;
val sum : int list -> int =
(* tail recursive *)
let rec acc s lst = match lst with
| [ ] -> s
| x :: xs -> acc (s+x) xs;;
val acc : int -> int list -> int =
let sum lst = acc 0 lst;;
val sum : int list -> int =
A recursive function is called tail recursive if does not perform any computation after the recursive call returns, and immediately returns to its caller the value of its recursive call.
Let us look at two implementations of the function to compute the sum of all the integers in a list:
Tail Recursion
•
•
© 2019 Ritwik Banerjee
Tail Call Optimization
• Recursionleadstoacallstack,whereeachelement corresponds to a function call that has started but not yet returned.
• Iftherecursionisatailrecursion,thecallerdoes absolutely nothing more than just “pass along” the result, once the caller receives it.
We can perhaps imagine it as follows: once the recursion stops, the entire stack just ‘collapses’ quickly, since the final value is just being passed down from the top of the stack, until the bottom stack frame is popped and the same value that was passed down all the way, is returned.
But if the value doesn’t change upon return from a recursive call, is there a point in maintaining the full stack?
• Manylanguagesthusoptimizefortailrecursion– instead of growing the call stack, the caller’s stack frame is simply replaced with that of the called function.
• Thisiscalledtailcalloptimization,anditreduces the space complexity of tail-recursive algorithms from 𝑂(𝑛) to 𝑂(1).
© 2019 Ritwik Banerjee
let edges = [(“a”, “b”); (“a”,”c”); (“a”,”d”); (“b”,”e”); (“c”,”f”); (“d”,”e”); (“e”,”f”); (“e”,”g”)];;
let rec successors v = function | [] -> []
| (s,t)::edges -> if s = v
then t::successors v edges
else successors v edges
val successors : ‘a -> (‘a * ‘b) list -> ‘b list =
# successors “a” edges;;
– : string list = [“b”; “c”; “d”] # successors “b” edges;;
– : string list = [“e”]
Example: depth-first search
© 2019 Ritwik Banerjee
1)
2)
In the pattern matching code, we are searching through a list of tuples, and checking if the first item is equal to the input vertex v.
In other words, we are filtering the list based on this condition.
From those filtered edges, we are collecting the second item of each tuple. So, a
functional way of implementing this would be:
let successors v edges =
let matching (s,_) = (s = v) in
List.map snd (List.filter matching edges);;
Example: depth-first search
let rec successors v = function | [] -> []
| (s,t)::edges -> if s = v
then t::successors v edges else successors v edges
© 2019 Ritwik Banerjee
let rec dfs edges visited = function
| [] -> List.rev visited
| v::vertices -> if List.mem v visited
then dfs edges visited vertices else dfs edges (v::visited)
((successors v edges) @ vertices);;
Example: depth-first search
# dfs edges [] [“d”];;
– : string list = [“d”;”e”;”f”;”g”]
# dfs edges [] [“a”];;
– : string list = [“a”;”b”;”e”;”f”;”g”;”c”;”d”]
© 2019 Ritwik Banerjee