This Lecture
An Illustrative Identification Algorithm in Haskell
• LTXL Syntax and Semantics, particularly scope rules.
• Abstract syntax representation. CAST attributes of variablesI • Environment/Symbol Table representation and operations.
• The Identification Algorithm.
1
Recap: Identification
Identification: the task of relating each applied identifier occurrence to its declaration or definition:
In the body of set, the one applied occurrence of
• x refers to the instance variable x • n refers to the argument n.
2
Identification for LTXL
We are now going to study a concrete Haskell implementation of identification for LTXL:
Less Trival eXpression Language
• LTXL ⇡ TXL + typed definitions + if-expression + new operators
Note: The slides show only the highlights. The complete code, together with test files, are available on Moodle.
3
LTXL CFG (1)
LTXLProgram ! Exp
Eddison
Exp ! |
| | |
Exp||Exp | Exp&&Exp
Exp
Exp*Exp | Exp/Exp
PrimaryExp
4
LTXL CFG (2)
To disambiguate, operator precedence and associativity is used.
e
In increasing order of precedence:
knittedYu
1. ||
2. &&
3. <, ==, > 4. +, –
5. *, /
All left-associative.
5
LTXL CFG (3)
PrimaryExp
! LitInt
| Ident
| \ PrimaryExp 1negation 42 | – PrimaryExp Ex
| if Exp then Exp else Exp
| (Exp)
| let Defs in Exp
P
3 True
Babs
iP nP
6
LTXL CFG (4)
Defs ! |
Def !
Type ! |
Def ; Defs Def
Type Ident = Exp
int bool
7
LTXL Example 1
let
int a = 10;
bool b = a < 2
Scope indifferentlanguage isdifferent
in let
int c = a * 10; ke.in bool a = a == 42;
int d = if a then 1 else 2
9
in
if a && b then c else 42g
m scopeofboola
O
n
8
LTXL Scope Rules
gg Es.FoaK4 1. The scope of a variable is:
• all subsequent definitions and
• the body of the let-expression in which the definition of the variable
occurs.
A variable is not in scope in the right-hand side (RHS) of its own definition.
2. A definition of a variable hides, for the extent of its scope, any definition of a variable with the same name from an outer
HENRIzNHkgLk
let-expression.
3. For any variable, at most one definition may be given in the list of
definitions of a let-expression.
ABethebetsop
Fi 388922
9
LTXL Example 1 (again)
Which scope rules are used where?
10
LTXL Example 2
What about this LTXL example?
11
LTXL AST (1)
The following Haskell data types are used to represent LTXL programs.
type Id = String
data Type = IntType | BoolType
| UnknownType
data UnOp = Not | Neg
data BinOp = Or | And
| Less | Equal | Greater
| Plus | Minus | Times | Divide
12
LTXL AST (2)
keyinfo Trenet
type
13
LTXL AST (3)
Example: The LTXL program
let int x = 7 in x + 35
would be represented like this before identification (type Exp ()):
Let [("x", IntType, LitInt 7)]
(BinOpApp Plus
(Var "x" ())
(LitInt 35))
(After identification, type will be Exp Attr.)
14
LTXL Environment (1)
• An association list is used to represent the environment/symbol table to keep things simple.
• By prepending new declarations to the list, and searching from the beginning, we will always find an identifier in the closest containing
thefirstone youfindis the answer lookup "x" [("x",a1),("y",a2),("x",a3)]
) a1
• No need for a ”close scope” operation. We are in a pure functional setting ) persistent data.
scope. For example:
15
LTXL Environment (2)
The environment associates identifiers with variable attributes. Our attributes are the scope level and the declared type.
type Attr = (Int, Type)
The environment is just an association list:
type Env = [(Id, Attr)]
Note: This environment does not store variable values.
16
LTXL Environment (3)
Example:
let
int a = 10; (1) int b = a + 42
in let
bool a = b < 20 (2)
in
if a then b else 13
Env. after (1): [("a",(1,IntType))]
Env. after (2): [("a",(2,BoolType)), ("b",(1,IntType)),
("a",(1,IntType))]
17
LTXL Environment (4)
enterVar inserts a variable, at the given scope level and with the given type, into an environment.
enterVar :: Id -> Int -> Type -> Env
-> Either Env ErrorMsg
• Check that no variable with same name has been defined at the same scope level.
• If not, the new variable is entered, and the resulting environment is returned.
• Otherwise, an error message is returned.
18
Aside: The Haskell Type Either
The standard Haskell type Either is useful when one needs to represent a value that has one of two possible types:
dataEitherab=Lefta | Rightb
A typical example is when a function needs to return one of two kinds of results:
foo :: Int ! Either Bool String foox | x<100 =Left(x<0)
| otherwise = Right "Too big"
19
LTXL Environment (5)
enterVar i l t env
| not (isDefined i l env)
= Left ((i,(l,t)) : env) ! Declaration prepended | otherwise
= Right (i ++ " already defined.")
where
isDefined i l [] = False isDefinedil((i’,(l’,)): env)
btadeleteAs gz
2As !E Why?
| l < l’ = error "Should not happen!"
| l > l’ = False ! Why? b e u
| i == i’ = True
| otherwise = isDefined i l env
throbsBro3heBE
20
LTXL Environment (6)
Assume that:
env = [(“y”, (2,IntType)),
(“x”, (1,IntType))]
Then:
enterVar “x” 2 BoolType env ) Left [(“x”, (2,BoolType)),
(“y”, (2,IntType)),
(“x”, (1,IntType))]
enterVar “y” 2 BoolType env ) Right “y already defined.”
21
LTXL Environment (7)
lookupVar looks up a variable in an environment.
• Returns variable attributes if found. • Returns an error message otherwise.
lookupVar :: Id -> Env
-> Either Attr ErrorMsg
lookupVar i [] = Right (i ++ ” not defined.”)
lookupVar i ((i’,a) : env)
|i==i’ =Lefta
| otherwise = lookupVar i env
Note: Returns the first declaration found. Thus, later declarations become hidden naturally by the order of the list.
22
LTXL Environment (8)
Let
env = [(“x”, (2,BoolType)), (“y”, (2,IntType)), (“x”, (1,IntType))]
Then:
lookupVar “y” env
) Left (2,IntType)) lookupVar “x” env
) Left (2,BoolType)) lookupVar “z” env
) Right “z not defined.”
23
LTXL Identification (1)
Goals of LTXL identification phase:
identification ::
Exp () -> (Exp Attr, [ErrorMsg])
• Annotate each applied identifier occurrence with attributes of the corresponding variable declaration.
In other words, map unannotated AST Exp () to annotated AST Exp Attr.
• Report conflicting variable definitions and undefined variables.
24
LTXL Identification (2)
Example: Before Identification
Let [(“x”, IntType, LitInt 7)]
(BinOpApp Plus
(Var “x” ())
(LitInt 35))
After identification:
Let [(“x”, IntType, LitInt 7)]
(BinOpApp Plus
(Var “x” (1, IntType))
(LitInt 35))
25
LTXL Identification (3)
Main identification function:
identification :: Exp ()
! (Exp Attr, [ErrorMsg])
identification e = identAux 0 emptyEnv e
Type signature for auxiliary identification function:
identAux :: Int ! Env ! Exp ()
! (Exp Attr, [ErrorMsg])
26
LTXL Identification (4)
Variable case:
identAux l env (Var i _) = case lookupVar i env of
Lefta ! (Varia,[])
Right m ! (Var i (0, UnknownType), [m])
27
LTXL Identification (5)
Binary operator application (typical recursive case):
identAux l env (BinOpApp op e1 e2) = (BinOpApp op e1’ e2’, ms1 ++ ms2) where
(e1’, ms1) = identAux l env e1 (e2’, ms2) = identAux l env e2
28
LTXL Identification (6)
Reminder: LTXL scope rules:
1. The scope of a variable is all subsequent definitions and the body of the let- expression in which the definition of the variable occurs. A variable is not in scope in the RHS of its definition.
2. A definition of a variable hides, for the extent of its scope, any definition of a variable with the same name from an outer let-expression.
3. In the list of definitions of a let-expression, at most one definition may be given for any given variable.
29
LTXL Identification (7)
Block of definitions (let):
identAux l env (Let ds e) = (Let ds’ e’, ms1 ++ ms2) where
l’ = l + 1
(ds’, env’, ms1) = identDefs l’ env ds (e’, ms2) = identAux l’ env’ e
identDefs returns an updated environment (env’) to be used when checking the body (e) of the let (rule 1).
30
LTXL Identification (8)
31
E cient Symbol Table Implementation
Lists are not e cient symbol tables: Insertion (at head) is fast, O(1), but lookup is O(n), where n is the number of symbols.
Some more e cient options: • Balanced trees:
• Insertion and lookup are both O(log n).
• One way of handling nested scopes would be a stack of trees.
• Hash tables:
• Insertion and lookup are both O(1), as long as the ratio between the number of symbols and the hash table size is kept below a small constant factor.
• Algorithms such as linear hashing allow the table to grow and shrink gracefully, guaranteeing near optimal performance.
32