\begin{code}
{-# OPTIONS_GHC -Wall #-}
module Feb02 where
\end{code}
Learning objectives:
\begin{itemize}
\item edit distance: more on DSLs, interpreters, etc
\end{itemize}
Write some fun code in Haskell: Edit Distance! The distance between 2 strings,
counting the number of ‘edits’ needed from one to the other.
The number of ‘edits’ counts everything except Copy
Ex: fish -> chips
Language of edits:
\begin{code}
data Edit =
Insert Char — don’t move into the string at all
| Change Char — single char
| Copy — single char
| Delete — single char
| Kill — rest of line
deriving (Eq, Show)
run :: Edit -> String -> String
run (Insert c) s = c : s
run (Change _) [] = error “can’t change a character that is not there!”
run (Change c) (_ : s) = c : s
— run Copy [] = error “can’t copy a character that doesn’t exist”
— run Copy (x : s) = x : s
run Copy s = s
run Delete [] = error “can’t delete a character that doesn’t exist”
run Delete (_ : s) = s
run Kill _ = []
type Edits = [ Edit ]
— This is really a language interpreter
— Note: writing this was HARD. See the lecture video to see, as just reading
— the code won’t quite give you the process.
runs :: Edits -> String -> String
runs [] s = s
runs (Insert c : cs) s = c : runs cs s
runs (Change _ : _ ) [] = error “can’t change a character that’s not there”
runs (Change c : cs) (_ : s) = c : runs cs s
runs (Copy : _ ) [] = error “can’t copy no character”
runs (Copy : cs) (x : s) = x : runs cs s
runs (Delete : _ ) [] = error “can’t delete a non-existent character”
runs (Delete : cs) (_ : s) = runs cs s
runs (Kill : cs) _ = runs cs []
— This is more elegant… but doesn’t work. Hint: Kill
runs’ :: Edits -> String -> String
runs’ [] s = s
runs’ (c:cs) s = x : runs’ cs xs
where
(x: xs) = run c s
— Another attempt, which doesn’t _quite_ work
runs” :: Edits -> String -> String
runs” [] s = s
runs” (e:es) s = doit (run e s)
where
doit [] = []
doit (x:xs) = x : runs” es xs
ex1 :: Edits
ex1 = [Insert ‘c’, Change ‘h’, Copy, Insert ‘p’, Copy, Kill]
cost :: Edits -> Int
cost l = length $ filter (/= Copy) l
— cost = length . filter (/= Copy)
— this infers programs
transform :: String -> String -> Edits
transform [] [] = []
transform (_ : _ ) [] = [Kill] — needed [Kill], not Kill
transform [] ys = map Insert ys — this was missing!
transform (x : xs) (y : ys)
| x == y = Copy : transform xs ys
| otherwise = best [ Insert y : transform (x : xs) ys
, Change y : transform xs ys
, Delete : transform xs (y : ys)
]
best :: [ Edits ] -> Edits
best [] = error “best of nothing?”
best [z] = z
best (z : zs)
| cost z <= cost b = z
| otherwise = b
where
b = best zs
\end{code}
Examples: transform "kitten" "sitting", transform "Saturday" "Sunday"
both have cost 3
Bonus: prove runs (transform s s') s == s'
Double induction. "on paper". Formally (in Agda / Idris)
Also prove optimality.