Task
A compression scheme is a method for encoding of string of characters so that it takes up less space. One of the simplest examples is run-length encoding, which is useful for strings that contain long runs of repeated characters. To compress a string using this scheme, one simply replaces each such run by a single instance of the repeated character along with a count of the number of times it was repeated. For example, the string
aaaaabbbbcc
would be compressed to
a5b4c2
For simplicity, each character in the compressed string will be followed by a single digit, which means that runs of more than nine characters must be broken up into a sequence of runs of at most nine characters. For example, the string
dddddddddddd
would be compressed to
d9d3
The aim of this coursework is to write a Haskell script to compress and decompress a string using this form of run-length encoding.
Lines starting with > are lines typed into the the ghci REPL. They are not lines which should appear in your Haskell script.
Compression
1 Define a function chomp :: String -> String
2
that selects a run of repeated characters from the start of a string, with the run being as long as possible. For example: > chomp “aaaaabbbbcc”
3 “aaaaa”
4 > chomp “dddddddddddd”
5 “dddddddddddd”
6
7 Using chomp, define a function munch :: String -> String
8
that selects a run of repeated characters from the start of a string, with the run comprising at most nine characters. For example: > munch “aaaaabbbbcc”
9 “aaaaa”
10 > munch “dddddddddddd”
11 “ddddddddd”
12
13 Using munch, define a function runs :: String -> [String]
14
that splits a string into a list of runs of repeated characters, with each run comprising at most nine characters. For example: > runs “aaaaabbbbcc”
15 [“aaaaa”,”bbbb”,”cc”]
16 > runs “dddddddddddd”
17 [“ddddddddd”,”ddd”]
18
19 Using runs, define a function encode :: String -> [(Char,Int)]
20
that transforms a string into a list of pairs comprising the character from each run together with its number of repetitions. For example: > encode “aaaaabbbbcc”
21 [(‘a’,5),(‘b’,4),(‘c’,2)]
22 > encode “dddddddddddd”
23 [(‘d’,9),(‘d’,3)]
24
25 Define a function flatten :: [(Char,Int)] -> String
26
that flattens a list of pairs of characters and digits to a string. For example: > flatten [(‘a’,5),(‘b’,4),(‘c’,2)]
27 “a5b4c2”
28 > flatten [(‘d’,9),(‘d’,3)]
29 “d9d3”
30
31 Using encode and flatten, define a function compress :: String -> String
32
that compresses a string using run-length encoding. For example: > compress “aaaaabbbbcc”
33 “a5b4c2”
34 > compress “dddddddddddd”
35 “d9d3”
36
Decompression
7 Define a function decode :: [(Char,Int)] -> String
8
that performs the inverse function to encode. For example: > decode [(‘a’,5),(‘b’,4),(‘c’,2)]
9 “aaaaabbbbcc”
10 > decode [(‘d’,9),(‘d’,3)]
11 “dddddddddddd”
12
13 Define a function expand :: String -> [(Char,Int)]
14
that performs the inverse function to flatten. For example: > expand “a5b4c2”
15 [(‘a’,5),(‘b’,4),(‘c’,2)]
16 > expand “d9d3”
17 [(‘d’,9),(‘d’,3)]
18
19 Using decode and expand, define a function decompress :: String -> String
20
that performs the inverse function to compress. For example: > decompress “a5b4c2”
21 “aaaaabbbbcc”
22 > decompress “d9d3”
23 “dddddddddddd”
24
Testing
10 QuickCheck is a simple but very useful system for testing properties of Haskell programs using a large number of randomly generated inputs. To use the system, just add the following line to the start of your Haskell script: import Test.QuickCheck
11
Now add the following property to the end of your script, which states that compression followed by decompression always returns the original string: inverse :: String -> Bool
12 inverse xs = decompress (compress xs) == xs
13
You can now test this property using QuickCheck: > quickCheck inverse
14 +++ OK, passed 100 tests.
The inverse function takes a randomly generated input of the specified type (String in this case) and should return True or False depending if some property hold for that input. In this case, the property that is being checked is that compressing and then decompressing the input is the same as the original input. The quickCheck function is part of the QuickCheck package and randomly generates 100 inputs of the required type and calls your function with them, checking that it returns True each time. Define one other property of the other functions you have written in this coursework, call it myprop, and test it holds using QuickCheck. The property must be something sensible and useful, not something trivial like a non-empty string always compresses to a non-empty string.
Note: even if you do not implement another property, you must include the import and inverse function in your script.
/docProps/thumbnail.jpeg