Important information about this assignment
The report generated by the grader for this assignment is incomplete. A maximum of 70 points are attainable with the grader. The full grade for this assignment will be based on these tests, additional hidden tests and manual checkup of your code. Therefore, make sure to test your code appropriately on your own!
Introduction
This week’s topics include higher-order functions, programming with mutables and memoization. The following resources might be useful : the References section of Stdlib and the Hashtbl module.
Please check our complete submission guideline here
Question 1 : List higher-order functions (25 %)
Given the abundance of uses for lists, it is important to understand the utility of higher-order functions operating on lists, as it allows for a more generic codebase that can do anything on lists. For this section, your job is to implement two List module higher-order functions on your own.
find_map : (‘a -> ‘b option) -> ‘a list -> ‘b option takes a function and a list. It then iterates through the list, applying the function to each element. Once the argument function returns Some v for some element, return that; otherwise (aka if the function returned None for all elements of the list), return None. Your implementation has to be tail-recursive.
partition : (‘a -> bool) -> ‘a list -> ‘a list * ‘a list takes a predicate and a list, and splits the latter into two separate lists, composed of elements that pass and fail the test respectively. The order of elements should be maintained as in the input list. You must use a List.fold_* function – think which one makes your code easier to write!
Question 2 : Password Manager (40 %)
In the past few years, Internet users have become much more conscious of its inherent dangers. Many companies now focus their business model and marketing on the importance of digital privacy and security. One of the simplest tools to promote this is a password manager – and your task for this part is to implement a function that can create instances of password managers (effectively, password manager accounts).
Food for thought
Password managers help mitigate a lot of common tactics of social hacking. Using a comparison between the expected and actual address of a website makes sure humans don’t fall for look-alike or imitation websites (which is a type of phishing). Having the password manager auto-paste your password into a password field can protect you against keyloggers. However, password managers, when badly implemented, can be an even larger threat as they store a lot of passwords! From web flaws like XSS and CSRF to insufficient encryption of those passwords, designing a good password manager is a tough challenge.
Here is a paper that studied 5 popular password managers and revealed vulnerabilities in each: [The Emperor’s New Password Manager: Security Analysis of Web-based Password Managers]
Observe the prelude section for question 2. You are given exceptions, types, and encrypt/decrypt functions without their implementation. The type for the the function you must implement is make_manager : masterpass -> pass_manager. Its job is to initialize the references required for the operation of the pass_manager record, give the implementation of the functions composing it, and return a brand-new record.
The 5 functions composing the record are the following:
save takes the master password, as well as an address and a password corresponding to that address (from here on, refered to as the local password). If the master password matches, the function adds the address and the encrypted local password to a list. You do not need to check whether a password for such an address already exists.
get_force takes the master password as well as an address. The function uses find_map to return a local password for that address, if one is recorded. Don’t forget to use the decrypt function to return the password in its plaintext form! Note: this function should NOT verify whether the master password is correct, and as a result it should NOT count this as a failed or successful attempt as per the description of password locking below.
get takes the master password as well as an address. If the master password matches, it calls get_force. Simple!
update_master takes, in order, the master current password and a new password. If the master password matches, it is overwritten with the new password for all future operations. Also, all saved passwords should be decrypted with the old master password, and re-encrypted with the new master password.
count_ops returns an integer that shows how many times any of the 4 password-protected functions were successfully called (aka didn’t result in an exception) since the creation of the account. get_force should not count towards this integer, but calling this function does increment that value before returning it.
Note that the account must be controlled by a password locking system. Each time the wrong password is used in any function (except get_force), raise the WrongPassword exception. If the WrongPassword exception is raised 3 times in a row, the account becomes locked afterwards; the save, get and count_ops functions subsequently raise the AccountLocked exception independent of their inputs. During such a lock, update_master raises WrongPassword if the current password is wrong, and unlocks the account with the new password recorded if the current password is right. You do not need to check whether the current and new passwords are equal. get_force cannot raise any exceptions.
Here are a few pointers to help you:
Think first of all the references you need. What information must be tracked? What is the type of each such piece of information? We provide you with one such reference, the list in which you should store your address-password pairs.
Then, define each such reference as a let-in inside the make_manager function. You will lose points if any helper variables or functions are not inside make_manager (except, of course, find_map or any Stdlib functions).
Next, write a helper function that verifies that a given master password is correct. This way, you can consolidate the common functionality of testing password locking into one function. You can even consider how this can help with the integer returned in count_ops.
Finally, write the 5 functions composing the record and return the record.
The points for this question will be distributed based on a series of tests with the functions interacting, rather than being distributed a set amount per function composing the record.
Question 3 : Memoization (35 %)
Did you know?
If you have taken COMP 251/252, you may have heard of the technique of dynamic programming. If a problem you are trying to solve can be decomposed into sub-problems which are identical to the original, except being smaller, you can design your code to take advantage of that and work much faster by solving the sub-problems in increasing order. In addition to making great questions for students to solve (such as this problem studied in COMP 321 – Programming Challenges), dynamic programming is extensively used in bioinformatics (sequence alignment), numerical programming (matrix chain multiplication), graph algorithms (Dijkstra’s) as well as many other fields.
Dynamic programming pairs greatly with the concepts of recursion and structural induction studied in this class, as recursive problems naturally use smaller identical sub-problems in their solutions. In particular, we will use memoization, a more generic approach than dynamic programming, in order to improve runtime of recursive functions.
Here is a small example. The Fibonacci numbers follow the equation Fib[n] = Fib[n-1] + Fib[n-2]. Let’s say you wanted to compute Fib[4]. That will require you to find Fib[3] and Fib[2]; and Fib[3] also requires the knowledge of Fib[2]. If we were able to record the result Fib[n] when we compute it once (say, when we look into Fib[3]), and then simply output it instead of re-computing (when we look at Fib[4]), it saves us a lot of computation time.
Question 3, Part 1 (10%)
In the prelude, you are given the function catalan : int -> int . It is given for reference only; you should NOT have any calls to this function in your code. This computes the n’th Catalan number, using the following equations:
C(0)=1;C(n)=∑i=0n−1C(i)⋅C(n−1−i)
For this question, implement the function catalan_count : int -> int * int which, in addition to returning the Catalan number as the first integer in the output pair, also counts how many times we look for some Catalan number C(n). You must use the reference cell count_rec_calls given to you in the template – meaning you need to make an inner helper function that computes the Catalan numbers while incrementing the reference each time it is called.
Question 3, Part 2 (15%)
Now that we understand how slow and inefficient direct recursion of computing Catalan numbers, let us attempt to create a generic, higher-order function that can help us memoize. We will use the Hashtbl data structure in order to record answers for smaller inputs so that when they are given again, we can avoid recomputing.
For this question, implement the function memoize : ((‘a -> ‘b) -> ‘a -> ‘b) -> stats -> ‘a -> ‘b by implementing the inner helper function f’ : ‘a -> ‘b. What it should do is:
Inspect the hash table hash to see if it can find an entry for the key x;
If it finds such a value, increment the lkp field of the stats record and return the value;
If it does not find the value, it uses the f function to find a value. This is then added to the hash table under the key x. Finally, it increments the entries field of the stats record and returns the value.
Note the type of the function f passed to memoize : (‘a -> ‘b) -> ‘a -> ‘b. The trick we use here is that f is not a recursive function; instead, f calls the function g, the first argument to f, in its recursive case. To clarify and present an example, memoize (fun g x -> if x = 0 then 0 else x + g (x – 1)) stats memoizes a function that computes the sum of numbers up to x.
Question 3, Part 3 (10%)
Implement the toplevel function memo_cat : (int -> int) -> int -> int which can be used in conjuction with memoize to produce Catalan numbers. Then, make the toplevel function catalan_m : int -> (int * stats) which creates a brand-new stats reference, uses memoize and memo_cat, and returns the resulting answer and stats record.