Introduction
You work for an online marketing company that manages a number of store loyalty programs. You have been asked to develop some software to manage the customer database. You decide to develop a prototype solution to test the performance of different data structures and algorithms.
The company manages the program for several stores, but this number is not expected to grow beyond a few dozen. More importantly, each program may have many thousands (perhaps even millions) of customers; and, as a result of aggressive cross-marketing, many of these customers are subscribed to multiple loyalty programs. Your software needs to provide the following functions:
• Subscribe: This function will prompt the user for a customer id and loyalty program name. If this data does not already exist in the database, it will be added. A subscription will then be added for this customer and program.
• Unsubscribe: This function will prompt the user for a customer id and loyalty program name. The function will then remove this subscription but will not remove customers or loyalty programs from the database, even if they have no subscriptions.
• Print Summary: This function will print a line for each loyalty program, with each line containing the name of the program followed by the number of subscribed customers.
• Print Subscribers: This function will prompt the user for the name of a loyalty program, then print an ordered list of customer ids for all subscribers to this program, each one on a separate line.
• Print Subscriptions: This function will prompt the user for a customer id, then print all of the loyalty that this customer is subscribed to.
It is very important that access to data is very fast since many operations may occur each second. This will be especially important when dealing with customers since the number of customers may be very large and insert, delete, and find operations will be very common.
Initially, you decide to test performance using a combination of binary search trees, linked lists and arrays, but you would like to test an AVL tree as well. For preliminary testing, you decide that only the customer ids and loyalty program names will be stored.
Assignment Specification – Part A
For this part of the assignment you will select appropriate linked list, BST, and array data structure to implement the four operations. Part of the assignment task is to use these data structures in an efficient way. Your implementations of list and BST data structures MUST be based on the definitions used in lectures and tutorials as given below.
typedef struct listNode{ int data;
struct listNode *next;
} *ListNodePtr;
typedef struct list {
ListNodePtr head;
} List;
typedef struct bstNode { int data;
struct bstNode *left; struct bstNode *right;
} *BSTNodePtr;
typedef struct bst {
BSTNodePtr root;
} BST;
These data structures will need to be modified to support the customer and loyalty program information, which will be stored as long ints and strings. You will also need to add cross- references between the data structures to represent subscriptions. These may be one-way references – i.e. a customer might have references to the programs they are subscribed to, but the programs might not have references to the subscribed customers, or vice versa.
The ListNodePtr and List definitions and (modified) linked list functions must be placed in files list.h and list.c. The BSTNodePtr and BST definitions and (modified) BST functions must be placed in files bst.h and bst.c.
All remaining code should be placed in a file called main.c that contains the main function and program logic. This file will contain separate functions for each of the four operations listed in the introduction, as well as a fifth function to handle termination. Other functions may be added if required.
Program I/O
All interactions with the program will be via the console. Operations 1-4 will be selected by typing 1-4 at the command prompt. Quitting the application will be selected by typing 0. For example, the following input sequence would subscribe the customer ‘123’ to the loyalty program ‘abc’, and then quit:
1
123
abc 0
Note that this sequence shows the input only, not the program response (if any). You are free to add prompts to make the application more user friendly, but this will not be assessed (although it may be useful).
Program output in response to operations 3 and 4, should be as minimal as possible. You may print a header if you wish, but this should be followed by one record per line with spaces separating data. For example, in response to operation 3, the output might be:
Current subscriptions: abc 32
def 0
ggg 10236
I/O Restrictions
You may assume that all input will always be in the correct format and contain no logical errors.
• Commands will always be in the range 0-4
• Loyalty program names will always be strings less than 100 characters long and may contain any alpha-numeric characters excluding spaces
• Customer ids will always be positive integers in the range 0-999999999
• The user will never attempt to print data for a non-existent loyalty program
• The user will never attempt to print data for a non-existent customer
Memory Management
Loyalty program names should be stored in appropriately size dynamically allocated memory. Names will always be less than 100 characters long. For example, the program name “abcdef” would be stored in a char string of length 7.
Unsubscribing from a program may require you to free memory for that subscription (but not for the customer or program itself). The quit function should free all dynamically allocated memory.
Solution Requirements
You must develop a reasonably time efficient solution, but you also want to keep the code clean and simple to make your code easier to modify and maintain (a computer scientist called Donald Knuth, famously said that “premature optimisation is the root of all evil”).
Remember that, while the number of customers may be large, the number of loyalty programs is expected to be relatively small.
So, to score full marks your solution must:
• utilise at least one linked list and at least one bst
• strings must be stored in appropriately sized dynamically allocated memory
• the memory for all data structures should be freed when the data is no longer required
• If the number of customers is C and the number of loyalty programs is L then:
• Op 1 should be O(L + log C)
• Op 2 should be O(L + log C)
• Op 3 should be O(L * C)
• Op 4 should be O(L + C)
• Op 5 should be O(L * log C)
Note: A clever implementation might be able to achieve a better Big O for some of these operations, but that is not required to achieve full marks on this assignment. You will not receive extra marks for an implementation that achieves better performance.
Also note that the following work would get close to a pass mark:
• BST and linked list code implemented correctly as per tutorial work
• Evidence that you have made a sensible choice about how to use the data structures (this could be determined from code comments or variable names, that make it clear what choices you have made)
• BST and linked list code adapted to store customers and loyalty programs (node definitions only – functions don’t need to be correctly adapted)
• A working main loop (calls functions when options are selected, functions collect required input, but may not do anything with that input, exits without errors)
• A solid attempt to get BST and linked list functions working in this context (with the main loop and adapted data structures)
It is best to submit a program that compiles and runs without errors, even if the program doesn’t do anything. If you have added new code that breaks you program, comment this code out before submitting (and add an explanation) so that the marker can easily see what you have completed. After running your program, the marker will then check the code to determine what else you have attempted. Code that doesn’t compile or crashes is extremely difficult to mark, and the marker may be forced to give you less marks if they are unable to confirm what is working (they have very limited time to fix bugs).
Assignment Specification – Part B
This part of the assignment should only be attempted once you have a fully implemented and thoroughly tested solution to part A. It would be better to submit a complete part A and no part B than to submit a partially complete part A and part B. (If you look at the marking sheet below, you will see that this part is worth at most 15%)
The requirements for this part of the assignment are exactly the same as for part A except that an AVL tree must be used, rather than a BST. The AVL code should be placed in bst.h and bst.c, with AVL functions added to these two files (you may also need to modify the node definition, but this should have no effect on the BST functions).
Minimal assistance will be provided for this part of the assignment, especially if you cannot demonstrate a fully implemented and thoroughly tested solution to part A.
Testing
It can be very time consuming to thoroughly test a program like this when all input is done manually (imagine testing that your solution can manage 10000 customers). A common method of testing code like this is to use input redirection (and possibly output redirection).
When using input redirection your code runs without modification, but all input comes from a file instead of from the keyboard.
This facility is provided in Visual Studio through the project properties dialog. For example, to redirect input from a file called “test.txt”, you would add:
<"$(ProjectDir)test.txt"
to Configuration Properties|Debugging|Command Arguments. This will be demonstrated in tutorials.
At least one small input test file with sample output will be provided. It is recommended that you construct larger files to fully test your program.