CS4185/CS5185 Multimedia Technologies and Applications
Course Project
• Objectives
The objective of this assignment is for students to gain hands-on experience in multimedia programming and to develop a multimedia application. Students are given an image retrieval program written in C++/OpenCV and are asked to extend it with additional features. Alternatively, you may implement the course project in Python (deep learning). If so, you must do the implementation yourself: you will receive a zero mark if you use someone else’s implementation. However, it is fine to use common library functions, such as those in Pillow.
• Requirements of the Assignment
This assignment can be carried out as an individual or a group project. The maximum group size is 3. Larger groups are expected to do slightly more work and achieve better results, and the responsibility of each group member must be clearly indicated in the report.
In the assignment, students are given an OpenCV-based image retrieval program. The package includes an image database of 1000 images, organized into 10 categories of 100 images each, and 7 test images for testing your algorithms as you work on the basic requirements. It also includes a program that uses a naïve image matching algorithm to retrieve the best-matched image from the database for a given input image. However, the naïve algorithm is not very accurate: it finds correct matches for only 2 of the 7 test images. In this project, you are asked to improve the matching accuracy of the given algorithm by implementing additional matching criteria, and to extend the program with additional features.
There are two levels of requirements for the project, basic and advanced, to cater for students of different backgrounds and interests. The basic requirements are designed for all students to practice some multimedia programming skills. The advanced requirements are for students who would like to go further and create an application, and they are more flexible in terms of what you may do. The basic and advanced requirements account for 80% and 25%, respectively, of the grade for this assignment. Since these add up to 105%, a student who earns full marks on both parts will have the final mark clipped to 100.
• Basic Requirements (80%)
Students are required to finish all of the following tasks in the basic requirements:
• Improve the number of correctly matched images (20%)
The original program finds correct matches for only 2 of the 7 test images and wrong matches for the other 5. Modify the program so that it finds correct matches for more test images. (3 correct matches earn 5% of the marks, 6 correct matches earn 20%, etc.)
• Modify the above program to retrieve similar images (20%)
Given a similarity threshold, the program should return the list of images whose similarity values are higher than the threshold, and save these images to a new folder.
• Improve on the Precision (20%)
The target of this requirement is to achieve an average retrieval precision of 60% in the similar-image retrieval of requirement (2), for the same 5 test images with correct matches in requirement (1). That is, of the images the program returns for a given test image, at least 60% should be correct matches. (40% precision earns 5% of the marks, 60% precision earns 20%, etc.)
• Improve on the Recall (20%)
The target of this requirement is to retrieve, on average, 60% of the relevant images in the database. Note that the recall is averaged over the same 5 test images with correct matches in requirement (1). (40% recall earns 5% of the marks, 60% recall earns 20%, etc.)
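The threshold-based retrieval and the precision and recall measures described above can be sketched as follows. This is only a minimal illustration in Python using the standard library, not the given program: the function names, the file names, and the similarity scores are all hypothetical, and the scores are assumed to have been computed already by whatever matching algorithm you implement.

```python
from pathlib import Path
import shutil


def retrieve_similar(scores, threshold):
    """Return the names of images scoring above the threshold, best first.

    `scores` maps an image file name to its similarity with the query
    (hypothetical values; higher is assumed to mean more similar).
    """
    hits = [(name, s) for name, s in scores.items() if s > threshold]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in hits]


def precision_recall(retrieved, relevant):
    """Precision = correct / retrieved, recall = correct / relevant."""
    retrieved_set = set(retrieved)
    correct = len(retrieved_set & set(relevant))
    precision = correct / len(retrieved_set) if retrieved_set else 0.0
    recall = correct / len(relevant) if relevant else 0.0
    return precision, recall


def save_results(retrieved, database_dir, output_dir):
    """Copy the retrieved images from the database into a new folder."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for name in retrieved:
        shutil.copy(Path(database_dir) / name, output_dir / name)


if __name__ == "__main__":
    # Toy data: three images are relevant to the query, one is not.
    scores = {"beach_01.jpg": 0.91, "beach_02.jpg": 0.78,
              "bus_07.jpg": 0.64, "beach_03.jpg": 0.40}
    relevant = {"beach_01.jpg", "beach_02.jpg", "beach_03.jpg"}
    retrieved = retrieve_similar(scores, threshold=0.6)
    precision, recall = precision_recall(retrieved, relevant)
    print(retrieved, f"precision={precision:.2f}", f"recall={recall:.2f}")
```

In this toy example, three images pass the threshold and two of them are relevant, so precision and recall are both 2/3. Raising the threshold usually trades recall for precision, so the two targets need to be balanced. The averages in requirements (3) and (4) would be the mean of these values over the test images.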
• Advanced Requirements (25%)
Students are expected to extend the program into an application. The extension consists of two parts: technical improvement and UI design. The technical improvement may include new retrieval algorithms (precision and recall of 80% or higher earn 15% of the marks), high-dimensional data indexing (efficiently storing and managing the features extracted from the database, so that the program does not need to recompute the features every time), retrieval algorithms for particular types of images (e.g., sunset images or images containing human faces), a crawler to obtain images from the internet, or the use of semantic information to improve retrieval performance. Of the 25%, 15% will be awarded based on technical difficulty and 10% based on the UI design.
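As one illustration of avoiding recomputation of features, the sketch below stores extracted features in a cache file and computes only those that are missing. It is a simplified example under several assumptions: the function name, the JSON cache file, and the `extract` callback (which stands in for your own feature extractor and is assumed to return a list of numbers) are all hypothetical. It covers only feature storage, not an index structure for fast high-dimensional search.

```python
import json
from pathlib import Path


def load_or_build_features(image_paths, cache_file, extract):
    """Return {image name: feature vector}, computing only what is missing.

    `extract` is a hypothetical callback taking a Path and returning a
    list of floats. Entries are keyed by file name only, so an image that
    is modified but keeps its name will NOT be recomputed here.
    """
    cache_file = Path(cache_file)
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}

    missing = [Path(p) for p in image_paths if Path(p).name not in cache]
    for path in missing:
        cache[path.name] = extract(path)

    # Rewrite the cache file only when something new was computed.
    if missing:
        cache_file.write_text(json.dumps(cache))
    return cache
```

On the first run every image is passed to `extract`. On later runs the features are read back from the cache file, and only newly added images are processed.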
• Marking
The coursework component contributes 40% of the final course mark. Within this 40%, I will choose, for each of you, the best of the following combinations:
• 15% for assignment, 25% for quiz
• 20% for assignment, 20% for quiz
• 25% for assignment, 15% for quiz
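To make the arithmetic concrete, the sketch below evaluates the three weightings and keeps the highest result. It assumes that "best" means the combination giving the highest coursework mark and that the assignment and quiz are each scored out of 100. The function name is hypothetical.

```python
# (assignment %, quiz %) weightings listed above; each pair sums to 40.
WEIGHTINGS = [(15, 25), (20, 20), (25, 15)]


def coursework_mark(assignment_score, quiz_score):
    """Return the best coursework contribution (out of 40).

    Both scores are assumed to be on a 0-100 scale.
    """
    return max(a * assignment_score / 100 + q * quiz_score / 100
               for a, q in WEIGHTINGS)
```

For example, with 90 on the assignment and 70 on the quiz, the three combinations give 31, 32, and 33 marks out of 40, so the 25%/15% weighting would be used.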
Note that we will use a PC with the following configuration to grade the course projects:
• Windows with Visual Studio 2017 or 2019 (need to use the corresponding demo programs)
• OpenCV 2.4.13
• If you choose to use Python, please state this in the readme file and the report, and attach the necessary files so that we can run your code.
Unfortunately, we do not have a Mac to grade the course projects. I understand that SCM students may not have a PC for the course project, so I have asked cslab to install the above tools on all the PCs in room MMW2410. You may use those PCs for your course project if you like.
• Submission Details
There are two submission deadlines:
• October 25, 2020 – basic requirements
• November 22, 2020 – advanced requirements
By each submission deadline, each group needs to submit the following sub-directories as a zip file on Canvas. If your zip file is too large, you may submit a link where we can download your zip file.
/Program:
• A source sub-directory containing all the source files and the necessary files.
• A binary sub-directory containing the executable file of the program and relevant files, including image files or libraries. The executable file should output the retrieved results (e.g., the list of retrieved images) and the precision and recall rates. We must be able to run the program simply by clicking on the executable file, so you should try the executable file on a different machine before you submit the work.
• A readme file with instructions on how to compile and execute the program.
/Demo:
A demo video that guides the marker through the main contributions of the work.
/Report:
The purpose of this report is to describe the main contributions of the work. For the first submission deadline, the report should describe your work on the basic requirements only. For the second submission deadline, it should be extended to cover the advanced requirements as well. Note that we will not mark the report itself. Instead, the report should show us what you have done so that we can grade the work appropriately. Hence, there is no need to submit a long report. A few pages providing the following information will suffice:
• A cover that indicates your name(s) and student ID(s)
• A brief description of the revised program, including the main modules and the relationship of these modules. (The description may be in the form of short paragraphs or a flow diagram.)
• A list of features added to the original program, including the names of the modified modules (in reference to point (2) above), brief explanations, and screen captures of the results.
• Responsibilities of each group member (if applicable), including
• The programmer of each added function
• The author of each major section of the report
• The person who has done the survey, group coordination, etc.
Note that your submission must contain all three of the above sub-directories (Program, Demo, and Report). Marks will be deducted if any is missing.