EEEM030 Assignment 2: Speech recognition
Assessment: 10% of module
1. Introduction
The second coursework assignment for EEEM030 Speech & audio processing & recognition is designed to give you an opportunity to practise the training algorithms that are essential to any speech recognizer. In doing so, you will implement your own machine learning code and observe the effects of training.
You are asked to submit a written report (1000-1500 words plus an appendix, in PDF format) to the EEEM030 assignment folder in SurreyLearn by the deadline of 4pm on Tuesday 8th January 2019 (week 12). The report should contain a brief description of your method for approaching the task, details of the implementation (including calculations), an analysis of the results (containing the parameter values obtained, graphs of pdfs, and a summary discussing the outcomes) and references. The appendix should contain the results of intermediate calculations and the program code that you implemented yourself.
As with all formally assessed coursework, you may discuss the concepts associated with the coursework with your peers but not the details of any solution that you implement. In line with University policy, you may use other sources, such as textbooks, lecture notes, articles, online tutorials and code libraries, but failure to cite them correctly may be viewed as plagiarism and trigger an academic misconduct investigation. So please reference all your sources carefully!
2. Model
You are asked to perform various calculations using the parameters of a continuous-density hidden Markov model (CD-HMM) with N=3 emitting states. Initial values of the state-transition probability matrix and the output probability density functions (pdfs) are given below to two decimal places in Tables 1 and 2, where μi denotes the one-dimensional (1-D) mean and Σi the variance of each univariate (1-D) Gaussian distribution (i.e., with K=1).
Table 1: Matrix of state-transition probabilities, incorporating the entry, state-transition and exit probabilities, A = {πi, aij, ηi}.
From \ To   Entry   State 1   State 2   State 3   Exit
Entry       0       0.93      0.07      0         0
State 1     0       0.84      0.11      0.05      0
State 2     0       0         0.88      0.08      0.04
State 3     0       0         0         0.91      0.09
Exit        0       0         0         0         0
Table 2: B matrix of parameters defining the states’ output probability densities bi(ot).
State i   Mean μi   Variance Σi
1         1.90      0.16
2         3.40      0.81
3         5.10      0.25
The sequence of 1-D observations to be used for your calculations with the above model is eight time frames long: O = {1.3, 2.3, 2.8, 3.3, 5.0, 5.6, 4.9, 5.9} for t=1..T with T=8.
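The model and observations above can be written down directly as arrays. The following is a minimal Python sketch, assuming NumPy is available. The names pi, A, eta, mu, var, O, B and gaussian_pdf are illustrative choices, not part of the assignment. It stores the values from Tables 1 and 2 and evaluates the Gaussian output densities bi(ot) for every state and time frame.

```python
import numpy as np

# Values copied from Tables 1 and 2 and the observation sequence.
pi = np.array([0.93, 0.07, 0.00])        # entry probabilities (Entry row of Table 1)
A = np.array([[0.84, 0.11, 0.05],
              [0.00, 0.88, 0.08],
              [0.00, 0.00, 0.91]])       # transitions between the emitting states
eta = np.array([0.00, 0.04, 0.09])       # exit probabilities (Exit column of Table 1)
mu = np.array([1.90, 3.40, 5.10])        # state means
var = np.array([0.16, 0.81, 0.25])       # state variances
O = np.array([1.3, 2.3, 2.8, 3.3, 5.0, 5.6, 4.9, 5.9])


def gaussian_pdf(o, mean, variance):
    """Univariate Gaussian density evaluated at o."""
    return np.exp(-0.5 * (o - mean) ** 2 / variance) / np.sqrt(2.0 * np.pi * variance)


# B[i, t] holds b_i(o_t): rows are states, columns are time frames.
B = gaussian_pdf(O[None, :], mu[:, None], var[:, None])

# Sanity check: entry probabilities and each emitting state's outgoing
# probabilities (transitions plus exit) should sum to one.
assert np.isclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1) + eta, 1.0)

print(np.round(B, 4))
```

Keeping the entry and exit probabilities in separate vectors, rather than in one 5x5 matrix, is only a convenience for later calculations; the full matrix of Table 1 can be reassembled from pi, A and eta.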
3. Task
Essentially, this assignment involves performing calculations to train the hidden Markov model according to Expectation-Maximization using the Baum-Welch equations. Initial calculations are performed with the given 1-D observation sequence for all N=3 emitting states and all T=8 time frames, to yield tables of likelihoods. These are used to train the model. The steps are as follows:
1. Draw the state topology and depict graphically the pdfs bi for each state i=1..N over the observation space.
2. Evaluate the output probability densities bi(ot) for states i=1..N at each of the time frames t=1..T; add these points to your graphs from step 1.
3. Using the forward procedure, calculate the forward likelihoods αt(i) for i=1..N and t=1..T and overall likelihood of the observations p(O|λ).
4. Calculate the backward likelihoods βt(i) for i=1..N and t=1..T; confirm p(O|λ)’s value.
5. Using results from steps 3 and 4, calculate the occupation likelihoods γt(i) for i=1..N and t=1..T.
6. Re-estimate the mean μ̂i and variance Σ̂i for i=1..N, using the occupation likelihoods, observations and, for the variance, the previous value of the mean μi for each state.
7. Plot the pdfs after this training iteration using your re-estimated μ̂i and Σ̂i values, and provide comments on the key similarities and differences.
8. Similar to step 5, calculate the transition likelihoods ξt(i,j) for i=1..N+2, j=1..N+2 and t=1..T.
9. Re-estimate the state transition matrix A′ = {πi, aij, ηi} for i=1..N+2 and j=1..N+2, using the transition likelihoods.
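The numerical steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference solution: it assumes NumPy, the usual Baum-Welch conventions for a model with explicit entry and exit probabilities (the forward recursion starts from the entry probabilities and the backward recursion starts from the exit probabilities), and illustrative variable names that do not come from the assignment. As step 6 specifies, the variance update uses the previous mean of each state.

```python
import numpy as np

# Model from Tables 1 and 2 (entry pi, transitions A, exit eta) and observations O.
pi = np.array([0.93, 0.07, 0.00])
A = np.array([[0.84, 0.11, 0.05],
              [0.00, 0.88, 0.08],
              [0.00, 0.00, 0.91]])
eta = np.array([0.00, 0.04, 0.09])
mu = np.array([1.90, 3.40, 5.10])
var = np.array([0.16, 0.81, 0.25])
O = np.array([1.3, 2.3, 2.8, 3.3, 5.0, 5.6, 4.9, 5.9])
T, N = len(O), len(mu)

# b[t, i] = b_i(o_t): univariate Gaussian output densities.
b = np.exp(-0.5 * (O[:, None] - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Forward pass (assumed convention): alpha_1(i) = pi_i b_i(o_1),
# alpha_t(j) = [sum_i alpha_{t-1}(i) a_ij] b_j(o_t), p(O) = sum_i alpha_T(i) eta_i.
alpha = np.zeros((T, N))
alpha[0] = pi * b[0]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * b[t]
p_fwd = alpha[-1] @ eta

# Backward pass (assumed convention): beta_T(i) = eta_i,
# beta_t(i) = sum_j a_ij b_j(o_{t+1}) beta_{t+1}(j).
beta = np.zeros((T, N))
beta[-1] = eta
for t in range(T - 2, -1, -1):
    beta[t] = A @ (b[t + 1] * beta[t + 1])
p_bwd = (pi * b[0]) @ beta[0]   # should match p_fwd

# Occupation likelihoods gamma[t, i].
gamma = alpha * beta / p_fwd

# Transition likelihoods between emitting states, xi[t, i, j], for the
# transition from frame t-1 to frame t (0-based, so xi[0] stays zero).
xi = np.zeros((T, N, N))
for t in range(1, T):
    xi[t] = alpha[t - 1][:, None] * A * (b[t] * beta[t])[None, :] / p_fwd

# Re-estimation. The entry transitions coincide with gamma at the first frame
# and the exit transitions with gamma at the last frame.
occ = gamma.sum(axis=0)
mu_new = gamma.T @ O / occ
var_new = (gamma * (O[:, None] - mu) ** 2).sum(axis=0) / occ  # previous mean, per step 6
pi_new = gamma[0]
A_new = xi.sum(axis=0) / occ[:, None]
eta_new = gamma[-1] / occ

print("p(O|model):", p_fwd, p_bwd)
print("means:", mu_new, "variances:", var_new)
```

The agreement between p_fwd and p_bwd corresponds to the check in step 4. Two further checks are that gamma sums to one over the states at every time frame, and that each row of A_new plus the matching entry of eta_new sums to one.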
You may use any appropriate software package to implement these calculations, such as Python, Matlab or Excel. The equations should be coded or calculated from scratch for this assignment. If you re-use an existing HMM implementation, such as a toolbox, toolkit or downloaded software, you must highlight the lines of code that have been contributed by others and provide the appropriate citation to the source in your references.
4. Assessment criteria
Your assignment mark will be given based on the accuracy of your calculated results and the quality of your report, including method, graphs, analysis and commentary. Specifically, marks will be given based on the quality and correctness of the following items:
20% Plots of state topology and output pdfs
5% Output probability density for each time frame and state
15% Forward likelihoods and overall likelihood of the observations
10% Backward likelihoods
5% Occupation likelihoods
15% Re-estimated means and variances
10% Plots of the pdfs and comments
10% Transition likelihoods
10% Re-estimated A matrix
PJBJ, November 2018