Represent the audio signal as a sequence of features, e.g., Mel-frequency cepstral coefficients (MFCCs).
(I) Develop and evaluate a conventional activity recognition system based on Gaussian mixture models (GMMs): train a separate model for each activity using only that activity's data.
(II) Develop and evaluate a GMM-UBM system: build a ‘general’ GMM (the universal background model, UBM) from the data of all activities, then apply maximum a posteriori (MAP) adaptation with activity-specific data to obtain the model of each activity.
(III) Develop and evaluate a GMM-SVM system: represent each recording as a ‘supervector’ formed by concatenating the means of the adapted GMM components, then classify the supervectors with a support vector machine (SVM).
(IV) Develop and evaluate an ‘i-vector’-based system: start from the ‘supervector’ representation and transform it into an i-vector of reduced dimensionality, which is then used for classification.
Design and perform experimental evaluations using a leave-one-out procedure on a given corpus of audio recordings.
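
As an illustration of the MAP adaptation step in (II) and the supervector in (III), the following is a minimal sketch rather than a prescribed implementation. It assumes a UBM with diagonal covariances that has already been trained, adapts only the component means, and uses a relevance factor of 16; the function names (`map_adapt_means`, `supervector`), the relevance value, and the toy numbers are all assumptions made for this example.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given one frame x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM to the given frames.

    relevance is an assumed value; it controls how strongly the
    adapted means are pulled back towards the UBM means.
    """
    n_comp, dim = len(means), len(means[0])
    counts = [0.0] * n_comp                      # soft frame counts n_k
    sums = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted sums
    for x in frames:
        for k, gamma in enumerate(posteriors(x, weights, means, variances)):
            counts[k] += gamma
            for d in range(dim):
                sums[k][d] += gamma * x[d]
    # With alpha_k = n_k / (n_k + r), the update
    # alpha_k * E_k + (1 - alpha_k) * mu_k equals (sum_k + r * mu_k) / (n_k + r),
    # which also avoids dividing by zero when a component sees no data.
    return [
        [
            (sums[k][d] + relevance * means[k][d]) / (counts[k] + relevance)
            for d in range(dim)
        ]
        for k in range(n_comp)
    ]


def supervector(adapted_means):
    """Concatenate the adapted component means into one flat vector."""
    return [value for mean in adapted_means for value in mean]


# Toy data (hypothetical): a two-component UBM over 1-D features and
# 16 identical frames that lie close to the first component.
ubm_weights = [0.5, 0.5]
ubm_means = [[0.0], [10.0]]
ubm_vars = [[1.0], [1.0]]
frames = [[1.0]] * 16

adapted = map_adapt_means(frames, ubm_weights, ubm_means, ubm_vars)
sv = supervector(adapted)
```

In this toy case practically all frames are assigned to the first component, so its mean moves about halfway from the UBM mean towards the data, while the second component stays essentially at its UBM mean. In a real system the frames would be MFCC vectors and the resulting supervector would be passed to the SVM or used as the starting point for i-vector extraction.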