Prof. , School of EE&T Term 3, 2022
ELEC3104: Mini-Project – Cochlear Signal Processing
ELEC3104: Project Outline
✓ This mini project (individual) will focus on understanding and modelling the spectral analyses carried out by
the human cochlea.
TLT – Level 1
▪ Introduction to Human Auditory System and MATLAB coding fundamentals
TLT – Level 2 (Pass Level)
▪ Implementation of a cascaded filter bank model of the cochlea for analysis purposes.
TLT – Level 3 (Credit Level)
▪ Implementation of a cascaded filter bank model of the cochlea for spectral analysis.
TLT – Level 4 (Distinction Level)
▪ Implementation of a cascaded filter bank model of the cochlea for pitch detection of a speech signal.
TLT – Level 5 (High Distinction Level)
▪ Incorporate mechanisms into the cascaded cochlear model that make the cascaded filter bank adaptive.
Additional Information: In addition to the information provided to you in these slides, you are strongly encouraged
to find and view animations and videos that describe the functioning of the peripheral auditory system and the
cochlea in particular. Visualisation in the form of these animations will be very helpful in understanding cochlear
signal processing.
Eg: Cochlear Animation – https://www.youtube.com/watch?v=dyenMluFaUw
TLT – Level 1: Introduction to Human Auditory System
and MATLAB coding fundamentals
Introduction to the Human Auditory System
✓ The human auditory system is responsible for
converting pressure variations caused by the
sound waves that reach the ear into nerve
impulses that are interpreted by the brain.
✓ The Human Auditory System is designed to
assess frequency (pitch) and amplitude
(loudness).
✓ The peripheral auditory system is divided into
the Outer Ear, Middle Ear, and Inner Ear.
✓ The peripheral auditory system and in particular
the cochlea can be viewed as a real-time
spectrum analyser.
✓ The primary role of the cochlea is to transform
the incoming complex sound wave at the ear
drum into electrical signals.
✓ The human ear can respond to minute pressure
variations in the air if they are in the
audible frequency range, roughly 20 Hz – 20 kHz
Sounds          Level
Faint           20 dB (a faint whisper is 30 dB)
Soft (quiet)    40 dB
Moderate        60 dB (normal conversation)
Loud            80 dB (alarm clocks, vacuum cleaners)
Very loud       90 dB (blenders); 110 dB (concerts, car horns)
Uncomfortable   120 dB (jet planes during take-off)
Painful and     130 dB (jackhammers); 140 dB (gunshots)
*Use hearing protection
✓ Over 85 dB for extended periods can cause permanent hearing loss
✓ Zero decibels (0 dB) represent the absolute threshold of human
hearing, below which we cannot hear a sound.
Outer Ear (Air Vibration): A resonator
✓ The pinna surrounds the ear canal and functions
as a sound wave reflector and attenuator.
✓ The sound waves enter a tube-like structure
called the ear canal, which serves as a sound
amplifier.
✓ The sound waves travel through the canal and
reach the eardrum and cause it to vibrate
✓ The length (L) of the human ear canal is 2.8 cm
(and 7 mm in diameter)
✓ Speed of sound (c) = 340.3 m/sec ;
✓ The resonant frequency (f) of the canal, treated as a
tube closed at one end, is
$f = c/(4L) = 340.3/(4 \times 0.028) \approx 3{,}038$ Hz.
✓ The human outer ear is most sensitive at about
3kHz and provides about 20dB (decibels) of
gain to the eardrum at around 3000Hz.
Outer ear is a low-Q bandpass filter
(Representative figure only)
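The quoted resonance can be reproduced with a short calculation. This sketch assumes the ear canal behaves as a uniform tube closed at one end by the eardrum (a quarter-wavelength resonator), which is consistent with the numbers on the slide; the variable names are illustrative.

```matlab
% Quarter-wave resonance of the ear canal (assumed closed-tube model)
c = 340.3;      % speed of sound (m/s), from the slide
L = 0.028;      % ear canal length (m), i.e. 2.8 cm

f_res = c / (4 * L);    % resonant frequency (Hz)
fprintf('Ear canal resonance: %.0f Hz\n', f_res);
```

A longer canal lowers the resonance and a shorter one raises it, so the value near 3 kHz is specific to the 2.8 cm length used here.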
Middle Ear: An Impedance Matcher & an Amplifier
✓ Middle ear transforms the vibrating motion of
the eardrum into motion of the stapes via the
two tiny bones, the malleus and incus .
✓ The pressure of the sound waves on the oval
window is around 25 times higher than on the
eardrum.
✓ Since the sound intensity (𝐼) is proportional (∝)
to the square of the pressure (𝑃²), the sound intensity
increases 625 times (or 28 dB).
✓ Middle ear converts acoustic energy to
mechanical energy and mechanical energy to
hydraulic energy
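As a worked check of the 28 dB figure, assuming the pressure ratio of 25 quoted above and that intensity scales with the square of pressure:

```latex
\frac{I_{\text{oval window}}}{I_{\text{eardrum}}}
  = \left(\frac{P_{\text{oval window}}}{P_{\text{eardrum}}}\right)^{2}
  = 25^{2} = 625,
\qquad
10\log_{10}(625) = 20\log_{10}(25) \approx 27.96\ \text{dB} \approx 28\ \text{dB}
```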
The combined frequency response of the outer
and middle ear is a band-pass response with
its peak near 3 kHz.
(Figure: outer ear and middle ear responses; middle ear gain function)
✓ The inner ear consists of the cochlea responsible for converting the
vibrations of sound waves into electrochemical impulses which are
passed on to the brain via the auditory nerve.
✓ The cochlea is a spiral shaped structure which is about 3.5 cm in length
if uncoiled.
✓ The cochlea is divided along its length by the basilar membrane (BM),
which partitions the cochlea into two fluid canals (scala vestibuli and
scala tympani).
✓ The BM terminates just short of the helicotrema, so there is a passageway
between the scala vestibuli and the scala tympani that equalises the
difference in pressure at the ends of the two scalae.
(Figure: a longitudinal section of an uncoiled cochlea)
Basilar Membrane (Hydro Dynamical process)
✓ The Basilar Membrane varies in width and stiffness along
its length.
✓ At the basal end it is narrow and stiff, whereas towards the
apex it is wider and more flexible.
✓ Each point along the basilar membrane has a
characteristic frequency, 𝑓𝑝(𝑥), to which it is most
responsive.
✓ The maximum membrane displacement occurs at the
basal end for high frequencies (20 kHz) and at the apical
end for low frequencies (70 Hz).
✓ When the vibrations of the eardrum are transmitted by
the middle ear into movement of the stapes, the
resulting pressure differences between the cochlear fluid
chambers generate a travelling wave that propagates
down the cochlea and reaches its maximum amplitude of
displacement on the basilar membrane at a particular
point before slowing down and decaying rapidly.
✓ The location of the maximum amplitude of this travelling
wave varies with the frequency of the eardrum
vibrations
If 𝑥 is the distance of a point on the basilar membrane from
the stapes, then the frequency, 𝑓𝑝(𝑥), that produces a peak
at this point is given by:
$f_p(x) = 20000.0 \times 10^{-0.667x}$ Hz, $0 \le x \le 3.5$ cm
• It is evident that a 20 kHz tone at the stapes will cause the
BM to vibrate at a point 𝑥 = 0.
• A low-frequency tone (about 70 Hz) will excite the BM near x = 3.5 cm (i.e. at
the apical end).
The basilar membrane is a resonant structure that vibrates vertically in
sympathy with pressure variations in the cochlear fluid.
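As a quick numerical check of the place-frequency map, the sketch below evaluates 𝑓𝑝(𝑥) at a few positions and inverts it to find the place for a given tone. The function handle names and the sample positions are illustrative assumptions, not part of the slides. Evaluated as written, the formula gives about 93 Hz at x = 3.5 cm, the same order as the 70 Hz quoted for the apical end.

```matlab
% Place-frequency map of the basilar membrane (formula from the slide)
fp_of_x = @(x) 20000.0 * 10.^(-0.667 * x);     % x in cm -> frequency in Hz
x_of_fp = @(f) -log10(f / 20000.0) / 0.667;    % inverse map: Hz -> cm

x  = [0 1.0 2.0 3.5];       % assumed sample positions along the BM (cm)
fp = fp_of_x(x);            % characteristic frequency at each position (Hz)

x_1k = x_of_fp(1000);       % place that responds best to a 1 kHz tone (cm)

fprintf('x = %.1f cm -> fp = %.1f Hz\n', [x; fp]);
fprintf('A 1 kHz tone peaks at x = %.2f cm\n', x_1k);
```

Because the map is exponential in 𝑥, equal steps along the membrane correspond to equal ratios of frequency rather than equal differences in Hz.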
Basilar Membrane
✓ Different frequencies stimulate different areas of the
basilar membrane
✓ When a tone (single sinusoid) is applied, the
cochlear fluid oscillates in phase with the stimulating
frequency causing a travelling wave pattern of the
vibration on the basilar membrane
✓ There will be one place where the resonant
frequency of the membrane matches the stimulus
frequency and this place will show the maximum
amount of vibration
✓ By measuring vibration at particular points on the
membrane for a range of stimulus frequencies we
can plot the frequency response of each place on
the membrane
✓ The essential function of the basilar membrane is to
act as a frequency analyser (a set of band-pass filters
each responding to a different frequency region)
resolving an input sound at the eardrum into its
constituent frequencies
Organ of Corti
✓ Attached to the basilar membrane and running its entire
length is the organ of Corti, containing some 30,000 sensory
hair cells.
✓ The hairs (cilia) of these cells stick up from the organ of Corti
and are in contact with the overlying tectorial membrane.
✓ There are two types of sensory hair cells:
▪ One row of inner hair cells, whose cilia float freely in the
fluid-filled region called subtectorial space
▪ Three rows of outer hair cells whose cilia are attached to
the tectorial membrane
✓ Most of the afferent fibres (neurons which carry signals to
the brain) come from inner hair cells.
✓ The efferent fibres (which receive signals from the brain) go
mainly to outer hair cells.
✓ When the basilar membrane deflects due to a pressure wave
in the cochlear fluid, the tectorial membrane moves and
shears, which causes the hairs of the outer hair cells to bend
and also causes fluid flow in the subtectorial space.
✓ This in turn triggers the inner hair cells to transmit nerve
impulses along the afferent fibres and eventually to the brain.
✓ The motion of each part of the basilar membrane as
detected by the inner hair cells is transmitted as a neural
description to the brain.
(Figure: a simplified diagram of the human auditory system, showing the inner ear (cochlea), organ of Corti, hair cells and nerve fibres)
Mechanical to Neural Transduction (Electro Chemical)
✓ The mechanical displacement to electrical energy
transduction process takes place in the inner hair
cells.
✓ Bending of the inner hair cell cilia due to basilar
membrane displacement produces a change in the
overall resistance (reduces it) of the inner hair cell,
thus modulating current flow through the hair cell.
✓ The modulation is directly proportional to the
degree of bending of the cilia, and the cilia bend
in one direction only; in effect, a half-wave
rectification of the basilar membrane displacement
takes place.
✓ Bending of the cilia releases neurotransmitter
which passes into synapses of one or more nerve
cells which fire to indicate vibration
✓ The amount of firing is thus related to the amount
of vibration
✓ Since the neurotransmitter is only released when
the cilia are bent in one direction, firing tends to be
in phase with basilar membrane movement
Here bending the inner hair cell cilia is simulated by
charging of the capacitor and returning to the initial
position of the cilia is equivalent to discharging the
capacitor.
Inner Hair Cell model
Mechanical to Neural Transduction (Electro Chemical)
✓ The inner hair cell model is a capacitor model in which the input voltage corresponds to the spatially differentiated membrane
displacement output of the auditory model (second part of the figure below). Bending of the inner hair cell cilia is simulated by
charging the capacitor, and the return of the cilia to their initial position is equivalent to discharging the capacitor.
✓ Spatial differentiation refers to taking the derivative with respect to position (along the basilar membrane); a discrete model is
$\frac{d(i+1) - d(i)}{\Delta x_i}$
where 𝑑(𝑖) is the displacement at the 𝑖th section along the membrane and Δ𝑥𝑖 is the width of the 𝑖th section.
✓ Spatial differentiation of the membrane displacement represents coupling between the cilia of the inner hair cells through the fluid
in the subtectorial space (high-pass filter effect, first part of the figure below).
✓ You will implement a digital model for neural transduction in TLT Level 2 in addition to the transmission line auditory model
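A minimal sketch of the transduction chain described above: spatial differentiation across sections, half-wave rectification, then a capacitor stage. The toy displacement matrix, the section widths and the time constant tau are hypothetical values chosen for illustration. The capacitor is approximated here by a one-pole low-pass filter, which is one possible digital analogue and not necessarily the model expected in TLT Level 2.

```matlab
% Toy inner hair cell model (all numeric values are assumptions)
fs = 16000;                          % sampling frequency (Hz)
t  = (0:159)' / fs;                  % 10 ms of time samples (column vector)

% d(:,i) is the displacement of the i-th membrane section over time
d  = sin(2*pi*500*t) * [1.0 0.8 0.5 0.2];
dx = [0.02 0.02 0.02];               % assumed section widths (cm)

% Spatial differentiation: (d(i+1) - d(i)) / dx(i) for each adjacent pair
s = (d(:, 2:end) - d(:, 1:end-1)) ./ dx;

% Half-wave rectification: cilia respond to bending in one direction only
r = max(s, 0);

% Capacitor stage approximated by a one-pole low-pass filter per section
tau   = 1e-3;                        % assumed time constant (s)
alpha = exp(-1 / (fs * tau));
v = filter(1 - alpha, [1 -alpha], r);   % filters each column independently
```

The output v has one column per pair of adjacent sections and is never negative, reflecting the one-directional response of the cilia.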
Cochlear Modelling: Cascade and Parallel Models
✓ The basic model of the cochlea is a transmission line model
(cascade model) in which the basilar membrane is modelled as a
cascade of 128 low pass filters, notch filters and resonators as
shown above.
✓ Each digital filter section in the model above represents a section
of the basilar membrane (tuned to a specific frequency) with 128
sections representing the entire basilar membrane
(Figure: magnitude response of the cascaded filter bank model)
✓ The peripheral auditory system is often
modelled as a bank of 128 bandpass filters
(auditory filters) with overlapping passbands.
✓ Typically modelled using a finite number of
bandpass filters, equally spaced along the
Basilar Membrane.
Magnitude response of the parallel filter bank model
TLT – Level 1: Learning Activities (MATLAB Coding)
Learning Activity 1: Modelling the Outer Ear and the Middle Ear
✓ The middle ear may be modelled as a cascade of two complex pairs of zeros (to remove very high and very low
frequencies) and one complex pair of poles (to provide low-Q gain at the middle frequencies). The approximate frequency
response of the middle ear can be seen in the figure below.
Assuming a sampling frequency of 16kHz:
(a) Obtain the transfer function of the middle ear filter, by suitably placing poles and zeros on the z-plane. Verify your
results in MATLAB.
(b) Using placement of poles and zeros, estimate a model for the outer ear and cascade it with your previous model of
the middle ear and show using MATLAB that the overall response matches the one shown in this figure.
The combined frequency response of the outer
and middle ear is a band-pass response (the sum of
the two; see the adjacent magnitude response diagram),
with its peak near 3 kHz.
Filter Design: Pole zero placement
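✓ Calculate the digital filter coefficients of the resonant pole and resonant zeros using pole-zero
placement (e.g. see diagram below).
✓ Resonant pole frequency = 𝜃𝑝; radius = 𝑟𝑝; $\theta_p = 2\pi f_p / f_s$; 𝑓𝑠 = 16 kHz (or higher)
✓ Resonant zero frequency = 𝜃𝑧; radius = 𝑟𝑧 (𝑟𝑧 > 𝑟𝑝 and closer to the unit circle); $\theta_z = 2\pi f_z / f_s$
✓ $H_p(z) = \frac{z^2}{z^2 - 2 r_p \cos\theta_p \, z + r_p^2} = \frac{1}{1 - 2 r_p \cos\theta_p \, z^{-1} + r_p^2 z^{-2}}$
✓ $H_p(z) = \frac{1}{1 - b_1 z^{-1} + b_2 z^{-2}}$ (from one section of the digital filter). Equating to the above, we obtain
$b_1 = 2 r_p \cos\theta_p$ and $b_2 = r_p^2$
✓ Similarly, $a_1 = 2 r_z \cos\theta_z$ and $a_2 = r_z^2$ for $H_z(z) = 1 - a_1 z^{-1} + a_2 z^{-2}$
✓ Both transfer functions can be normalised such that DC gain = 1 as follows:
$H_p(z) = \frac{1 - b_1 + b_2}{1 - b_1 z^{-1} + b_2 z^{-2}}$ and $H_z(z) = \frac{1 - a_1 z^{-1} + a_2 z^{-2}}{1 - a_1 + a_2}$
✓ 𝑟𝑝 and 𝑟𝑧 can be calculated approximately as follows:
$r_p \approx 1 - \frac{BW_p}{f_s}\pi$; $r_z \approx 1 - \frac{BW_z}{f_s}\pi$; Q-factors: $Q_p = f_p / BW_p$, $Q_z = f_z / BW_z$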
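The pole-zero placement recipe above can be sketched as follows. The centre frequencies and bandwidths (fp, BWp, fz, BWz) are hypothetical values picked only to exercise the formulas, and the radius approximation r ≈ 1 − π·BW/fs is assumed to hold for narrow bandwidths.

```matlab
% Second-order resonant pole and zero sections by pole-zero placement
fs  = 16000;                 % sampling frequency (Hz)
fp  = 3000;  BWp = 1500;     % assumed pole frequency and bandwidth (Hz)
fz  = 7000;  BWz = 500;      % assumed zero frequency and bandwidth (Hz)

theta_p = 2*pi*fp/fs;   r_p = 1 - pi*BWp/fs;   % pole angle and radius
theta_z = 2*pi*fz/fs;   r_z = 1 - pi*BWz/fs;   % zero angle and radius

b1 = 2*r_p*cos(theta_p);   b2 = r_p^2;         % pole-section coefficients
a1 = 2*r_z*cos(theta_z);   a2 = r_z^2;         % zero-section coefficients

% Coefficient vectors in powers of z^-1, each normalised to unity DC gain
num_p = 1 - b1 + b2;                  den_p = [1 -b1 b2];
num_z = [1 -a1 a2] / (1 - a1 + a2);   den_z = 1;

% Cascade the two sections by polynomial multiplication
num = conv(num_p, num_z);
den = conv(den_p, den_z);

dc_gain     = sum(num) / sum(den);    % H(z) evaluated at z = 1
pole_radius = abs(roots(den_p));      % should equal r_p for both poles
```

Note that the slides use 𝑏1, 𝑏2 for the pole section and 𝑎1, 𝑎2 for the zero section, which is the reverse of the argument order in MATLAB's freqz(b,a). With the vectors above, freqz(num, den, 1024, fs) plots the cascaded response against frequency in Hz.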
Pole – zero plots and magnitude responses of the outer ear
% Outer ear implementation – Learning activity 1
% Using the magnitude response (dotted line) given in slide 15,
% you can observe that the magnitudes at 0 Hz and 10 kHz are close to zero.
% Hence, we place zeros on the real axis and complex zeros close to 10 kHz,
% and therefore choose a sampling frequency of 20 kHz.
% There is also a peak around 2 kHz, so we place a complex conjugate
% pole pair (causing a peak) near 2 kHz (approximately 0.8 + 0.5i and 0.8 - 0.5i).
fs = 20*10^3; % sampling frequency (Hz)
zero_1 = 0.8;
zero_2 = 0.8;
zero_3 = -0.9 + 0.1i;
zero_4 = -0.9 - 0.1i;
pole_1 = 0;
pole_2 = 0;
pole_3 = 0.8 + 0.5i;
pole_4 = 0.8 - 0.5i;
% convert the poles and zeros to numerator and denominator polynomials
zeros_outer_ear = [zero_1 zero_2 zero_3 zero_4];
poles_outer_ear = [pole_1 pole_2 pole_3 pole_4];
b = poly(zeros_outer_ear);
a = poly(poles_outer_ear);
% pole-zero plot and magnitude response
figure;
subplot 211
zplane(b,a);
title('Pole-Zero Plot');
subplot 212
n = 1024; % number of frequency points
[H,w] = freqz(b,a,n);
mag_db = 20*log10(abs(H)); % magnitude in dB is 20*log10|H|
% plot the x axis in log scale; w (rad/sample) is converted to Hz
semilogx(w*fs/(2*pi),mag_db);
title('Magnitude response');
ylabel('Magnitude (dB)');
xlabel('Frequency (Hz)');
sgtitle('Outer ear implementation');
Pole – zero plots and magnitude responses of the middle ear
% Middle ear implementation – Learning activity 1
% (fs is the sampling frequency defined in the outer ear section)
zero_1 = 0.95;
zero_2 = 0.95;
zero_3 = -0.4+0.1i;
zero_4 = -0.4-0.1i;
pole_1 = 0;
pole_2 = 0;
pole_3 = 0.9+0.3i;
pole_4 = 0.9-0.3i;
% convert the poles and zeros to numerator and denominator polynomials
zeros_middle_ear = [zero_1 zero_2 zero_3 zero_4];
poles_middle_ear = [pole_1 pole_2 pole_3 pole_4];
b = poly(zeros_middle_ear);
a = poly(poles_middle_ear);
figure;
subplot 211
zplane(b,a) % plot zplane of middle ear model
title('Pole-Zero Plot');
% compute freq. resp. of middle ear model
n = 1024; % number of frequency points
k0 = 20; % gain factor
[H, w] = freqz(k0*b,a,n);
subplot 212
% plot the x axis in log scale; w (rad/sample) is converted to Hz
semilogx(w*fs/(2*pi),20*log10(abs(H))); % magnitude in dB is 20*log10|H|
xlabel('Frequency (Hz)')
ylabel('Magnitude (dB)')
title('Approximate mag. res. of middle ear');
sgtitle('Middle ear implementation');
% Combined outer ear and middle ear implementation – Learning activity 1
% (uses the zero and pole vectors defined in the two sections above)
zeros_combined = [zeros_outer_ear zeros_middle_ear];
poles_combined = [poles_outer_ear poles_middle_ear];
b = poly(zeros_combined);
a = poly(poles_combined);
figure;
subplot 211
zplane(b,a); % plot zplane of the combined model
title('Pole-Zero Plot');
% compute freq. resp. of the combined model
n = 1024; % number of frequency points
k0 = 100; % gain factor
[H, w] = freqz(k0*b,a,n);
subplot 212
% plot the x axis in log scale; w (rad/sample) is converted to Hz
semilogx(w*fs/(2*pi),20*log10(abs(H))); % magnitude in dB is 20*log10|H|
xlabel('Frequency (Hz)')
ylabel('Magnitude (dB)')
title('Approximate mag. res. of outer & middle ear');
sgtitle('Outer ear & Middle ear implementation');
Combined magnitude responses of the outer & middle ear
Learning Activity 2 : Impulse and Magnitude Responses
✓The impulse response of an auditory filter can be modelled by:
$g(n) = a\,(nT)^{N-1}\, e^{-2\pi b\,(24.7 + 0.108 f_p)\, nT} \cos(2\pi f_p nT)$
where 𝑓𝑝 is the centre frequency, 𝑇 is the sampling period (𝑓𝑠 = 1/𝑇), 𝑛 is the discrete time sample index, 𝑁 is the order of the
filter (𝑁 = 4) and 𝑎 is a constant chosen such that the filter gain at the centre frequency is 0 dB; 𝑏 = 1.14; 𝑓𝑠 = 16,000 Hz.
[Initially, you may choose 𝑎 = 1 and then change the value such that the gain of the filter is normalised to 0 dB at the centre
frequency, 𝑓𝑝.]
✓You are required to calculate the impulse response, 𝑔(𝑛), for four auditory filters of your choice from the low, mid and high
frequency regions of the basilar membrane using the equation {𝑓𝑝(x)} given below in MATLAB.
$f_p(x) = 8000.0 \times 10^{-0.667x}$ Hz, $0.0869$ cm (7 kHz) $\le x \le 2.9985$ cm (80 Hz)
You will notice that the impulse responses have infinite duration, and thus each impulse response will need to be truncated to,
say, 150 to 200 coefficients (i.e., 0 ≤ 𝑛 < 200).
✓Plot the impulse responses of all four filters.
✓Plot the magnitude responses of all four filters.
✓Plot the centre frequency, bandwidth and Q factor for all filters.
Discuss your plots with your lab demonstrator.
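One way to approach the activity for four filters is sketched below. The four positions in x are arbitrary choices spanning the stated range, and the constant 𝑎 is obtained here by evaluating the discrete-time Fourier transform of the truncated response at 𝑓𝑝, which is one possible normalisation and an assumption of this sketch.

```matlab
% Four auditory filter impulse responses, each normalised to 0 dB at fp
fs = 16000;  T = 1/fs;       % sampling frequency (Hz) and period (s)
N  = 4;      b = 1.14;       % filter order and bandwidth constant
n  = 0:199;                  % truncate to 200 coefficients

x  = [0.0869 1.0 2.0 2.9985];        % assumed positions on the BM (cm)
fp = 8000 * 10.^(-0.667 * x);        % centre frequencies (Hz)

g       = zeros(numel(fp), numel(n));  % one impulse response per row
a       = zeros(size(fp));             % normalising constants
gain_dB = zeros(size(fp));             % gain at fp after normalisation

for i = 1:numel(fp)
    % Un-normalised impulse response (a = 1)
    g0 = (n*T).^(N-1) .* exp(-2*pi*b*(24.7 + 0.108*fp(i))*n*T) ...
         .* cos(2*pi*fp(i)*n*T);
    E  = exp(-1j*2*pi*fp(i)*n*T);    % DTFT kernel evaluated at fp(i)
    a(i)   = 1 / abs(sum(g0 .* E));  % scale so that |G(fp)| = 1
    g(i,:) = a(i) * g0;
    gain_dB(i) = 20*log10(abs(sum(g(i,:) .* E)));
end
```

The first and last positions reproduce the 7 kHz and 80 Hz limits of the map, and plot(n*T, g(i,:)) shows how the low-frequency filters decay much more slowly than the high-frequency ones.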
Learning Activity 2 – MATLAB code
% Impulse and magnitude response calculation of an auditory filter
clc; clear all; close all
num_filter = 128; % number of filters
NFFT=1024; % number of FFT points
% 2.9985 and 0.0869 are xmax and xmin (cm)
x_max = 2.9985; x_min = 0.0869;
delta_x = (x_max-x_min)/(num_filter-1);
k = 128:-1:1; % filter index
x = x_min + (k-1)*delta_x; % positions from xmax down to xmin
fp = 8000*10.^(-0.667*x); % centre frequencies (80 Hz up to 7 kHz)
% Auditory filter parameters
fs=16000; % sampling frequency (Hz)
b=1.14; % bandwidth constant
T=1/fs; % sampling period
N=4; % order of filter
n=0:199; % sample index
% filter's impulse response (note the decaying exponential)
g = zeros(length(n), num_filter);
for i=1:num_filter
g(:,i)=((n*T).^(N-1)).*exp(-2*pi*b*(24.7+0.108*fp(i))*n*T).*...
cos(2*pi*fp(i)*n*T);
end
G=fft(g, NFFT); % filter's frequency response in [0 fs]
G=abs(G(1:NFFT/2,:)); % filter's magnitude response in [0 fs/2]
for i=1:num_filter
% normalise each filter so that its peak gain is 1 (0 dB)
g(:,i)=g(:,i)/max(G(:,i));
end
G=fft(g, NFFT); % normalised filter's frequency response in [0 fs]
% normalised filter's magnitude response [0 to fs/2] in dB
G = 20*log10(abs(G(1:NFFT/2,:)));
% est. filter's bandwidth
% i.e., width of freq. region where filter's gain > -3dB
freqHz = fs*(0:NFFT-1)/NFFT; % frequency axis [0 fs]
freqHz = freqHz(1:NFFT/2); % frequency axis [0 fs/2]
for i=1:num_filter
% find frequency index in passband region of filter
pass_band_freqID = find(