Speech Processing Tutorials

posted Mar 4, 2018, 6:30 PM by MUHAMMAD MUN`IM AHMAD ZABIDI   [ updated Mar 4, 2018, 9:04 PM ]

MFCC Procedure

"Psychophysical studies have shown that human perception of the frequency content of sounds, either for pure tones or for speech signals, does not follow a linear scale. This research has led to the idea of defining subjective pitch of pure tones. Thus for each tone with an actual frequency, f, measured in Hz, a subjective pitch is measured on a scale called ''Mel'' scale. As a reference point, the pitch of a 1kHz tone, 40 dB above the perceptual hearing threshold, is defined as 1000 mels. Other subjective pitch values are obtained by adjusting the frequency of a tone such that it is half or twice the perceived pitch of a reference tone (with a known mel frequency). Feature extraction based on Mel Frequency Cepstral Coefficients (MFCC) utilizes the filter bank of which center frequency and bandwidth are scaled by subjective measure, Mel."

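The quoted passage defines the mel scale perceptually and gives no formula. As a rough sketch, the Python below uses one widely used analytic approximation, m = 2595 * log10(1 + f / 700). This formula is an assumption brought in for illustration, not something stated in the quote, and other variants exist. The function names hz_to_mel, mel_to_hz and mel_center_frequencies are hypothetical helpers, not part of any library.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (assumed approximation, see text)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m: float) -> float:
    """Invert hz_to_mel: convert a mel value back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(f_low: float, f_high: float, n_filters: int) -> list:
    """Return n_filters center frequencies (Hz) equally spaced on the mel scale.

    The two band edges f_low and f_high are excluded, so the centers lie
    strictly between them.
    """
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # The 1 kHz reference tone should land close to 1000 mels.
    print(hz_to_mel(1000.0))
    # Centers are evenly spaced in mels, so their spacing in Hz grows.
    print(mel_center_frequencies(0.0, 8000.0, 10))
```

Under this approximation, hz_to_mel(1000.0) comes out just under 1000, which agrees with the reference point in the quote. Spacing the centers evenly in mels packs them closely at low frequencies and spreads them out at high frequencies, which is the sense in which the filter bank is "scaled by" the mel measure.
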
The Mel filterbank