Dept. of Computer Science and Engineering
Oregon Graduate Institute of Science & Technology
Formants are the resonant frequencies of the vocal tract. As the vocal tract moves to different positions to produce different sounds, the formant frequencies change correspondingly. Estimates of the lowest three formant frequencies can give important information about the phoneme produced. Because the vocal tract position changes, the frequency ranges of the formants overlap, so frequency alone does not identify which formant a spectral peak belongs to. We investigate the ability of neural network classifiers to learn the important distinctions between the formants and to assign the appropriate formant labels. We used both spoken letters of the English alphabet and continuous speech. Our back-propagation network is trained with conjugate gradient optimization. We first experimentally determined the best feature set, guided by the features that human labelers use. We then experimentally determined the best representation of those features and the best network configuration. Representation questions include how the features are derived and whether location is indexed absolutely or relatively. Configuration questions include network size and the presentation and labeling of the feature vectors. We compare the system's performance with that of other published algorithms and with human performance; it compares favorably to both.
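To make the representation question concrete, the following is a minimal illustrative sketch, not the method used in this work. It assumes a spectral slice is given as a list of amplitudes at evenly spaced frequency bins, picks local maxima as candidate formant peaks, and encodes their locations either absolutely (frequency in Hz) or relatively (distance from the previous peak). The function names, the bin spacing, and the sample slice are all hypothetical.

```python
# Illustrative sketch only: the feature set actually used in this work is not
# specified here. All names and values below are assumptions.

def find_peaks(slice_amplitudes, bin_hz):
    """Return (frequency_hz, amplitude) for each local maximum in the slice."""
    peaks = []
    for i in range(1, len(slice_amplitudes) - 1):
        rises = slice_amplitudes[i] > slice_amplitudes[i - 1]
        holds = slice_amplitudes[i] >= slice_amplitudes[i + 1]
        if rises and holds:
            peaks.append((i * bin_hz, slice_amplitudes[i]))
    return peaks


def absolute_locations(peaks):
    """Absolute indexing: each peak is described by its own frequency."""
    return [freq for freq, _ in peaks]


def relative_locations(peaks):
    """Relative indexing: each peak is described by its distance from the
    previous peak (the first peak is measured from 0 Hz)."""
    locations = []
    previous = 0
    for freq, _ in peaks:
        locations.append(freq - previous)
        previous = freq
    return locations


# Hypothetical spectral slice with 250 Hz between bins.
example_slice = [10, 20, 35, 25, 15, 30, 40, 28, 18, 22, 33, 21, 12]
example_peaks = find_peaks(example_slice, bin_hz=250)

print(absolute_locations(example_peaks))  # [500, 1500, 2500]
print(relative_locations(example_peaks))  # [500, 1000, 1000]
```

Either encoding could be presented to a classifier as part of a feature vector; which one works better is the kind of question the experiments address.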
Rooker, Terry, "Formant estimation from a spectral slice using neural networks" (1990). Scholar Archive. 151.