Abstract in another language
This doctoral thesis develops new analysis techniques for highly disturbed voices, such as pathological ones. Most currently available analysis schemes rely on methods developed for normal, healthy voicing conditions. However, severe jitter, shimmer, noise and other forms of disturbance render these methods inappropriate for the analysis of pathological voices. The thesis is organized as follows. Chapter 1 analyzes the most important physiological and morphological elements of speech production. Functional alterations of these elements due to pathology introduce severe disturbances (glottal-induced noise, jitter and shimmer) into the radiated speech signal. Under this view, the analysis of disordered speech consists of: 1) estimation of the Vocal Tract parameters and inversion of the speech signal into the glottal excitation, 2) derivation of speech and glottal disturbances, and 3) selection of important measures for the description of pathological voices. Chapter 2 addresses the estimation of the Vocal Tract filter parameters in the presence of noisy glottal function. Exponentially Damped Sinusoids (EDS) analysis is used for this purpose, together with a new hybrid technique (MHOS-DEPE) of Higher-Order Statistics and an improved classical EDS analysis method. The examined methods are tested in parallel on synthetic signals. The MHOS-DEPE method is more efficient at lower SNRs and for longer data records. The EDS analysis techniques perform considerably better than classical Linear Prediction. Based on this framework, a new method (EDS-IF) is introduced for the exact estimation of the Inverse Vocal Tract Filter and for Inverse Filtering. The new method performs an exhaustive search for the Vocal Tract parameters in regions of previously detected formant frequencies. 
The sub-band analysis and the lower-order estimation incorporated in the proposed method greatly improve the efficiency of the Inverse Filter construction and the glottal excitation estimates. Chapter 3 deals with the estimation of disturbances in the radiated speech signal. A new method (EDS-SNRio) for the separation of the noise component is proposed. The method exploits the previous EDS analysis schemes, and the resulting SNR estimates avoid the artifacts that classical methods exhibit due to the high inharmonicity caused by jitter/shimmer disturbances. The Waveform Matching (WM) technique is adopted for the estimation of objective measures of pitch and amplitude perturbation. Chapter 4 covers disturbance estimation in the glottal excitation. A denoising algorithm based on the Discrete Cosine Transform and EDS analysis is applied on a per-period basis. The proposed algorithm overcomes the SNR estimation problems of classical methods and performs better than wavelet-based and low-pass filtering denoising. Again, the WM technique is adopted. Additionally, a new family of jitter measures is introduced. These are based on correlation estimation of the fundamental period from energy-thresholded portions of consecutive glottal cycles. The new indices may prove useful in determining possible non-linear behavior of the disturbance phenomena. Finally, a new spectrographic representation of the glottal disturbances is introduced under the term "Disturbogram". The Disturbogram and its underlying analytical method separate jitter, shimmer and noise in the glottal signal and offer quantitative and qualitative information about them. Its usefulness in clinical voice evaluation is demonstrated with both synthetic and real voice signals. Chapter 5 describes the initial development of the first Greek Voice Pathology database. A complete protocol for recording and voice screening procedures is introduced. 
An initial sample of 50 clinical patients of the University ORL Clinic of the AHEPA Hospital, Thessalonica, is collected and used for further voice analysis. Chapter 6 presents the acoustic analysis of the recorded pathological voices and the selection of important descriptive features. The inclusion of both speech and glottal disturbance indices confirms previously published findings about the range of vocal dysfunction. Rank correlation analysis, Principal Component Analysis and Mutual Information are employed to select appropriate indices and to determine independent measures for the description of pathological voices. The selected disturbance indices may be grouped into independent axes in a way that reflects their functional origin (e.g. glottal vs. speech signals) rather than their quantitative distinction (e.g. jitter, shimmer, noise, etc.). The Voice Component Profile (VCP) is a new graphic representation of the derived grouping of acoustic measures. For a normalized Euclidean distance measure, the VCP proves 15% more efficient than the Hoarseness Diagram in discriminating voice polyps from the rest of the recorded pathologies. Similar findings are obtained for the discrimination of normal and pathological voices. Finally, Chapter 7 reviews the objectives and findings of the thesis and comments on future research directions.
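
For readers unfamiliar with the perturbation measures named throughout the abstract, the following is a minimal sketch of the conventional "local" jitter and shimmer indices, computed from per-cycle periods and peak amplitudes. It is not the thesis's Waveform Matching or correlation-based method; the function name `local_perturbation` and the sample values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def local_perturbation(values):
    """Mean absolute difference between consecutive cycle values,
    divided by the mean value (a dimensionless ratio).

    Applied to cycle periods this is the textbook local jitter;
    applied to cycle peak amplitudes it is the local shimmer.
    This is an illustrative definition, not the thesis's estimator.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


# Hypothetical per-cycle measurements (not data from the thesis).
periods_ms = [8.0, 8.2, 7.9, 8.1]
peak_amplitudes = [1.0, 0.9, 1.1, 1.0]

jitter = local_perturbation(periods_ms)
shimmer = local_perturbation(peak_amplitudes)
print(f"local jitter  = {100 * jitter:.2f} %")
print(f"local shimmer = {100 * shimmer:.2f} %")
```

A perfectly periodic signal gives zero for both indices. Measures of this kind assume that cycle boundaries can be found reliably, which is the assumption that breaks down for the highly disturbed voices the thesis targets.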