Using the patient's questionnaire data to screen laryngeal disorders
Introduction
The diagnostic procedure for laryngeal diseases in clinical practice is rather complex and is based on evaluation of the patient's complaints and history together with data from instrumental and histological examination. During the last two decades, a variety of imaging techniques for examining the larynx and objective measurements of voice quality have been developed [1], [2]. Evaluation of the larynx has improved significantly with the establishment of computed tomography (CT) and magnetic resonance imaging (MRI), as these technologies provide insight into endoscopically blind areas and reveal the depth of tumour infiltration. They may be beneficial in staging laryngeal carcinoma and planning the most appropriate surgical procedure [3], [4], [5], [6]. Ultrasonography is useful in cases of larger laryngeal lesions and may have some role in screening unilateral vocal fold pathologies, although further fine-tuning of the technique may be necessary [7], [8].
Automated acoustic analysis of voice is increasingly used for detecting and screening laryngeal pathologies [9], [10], [11], [12], [13], [14], [15], [16]. According to Hadjitodorov and Mitev, depending on the disease and its stage, the following changes can be observed in the vocalized voice signal in pathological cases [10]:
- (i) Significant cycle-to-cycle pitch and amplitude perturbations.
- (ii) Decrease of the voice signal fundamental frequency and amplitude.
- (iii) Dominance of the first harmonic in the signal spectrum.
- (iv) Presence of a turbulent noise.
- (v) Decrease or loss of the harmonics over 1 kHz and presence of sub-harmonics.
- (vi) Pauses in the pitch period generation.
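Cycle-to-cycle perturbations such as those in item (i) are commonly summarized by jitter (pitch) and shimmer (amplitude) measures. The following is a minimal sketch of one such measure, the mean absolute difference between consecutive cycles relative to the mean value. It is not taken from [10]; the function name and the sample values are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive cycles, relative to the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical pitch periods (ms) and peak amplitudes for five consecutive cycles.
periods_ms = [8.0, 8.2, 7.9, 8.1, 8.0]
amplitudes = [1.00, 0.95, 1.02, 0.97, 1.01]

jitter = local_perturbation(periods_ms)    # pitch perturbation
shimmer = local_perturbation(amplitudes)   # amplitude perturbation
print(f"jitter = {jitter:.4f}, shimmer = {shimmer:.4f}")
```

A perfectly periodic signal gives a value of zero for both measures, and larger values indicate stronger perturbation.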
Hanson et al. used color, namely the average value of the normalized red component given by Eq. (1), to quantify the degree of erythema [21]:

$$r = \frac{R}{R + G + B}, \qquad (1)$$

where $R$, $G$, and $B$ are the three components of the color images recorded by the color CCD camera. Five different areas were manually selected from each laryngeal image to estimate the $r$ component. The value of the $r$ component computed for normal subjects was compared to the values computed for patients with chronic laryngitis. The values for patients with chronic posterior laryngitis were found to be significantly higher than those computed for normal larynges. It is worth mentioning that the camera was color balanced before each recording. However, variations in illumination, geometry and appearance of the vocal folds were not taken into consideration.
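As an illustration of this kind of color measure, the sketch below averages a normalized red component over one selected region. It assumes the usual chromaticity form, red divided by the sum of the three color components, and assumes that the ratio is averaged per pixel; the details of the procedure in [21] may differ. The function name and pixel values are hypothetical.

```python
def mean_normalized_red(pixels):
    """Average r = R / (R + G + B) over an iterable of (R, G, B) pixels."""
    ratios = []
    for red, green, blue in pixels:
        total = red + green + blue
        if total == 0:
            continue  # skip black pixels, where the ratio is undefined
        ratios.append(red / total)
    if not ratios:
        raise ValueError("no pixels with non-zero intensity")
    return sum(ratios) / len(ratios)


# Hypothetical pixel values from one manually selected region.
region = [(180, 60, 60), (200, 50, 50), (150, 75, 75)]
print(round(mean_normalized_red(region), 3))
```

Because the measure is a ratio, it is insensitive to a uniform scaling of the three channels, but not to changes in the color of the illumination.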
Two-dimensional imaging is successfully used in indirect autofluorescence and fluorescence laryngoscopy [22], [23], [24], [25]. Autofluorescence laryngoscopy is based on the fact that normal cells emit green fluorescence when exposed to blue light, while precancerous or cancerous lesions display a significant loss of green fluorescence and appear reddish. Autofluorescence and fluorescence imaging helps to delineate the borders between tumours and healthy tissue. The image analysis procedures used are, however, limited to image visualization. Quantification of the color, texture and shape of lesions and normal tissue could help in more accurate categorization of lesions and in follow-up procedures.
The long-term goal of this work is a decision support system for diagnostics of laryngeal diseases. A voice signal, color images of vocal folds, and questionnaire data are the information sources used in the analysis. The results achieved when using color images of vocal folds for categorizing laryngeal diseases can be found elsewhere [19], [20]. The multiple feature sets based approach to voice signal analysis, and the results achieved by applying it to the categorization of laryngeal diseases, were presented in [16]. This paper is concerned with automated analysis of the questionnaire data applied to screening of laryngeal diseases.
Questionnaire data are an important but under-exploited information source. They may carry information that is not present in the acoustic or visual modalities. The aim of the work presented in this paper is threefold. First, the most important questionnaire variables (features) are selected for classifying the data into one healthy and two pathological classes. The variable selection results indicate which questionnaire statements are more important for correct classification of patients into the healthy and pathological classes. Second, a classifier for categorizing the questionnaire data into healthy and pathological classes is developed. The categorization results indicate how useful the data are for screening laryngeal pathologies. Observe that variable selection and classifier design are integrated into the same learning process based on genetic search. A support vector machine (SVM) is used as the classifier. Third, the questionnaire data, as well as the classifier decisions obtained using the data, are explored using a technique for mapping the data onto a two-dimensional space. Curvilinear component analysis (CCA) [26] is used to accomplish the mapping. Next, we briefly describe the main topics of the work.
Section snippets
The classifier
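An SVM is used as a classifier in this work. Assuming that $\phi(\mathbf{x})$ is the nonlinear mapping of the data point $\mathbf{x}$ into the new space, the 1-norm soft margin SVM we use can be constructed by solving the following minimization problem [27]:

$$\min_{\mathbf{w}, b, \gamma, \boldsymbol{\xi}} \; -\gamma + C \sum_{i=1}^{N} \xi_i$$

subject to

$$y_i \left( \langle \mathbf{w}, \phi(\mathbf{x}_i) \rangle + b \right) \ge \gamma - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, N, \quad \|\mathbf{w}\|^2 = 1,$$

where $\mathbf{w}$ is the weight vector, $y_i$ is the desired output ($y_i = \pm 1$), $N$ is the number of training data points, $\langle \cdot, \cdot \rangle$ stands for the inner product, $\gamma$ is the margin, $\xi_i$ are the slack variables, is the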
Genetic search
Information representation in a chromosome, generation of initial population, evaluation of population members, selection, crossover, mutation, and reproduction (survival) are the issues to consider when designing a genetic search algorithm.
In our case, a chromosome contains all the information needed to build an SVM classifier. We divide the chromosome into three parts. One part encodes the regularization constant $C$, one the kernel width parameter $\sigma$, and the third one encodes the
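A three-part chromosome of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the encoding used in this work: the binary representation, the 8-bit resolution, the logarithmic parameter ranges, the number of questionnaire variables, and in particular the interpretation of the third part as a feature inclusion mask are all assumed here, and every name is hypothetical.

```python
import random

N_FEATURES = 14      # number of questionnaire variables; assumed for illustration
BITS_PER_PARAM = 8   # hypothetical resolution of each real-valued parameter


def bits_to_value(bits, low, high):
    """Map a list of 0/1 bits to a value on a log scale between low and high."""
    integer = int("".join(map(str, bits)), 2)
    fraction = integer / (2 ** len(bits) - 1)
    return low * (high / low) ** fraction


def decode(chromosome):
    """Split a bit list into regularization constant, kernel width and feature mask."""
    c_bits = chromosome[:BITS_PER_PARAM]
    sigma_bits = chromosome[BITS_PER_PARAM:2 * BITS_PER_PARAM]
    mask_bits = chromosome[2 * BITS_PER_PARAM:]  # assumed: one bit per variable
    return {
        "C": bits_to_value(c_bits, 1e-2, 1e3),
        "sigma": bits_to_value(sigma_bits, 1e-2, 1e2),
        "selected": [i for i, bit in enumerate(mask_bits) if bit == 1],
    }


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


rng = random.Random(0)
chromosome = [rng.randint(0, 1) for _ in range(2 * BITS_PER_PARAM + N_FEATURES)]
print(decode(chromosome))
print(decode(mutate(chromosome, 0.05, rng)))
```

With such an encoding, a single genetic operation can change the hyperparameters and the selected variables at the same time, which is what allows variable selection and classifier design to share one search.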
Mapping data by the CCA
CCA is a nonlinear mapping technique, aiming to map the data in such a way that the local topology is preserved. The mapping is implemented by minimizing a cost function based on matching the inter-point distances in the input and output spaces [26].
Let the Euclidean distances between a pair of data points $(i, j)$ be denoted as $X_{ij}$ and $Y_{ij}$ in the input and the output space, respectively. Then, the cost function minimized to obtain the mapping is given by [26]
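For orientation, the CCA cost is commonly written in the literature as half the sum, over all ordered pairs of distinct points, of the squared difference between the input-space and output-space distances, weighted by a decreasing function of the output-space distance. The sketch below evaluates a cost of that form. The step-function weighting with neighbourhood radius `lam` is an assumption, as are all names; the minimization itself is not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as sequences of coordinates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cca_cost(inputs, outputs, lam):
    """0.5 * sum over i != j of (X_ij - Y_ij)^2 * F(Y_ij, lam)."""
    cost = 0.0
    for i in range(len(inputs)):
        for j in range(len(inputs)):
            if i == j:
                continue
            x_ij = euclidean(inputs[i], inputs[j])    # input-space distance
            y_ij = euclidean(outputs[i], outputs[j])  # output-space distance
            weight = 1.0 if y_ij <= lam else 0.0      # assumed step weighting
            cost += 0.5 * (x_ij - y_ij) ** 2 * weight
    return cost


# Hypothetical 3-D points and a candidate 2-D mapping.
points_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 2.0, 1.0)]
points_2d = [(0.0, 0.0), (1.2, 0.0), (0.0, 2.1)]
print(cca_cost(points_3d, points_2d, lam=3.0))
```

Because the weight depends on the output-space distance, only pairs that end up close together in the map contribute, which is how the local topology is favoured over the global one.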
The data
The medical task considered in this paper concerns the query data based automated categorization of laryngeal disorders into three decision classes: healthy and two pathological classes, namely diffuse and nodular mass lesions of vocal folds [18]. The pathological classes can be characterized as follows. A rather common, clinically discriminative group of laryngeal diseases was chosen for the analysis, i.e. mass lesions of vocal folds. Mass lesions of vocal folds could be categorized into six
Experimental investigations
Data of 180 patients were available for the tests: 21 from the healthy class, 107 from the nodular and 52 from the diffuse class. The data were randomly partitioned into a learning set containing data of 150 patients and a test set with data of 30 patients. In all the tests involving estimation of the classification accuracy, we ran the experiment 20 times with different random partitionings of the data set into the learning and test subsets. The results presented here are average
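The repeated random partitioning described above can be sketched as follows. The class counts, set sizes and number of runs are those stated in the text; the unstratified shuffle, the function names, and the majority-class evaluator, which only stands in for training and testing a real classifier, are assumptions made for illustration.

```python
import random

CLASS_COUNTS = {"healthy": 21, "nodular": 107, "diffuse": 52}
N_LEARN, N_RUNS = 150, 20


def random_partition(n_items, n_learn, rng):
    """Shuffle the indices and split them into learning and test subsets."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    return indices[:n_learn], indices[n_learn:]


def average_accuracy(evaluate, n_items, rng):
    """Average test accuracy over N_RUNS random learning/test partitions."""
    scores = []
    for _ in range(N_RUNS):
        learn_idx, test_idx = random_partition(n_items, N_LEARN, rng)
        scores.append(evaluate(learn_idx, test_idx))
    return sum(scores) / len(scores)


labels = [name for name, count in CLASS_COUNTS.items() for _ in range(count)]


def majority_baseline(learn_idx, test_idx):
    """Placeholder evaluator: predict the most frequent learning-set class."""
    learn_labels = [labels[i] for i in learn_idx]
    majority = max(set(learn_labels), key=learn_labels.count)
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)


rng = random.Random(0)
print(average_accuracy(majority_baseline, len(labels), rng))
```

A baseline of this kind is a useful reference point, since with 107 of 180 patients in the nodular class a trivial classifier already achieves well above one-third accuracy.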
Conclusions
Tools for screening laryngeal disorders based on the patient's questionnaire data were presented in this paper. The developed tools make it possible to determine the most important questionnaire statements, to design a classifier that categorizes the data into the healthy, nodular and diffuse classes, and to explore the obtained decisions and the data. A classification accuracy of 85.0% was obtained when testing the developed tools on the set of data collected from 180 patients. All the data coming
Acknowledgment
We gratefully acknowledge the support from the agency for international science and technology development programmes in Lithuania (COST Action 2103).
References (34)
- et al., Perceptual and acoustic assessment of voice pathology and the efficacy of endolaryngeal phonomicrosurgery, Journal of Voice (2005)
- et al., Reliability of interpretation of CT examination of the larynx in patients with glottic laryngeal carcinoma, Archives of Otolaryngology—Head & Neck Surgery (2006)
- et al., The potential role of ultrasound in differentiating solid and cystic swellings of the true vocal fold, Journal of Voice (2004)
- et al., A computer system for acoustic analysis of pathological voices and laryngeal diseases screening, Medical Engineering & Physics (2002)
- et al., Automated speech analysis applied to laryngeal disease categorization, Computer Methods and Programs in Biomedicine (2008)
- et al., Towards a computer-aided diagnosis system for vocal cord diseases, Artificial Intelligence in Medicine (2006)
- et al., Multiple feature sets based categorization of laryngeal images, Computer Methods and Programs in Biomedicine (2007)
- M.F. Mafee, G.E. Valvassori, M. Becker, Imaging of the Neck and Head, second ed., Thieme,...
- et al., Imaging in head and neck cancer, Current Treatment Options in Oncology (2006)
- et al., Imaging diagnostics of the pharynx and larynx, Radiologe (2005)
- Magnetic resonance imaging of the pharynx and larynx, Topics in Magnetic Resonance Imaging
- Sonography of the larynx—an alternative to laryngoscopy?, HNO
- Acoustic analysis of pathological voices. A voice analysis system for the screening of laryngeal diseases, IEEE Engineering in Medicine and Biology Magazine
- Telephony-based voice pathology assessment using automated speech analysis, IEEE Transactions on Biomedical Engineering
- Discrimination of pathological voices using a time–frequency approach, IEEE Transactions on Biomedical Engineering
- Laryngeal pathology detection by means of class-specific neural maps, IEEE Transactions on Information Technology in Biomedicine
- Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors, IEEE Transactions on Biomedical Engineering
Antanas Verikas is currently holding a professor position at both Halmstad University, Sweden, and Kaunas University of Technology, Lithuania. His research interests include image processing, pattern recognition, artificial neural networks, fuzzy logic and visual media technology. He is a member of the International Pattern Recognition Society, European Neural Network Society, International Association of Science and Technology for Development, Swedish Society of Learning Systems, and a member of the IEEE.
Adas Gelzinis received the M.S. degree in electrical engineering from Kaunas University of Technology, Lithuania, in 1995. He received the Ph.D. degree in computer science from the same university in 2000. He is a senior researcher in the Department of Applied Electronics at Kaunas University of Technology. His research interests include artificial neural networks, kernel methods, pattern recognition, signal and image processing, and texture classification.
Marija Bacauskiene is a senior researcher in the Department of Applied Electronics at Kaunas University of Technology, Lithuania. Her research interests include machine learning, image processing, pattern recognition and fuzzy logic. She has participated in various research projects and published numerous papers in these areas.
Virgilijus Uloza is a professor of otorhinolaryngology and head of the Department of Otolaryngology, Kaunas University of Medicine, Lithuania. His areas of scientific research include acoustic voice analysis, IT in medicine, telemedicine, recognition of medical images, and the impact of electromagnetic fields on hearing. He is a Board Member of the International Association of Phonosurgery, a Board Member of the World Voice Consortium, and a member of the European Laryngologic Society and the European Laryngological Research Group.
Marius Kaseta is a doctoral student in the Department of Otolaryngology, Kaunas University of Medicine, Lithuania. His areas of scientific research include acoustic voice analysis, IT in medicine, and recognition of medical images. He is a member of the Lithuanian Otorhinolaryngological Society and the Lithuanian Society for Biomedical Engineering.