Using the patient's questionnaire data to screen laryngeal disorders

https://doi.org/10.1016/j.compbiomed.2008.11.008

Abstract

This paper is concerned with soft computing techniques for screening laryngeal disorders based on patients' questionnaire data. By applying genetic search, the most important questionnaire statements are determined and a support vector machine (SVM) classifier is designed for categorizing the questionnaire data into the healthy, nodular and diffuse classes. To explore the obtained automated decisions, curvilinear component analysis (CCA) is applied in the space of decisions as well as in the space of questionnaire statements. When the developed tools were tested on a set of data collected from 180 patients, a classification accuracy of 85.0% was obtained. Bearing in mind the subjective nature of the data, this accuracy is rather encouraging. The CCA yields ordered two-dimensional maps of the data in various spaces, which facilitates the exploration of automated decisions provided by the system and the determination of relevant groups of patients for various comparisons.

Introduction

The diagnostic procedure for laryngeal diseases in clinical practice is rather complex and is based on evaluation of the patient's complaints, history, and data from instrumental as well as histological examination. During the last two decades a variety of imaging techniques for examination of the larynx and objective measurements of voice quality have been developed [1], [2]. Evaluation of the larynx has improved significantly with the establishment of computed tomography (CT) and magnetic resonance imaging (MRI), as these technologies provide insights into endoscopically blind areas and reveal the depth of tumour infiltration. The technologies may be beneficial in staging larynx carcinoma and planning the most appropriate surgical procedure [3], [4], [5], [6]. Ultrasonography is useful in cases of larger laryngeal lesions and may have some role in screening unilateral vocal fold pathologies, although further fine-tuning of the technique may be necessary [7], [8].

Automated acoustic analysis of voice is increasingly used for detecting and screening laryngeal pathologies [9], [10], [11], [12], [13], [14], [15], [16]. According to Hadjitodorov and Mitev, depending on the disease and its stage, the following changes can be observed in the vocalized voice signal in pathological cases [10]:

  • (i) Significant cycle-to-cycle pitch and amplitude perturbations.

  • (ii) Decrease of the voice signal fundamental frequency and amplitude.

  • (iii) Dominance of the first harmonic in the signal spectrum.

  • (iv) Presence of a turbulent noise.

  • (v) Decrease or loss of the harmonics over 1 kHz and presence of sub-harmonics.

  • (vi) Pauses in the pitch period generation.

Attempts to create systems for automated analysis of color laryngeal images have also been made. In [17], a system for automated categorization of manually marked suspect lesions into healthy and diseased classes is presented. The categorization is based on textural features extracted from co-occurrence matrices computed from manually marked areas of vocal fold images. A classification accuracy of 81.4% was reported when testing the system on a very small set of 35 images. A set of 785 laryngeal images has been used in the studies presented in [18], [19]. A classification accuracy of over 94% was achieved when categorizing the images into healthy and two pathological classes. When categorizing the same set of images into seven classes (one healthy and six pathological), a classification accuracy of over 80% was reported [20].

Hanson et al. used color, specifically the average value of the normalized red component r given by Eq. (1), to quantify the degree of erythema [21]:

$$r = \frac{R}{R + G + B} \tag{1}$$

where R, G and B are the three components of the color images recorded by the color CCD camera. Five different areas were manually selected from each laryngeal image to estimate the r component. The value of the r component computed for normal subjects was compared to the r component values computed for patients with chronic laryngitis. It was found that the r values for patients with chronic posterior laryngitis were significantly higher than the r values computed for normal larynges. It is worth mentioning that the camera used was color balanced before each recording. However, variations in illumination, geometry and appearance of vocal folds have not been taken into consideration.
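As a rough illustration of Eq. (1), the sketch below averages the normalized red component over the pixels of one manually selected region. It is not the procedure of [21]: the function name `normalized_red`, the per-pixel averaging, the skipping of black pixels, and the sample pixel values are all assumptions made for this example.

```python
def normalized_red(pixels):
    """Average of r = R / (R + G + B) over a region.

    `pixels` is an iterable of (R, G, B) tuples. Pixels with
    R + G + B == 0 are skipped because r is undefined for them
    (an assumption of this sketch).
    """
    values = []
    for red, green, blue in pixels:
        total = red + green + blue
        if total > 0:
            values.append(red / total)
    if not values:
        raise ValueError("region contains no pixel with non-zero intensity")
    return sum(values) / len(values)


# Hypothetical regions; the values are made up for illustration only.
pale_region = [(120, 100, 100), (130, 110, 105)]
reddish_region = [(180, 80, 70), (200, 90, 85)]

print(normalized_red(pale_region))
print(normalized_red(reddish_region))
```

Because r is a ratio, scaling all three channels by the same factor leaves it unchanged, but it does not compensate for the changes in illumination color or geometry noted above.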

Two-dimensional imaging is successfully used in indirect autofluorescence and fluorescence laryngoscopy [22], [23], [24], [25]. Autofluorescence laryngoscopy is based on the fact that normal cells emit green fluorescence when exposed to blue light, while precancerous or cancerous lesions display a significant loss of green fluorescence and appear reddish. Autofluorescence and fluorescence imaging helps to detect the borders between tumours and healthy tissue. The image analysis procedures used are, however, limited to image visualization. Quantification of the color, texture and shape of lesions and normal tissue could help in more accurate categorization of lesions and in follow-up procedures.

The long-term goal of this work is a decision support system for diagnostics of laryngeal diseases. A voice signal, color images of vocal folds, and questionnaire data are the information sources used in the analysis. The results achieved when using color images of vocal folds for categorizing laryngeal diseases can be found elsewhere [19], [20]. The multiple feature sets based approach to analysis of voice signal and the results achieved by applying the technique for categorization of laryngeal diseases were presented in [16]. This paper is concerned with automated analysis of the questionnaire data applied to screening of laryngeal diseases.

The questionnaire data are an important yet under-exploited information source. The questionnaire data may carry information that is not present in the acoustic or visual modalities. The aim of the work presented in this paper is threefold. First, the most important questionnaire variables (features) for classifying the data into one healthy and two pathological classes are selected. The variable selection results indicate which questionnaire statements are more important for correct classification of patients into the healthy and pathological classes. Second, a classifier for categorizing the questionnaire data into healthy and pathological classes is developed. The categorization results indicate how useful the data are for screening laryngeal pathologies. Observe that variable selection and classifier design are integrated into the same learning process based on genetic search. A support vector machine (SVM) is used as the classifier. Third, the questionnaire data as well as the classifier decisions obtained using the data are explored using a technique for mapping the data onto a two-dimensional space. The curvilinear component analysis (CCA) [26] is used for accomplishing the mapping. Next, we briefly describe the main topics of the work.

The classifier

An SVM is used as a classifier in this work. Assuming that $\Phi(x)$ is the nonlinear mapping of the data point $x$ into the new space, the 1-norm soft margin SVM we use can be constructed by solving the following minimization problem [27]:

$$\min_{w,\,b,\,\gamma,\,\xi} \; -\gamma + C \sum_{i=1}^{N} \xi_i$$

subject to

$$y_i \left( \langle w, \Phi(x_i) \rangle + b \right) \ge \gamma - \xi_i, \quad \xi_i \ge 0, \quad \|w\|^2 = 1, \quad i = 1, \dots, N$$

where $w$ is the weight vector, $y_i = \pm 1$ is the desired output, $N$ is the number of training data points, $\langle \cdot, \cdot \rangle$ stands for the inner product, $\gamma$ is the margin, $\xi_i$ are the slack variables, b is the
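To make the roles of the margin and the slack variables in the minimization problem above more concrete, the following sketch evaluates the objective and checks the constraints for a candidate solution. It does not solve the problem and is not the authors' implementation. It assumes the identity mapping Φ(x) = x; the helper names `objective` and `is_feasible` and the toy data are hypothetical.

```python
def objective(gamma, xi, C):
    """Value of -gamma + C * sum(xi), the quantity being minimized."""
    return -gamma + C * sum(xi)


def is_feasible(w, b, gamma, xi, X, y, tol=1e-9):
    """Check all constraints, assuming Phi(x) = x (identity mapping)."""
    # Constraint ||w||^2 = 1.
    if abs(sum(wk * wk for wk in w) - 1.0) > tol:
        return False
    for x_i, y_i, xi_i in zip(X, y, xi):
        # Constraint xi_i >= 0.
        if xi_i < -tol:
            return False
        # Constraint y_i * (<w, x_i> + b) >= gamma - xi_i.
        score = sum(wk * xk for wk, xk in zip(w, x_i)) + b
        if y_i * score < gamma - xi_i - tol:
            return False
    return True


# Toy, linearly separable data (made up for illustration).
X = [(2.0, 0.0), (3.0, 1.0), (-2.0, 0.0), (-3.0, -1.0)]
y = [1, 1, -1, -1]
w, b = (1.0, 0.0), 0.0

no_slack = [0.0, 0.0, 0.0, 0.0]
print(is_feasible(w, b, 2.0, no_slack, X, y))   # margin 2 fits without slack
print(is_feasible(w, b, 2.5, no_slack, X, y))   # margin 2.5 does not
with_slack = [0.5, 0.0, 0.5, 0.0]
print(is_feasible(w, b, 2.5, with_slack, X, y), objective(2.5, with_slack, C=1.0))
```

In this toy case a larger margin can only be bought with non-zero slack, and the constant C sets the price of that slack in the objective.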

Genetic search

Information representation in a chromosome, generation of initial population, evaluation of population members, selection, crossover, mutation, and reproduction (survival) are the issues to consider when designing a genetic search algorithm.

In our case, a chromosome contains all the information needed to build an SVM classifier. We divide the chromosome into three parts. One part encodes the regularization constant C, one the kernel width parameter σ, and the third one encodes the
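A minimal sketch of a three-part chromosome of this kind is given below. The source text breaks off before describing the third part; since variable selection is integrated into the genetic search, the sketch assumes that the third part is a binary inclusion mask over the questionnaire variables. The binary encoding, the bit widths, the log-scale parameter ranges, the count of 14 variables (taken from a citing excerpt), and the one-point crossover and bit-flip mutation operators are all illustrative assumptions, not the authors' design.

```python
import random

N_BITS_C = 8        # assumed number of bits encoding C
N_BITS_SIGMA = 8    # assumed number of bits encoding sigma
N_FEATURES = 14     # assumed number of questionnaire variables


def bits_to_unit(bits):
    """Map a list of 0/1 bits to a number in [0, 1]."""
    value = int("".join(str(bit) for bit in bits), 2)
    return value / (2 ** len(bits) - 1)


def decode(chromosome, c_range=(-2.0, 3.0), sigma_range=(-2.0, 2.0)):
    """Split a chromosome into (C, sigma, selected feature indices).

    The ranges are log10 exponents, so C lies in [1e-2, 1e3] and
    sigma in [1e-2, 1e2] (hypothetical ranges).
    """
    c_bits = chromosome[:N_BITS_C]
    s_bits = chromosome[N_BITS_C:N_BITS_C + N_BITS_SIGMA]
    mask = chromosome[N_BITS_C + N_BITS_SIGMA:]
    C = 10 ** (c_range[0] + bits_to_unit(c_bits) * (c_range[1] - c_range[0]))
    sigma = 10 ** (sigma_range[0]
                   + bits_to_unit(s_bits) * (sigma_range[1] - sigma_range[0]))
    selected = [i for i, bit in enumerate(mask) if bit == 1]
    return C, sigma, selected


def crossover(parent_a, parent_b, rng):
    """One-point crossover: head of parent_a joined to tail of parent_b."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


rng = random.Random(0)
length = N_BITS_C + N_BITS_SIGMA + N_FEATURES
parent_a = [rng.randint(0, 1) for _ in range(length)]
parent_b = [rng.randint(0, 1) for _ in range(length)]
child = mutate(crossover(parent_a, parent_b, rng), rate=0.02, rng=rng)
print(decode(child))
```

In a full search, each decoded triple would be used to train and score one SVM, and that score would serve as the fitness of the chromosome.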

Mapping data by the CCA

CCA is a nonlinear mapping technique, aiming to map the data in such a way that the local topology is preserved. The mapping is implemented by minimizing a cost function based on matching the inter-point distances in the input and output spaces [26].

Let the Euclidean distances between a pair of data points $(i, j)$ be denoted as $\chi_{ij} = d(x_i, x_j)$ and $\vartheta_{ij} = d(y_i, y_j)$ in the input and the output space, respectively. Then, the cost function minimized to obtain the mapping is given by [26] $E = \frac{1}{2} \sum_{i} \sum_{j \neq i} (\chi_{ij} - \vartheta_{ij})^2 F$
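The sketch below evaluates a cost of this form for a given pair of input and output configurations. The source text breaks off at the factor F, so the sketch assumes that F is a weighting function of the output-space distance. The particular choice used here, a step function that equals 1 within a neighbourhood radius and 0 outside it, is only an assumption, as are the names `step_weight`, `cca_cost` and `radius`. Minimizing the cost is not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((ak - bk) ** 2 for ak, bk in zip(a, b)))


def step_weight(distance, radius):
    """Assumed weighting F: 1 inside the neighbourhood radius, else 0."""
    return 1.0 if distance <= radius else 0.0


def cca_cost(inputs, outputs, radius):
    """E = 1/2 * sum_i sum_{j != i} (chi_ij - theta_ij)^2 * F(theta_ij)."""
    cost = 0.0
    n = len(inputs)
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            chi = euclidean(inputs[i], inputs[j])      # input-space distance
            theta = euclidean(outputs[i], outputs[j])  # output-space distance
            cost += (chi - theta) ** 2 * step_weight(theta, radius)
    return 0.5 * cost


# Three collinear points in 3-D and two candidate 2-D maps (toy data).
inputs = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
faithful_map = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
stretched_map = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]

print(cca_cost(inputs, faithful_map, radius=10.0))
print(cca_cost(inputs, stretched_map, radius=10.0))
print(cca_cost(inputs, stretched_map, radius=1.5))
```

With a small radius, only pairs that are close in the output space contribute, which is how a weighting of this kind favours the preservation of local topology over that of large distances.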

The data

The medical task considered in this paper is the automated categorization of laryngeal disorders, based on query data, into three decision classes: healthy and two pathological classes, namely diffuse and nodular mass lesions of vocal folds [18]. The pathological classes can be characterized as follows. A rather common, clinically discriminative group of laryngeal diseases was chosen for the analysis, i.e. mass lesions of vocal folds. Mass lesions of vocal folds could be categorized into six

Experimental investigations

Data of 180 patients were available for the tests: 21 from the healthy class, 107 from the nodular and 52 from the diffuse class. The data were randomly partitioned into the learning set $S_l$ containing data of 150 patients and the test set $S_t$ with data of 30 patients. In all the tests involving estimation of the classification accuracy, we ran each experiment 20 times with a different random partitioning of the data set into the learning and test subsets. The results presented here are average
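The evaluation protocol described above (20 random partitionings into 150 learning and 30 test patients, with averaging over the runs) can be sketched as follows. Only the class counts and the split sizes come from the text. The function names, the fixed seed, and the majority-class stand-in classifier are assumptions of this example; the stand-in merely makes the sketch runnable and is not the SVM used in the paper.

```python
import random


def repeated_holdout(n_samples, n_learn, n_repeats, evaluate, seed=0):
    """Average `evaluate(learn_idx, test_idx)` over random partitionings."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    accuracies = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        # Copy the slices so each run gets its own index lists.
        learn, test = indices[:n_learn], indices[n_learn:]
        accuracies.append(evaluate(learn, test))
    return sum(accuracies) / len(accuracies)


# Class sizes as reported in the text: 21 healthy, 107 nodular, 52 diffuse.
labels = ["healthy"] * 21 + ["nodular"] * 107 + ["diffuse"] * 52


def majority_baseline(learn, test):
    """Stand-in classifier: always predict the most frequent learning class."""
    counts = {}
    for idx in learn:
        counts[labels[idx]] = counts.get(labels[idx], 0) + 1
    predicted = max(counts, key=counts.get)
    return sum(labels[idx] == predicted for idx in test) / len(test)


average_accuracy = repeated_holdout(180, 150, 20, majority_baseline)
print(f"average accuracy of the majority baseline: {average_accuracy:.3f}")
```

Since 107 of the 180 patients belong to the nodular class, a trivial majority rule of this kind gives a useful reference point when reading accuracies obtained on this data set.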

Conclusions

Tools for screening laryngeal disorders based on the patient's questionnaire data were presented in this paper. The developed tools allow determining the most important questionnaire statements, designing a classifier to categorize the data into the healthy, nodular and diffuse classes, and exploring the obtained decisions and the data as well. The classification accuracy of 85.0% was obtained when testing the developed tools on the set of data collected from 180 patients. All the data coming

Acknowledgment

We gratefully acknowledge the support from the agency for international science and technology development programmes in Lithuania (COST Action 2103).

References (34)

  • A.N. Hasso et al., Magnetic resonance imaging of the pharynx and larynx, Topics in Magnetic Resonance Imaging (1994)

  • G. Schade et al., Sonography of the larynx—an alternative to laryngoscopy?, HNO (2003)

  • B. Boyanov et al., Acoustic analysis of pathological voices. A voice analysis system for the screening of laryngeal diseases, IEEE Engineering in Medicine and Biology Magazine (1997)

  • R.J. Moran et al., Telephony-based voice pathology assessment using automated speech analysis, IEEE Transactions on Biomedical Engineering (2006)

  • K. Umapathy et al., Discrimination of pathological voices using a time–frequency approach, IEEE Transactions on Biomedical Engineering (2005)

  • S. Hadjitodorov et al., Laryngeal pathology detection by means of class-specific neural maps, IEEE Transactions on Information Technology in Biomedicine (2000)

  • J.I. Godino-Llorente et al., Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors, IEEE Transactions on Biomedical Engineering (2004)

Cited by (15)

    • Dysphonia screening in vocally trained and untrained children

      2020, International Journal of Pediatric Otorhinolaryngology
      Citation Excerpt :

      Children can fill the questionnaire individually or with assistance of a parent/teacher. The results of solitary studies indicated that voice-related questionnaire data can also be used for voice screening purposes; however, only scanty attempts in this field have been made [8,9], thus, voice-related questionnaires are still seldom used for screening of laryngeal disorders. The Glottal Function Index (GFI) questionnaire developed and validated by Bach et al., in 2005 represents an easily self-administered and reliable 4-item battery designed to assess the presence and degree of vocal dysfunction in adults [10].

    • Fusing voice and query data for non-invasive detection of laryngeal disorders

      2015, Expert Systems with Applications
      Citation Excerpt :

      Verikas et al. (2009) determined the most important questionnaire statements of 14 available using the genetic search, and used them in an SVM to categorize questionnaire data into the healthy class and two classes of pathologies, nodular and diffuse. When testing the developed tools on data collected from 180 subjects the 85% classification accuracy was obtained (Verikas et al., 2009) and later confirmed on data from 240 subjects (Verikas et al., 2010a). Fusion of image, voice and query modalities by Verikas et al. (2010a) helped to improve classification accuracy to over 98%, but obtaining image data requires an invasive procedure.

    • Questionnaire- versus voice-based screening for laryngeal disorders

      2012, Expert Systems with Applications
      Citation Excerpt :

      By applying genetic operations to the generated offsprings iteratively, it is expected that a generation of good chromosomes leading to a better solution will be created. A detailed description of the GA can be found in Verikas et al. (2009). By using canonical correlation analysis (CCA) (Hair et al., 1998; Hardoon et al., 2004; Kuss & Graepel, 2003), one usually aims answering the following kinds of questions:

    • Combining image, voice, and the patient's questionnaire data to categorize laryngeal disorders

      2010, Artificial Intelligence in Medicine
      Citation Excerpt :

      The questionnaire data may carry information, which is not present in the acoustic or visual modalities. In [20], a genetic search and support vector machine (SVM) based technique to categorize the patient’s questionnaire data was presented. The categorization results provide an indication on usefulness of the data for screening laryngeal pathologies.

    Antanas Verikas is currently holding a professor position at both Halmstad University, Sweden, and Kaunas University of Technology, Lithuania. His research interests include image processing, pattern recognition, artificial neural networks, fuzzy logic and visual media technology. He is a member of the International Pattern Recognition Society, European Neural Network Society, International Association of Science and Technology for Development, Swedish Society of Learning Systems, and a member of the IEEE.

    Adas Gelzinis received the M.S. degree in electrical engineering from Kaunas University of Technology, Lithuania, in 1995. He received the Ph.D. degree in computer science from the same university, in 2000. He is a senior researcher in the Department of Applied Electronics at Kaunas University of Technology. His research interests include artificial neural networks, kernel methods, pattern recognition, signal and image processing, texture classification.

    Marija Bacauskiene is a senior researcher in the Department of Applied Electronics at Kaunas University of Technology, Lithuania. Her research interests include machine learning, image processing, pattern recognition and fuzzy logic. She participated in various research projects and published numerous papers in these areas.

    Virgilijus Uloza is a professor of otorhinolaryngology and head of the Department of Otolaryngology, Kaunas University of Medicine, Lithuania. Areas of his scientific research include acoustic voice analysis, IT in medicine, telemedicine, recognition of medical images, impact of electromagnetic fields on hearing. He is a Board Member of International Association of Phonosurgery, a Board Member of World Voice Consortium, member of European Laryngologic Society and member of European Laryngological Research Group.

    Marius Kaseta is a doctoral student of the Department of Otolaryngology, Kaunas University of Medicine, Lithuania. Areas of his scientific research include acoustic voice analysis, IT in medicine, recognition of medical images. He is a member of Lithuanian Otorhinolaryngological Society and Lithuanian Society for Biomedical Engineering.
