Elsevier

Neurocomputing

Volume 41, Issues 1–4, October 2001, Pages 47-65
An expert network architecture for learning appropriate further questions based on incomplete data

https://doi.org/10.1016/S0925-2312(00)00347-7

Abstract

We describe a neural network architecture for learning which (if any) further questions are necessary to make a correct diagnosis, given a set of known preliminary inputs. Question evaluation subnetworks learn when further questions are necessary, through positive feedback from misdiagnoses due to lack of necessary information and negative feedback representing the cost of that information. These activate cutoff units which allow the network to simulate the effects of asking or not asking further questions. The network learns to ask only those further questions necessary to prevent error, and may also be used to reach final conclusions once those questions have been answered.

Section snippets

Introduction and background

In general, human experts reach conclusions without having to know everything about a problem; instead, they usually form a hypothesis based on the information readily at hand. If necessary, they may then ask for further specific information to confirm or disprove that hypothesis.

Consider a doctor dealing with a patient complaining of a fever and sore throat. If these symptoms are enough information to make a diagnosis, then some treatment will be recommended. Otherwise, the doctor will gather

Learning appropriate further questions

An alternative approach to this problem is to learn which (if any) further questions are necessary in what situations, based on a set of training examples. Since learning is the major strength of the neural network paradigm, this is a natural approach. In this section, we describe how this idea affects the nature of the inputs, outputs, and training examples.

An architecture for learning further questions

In this section we describe a neural network architecture [13] that learns which (if any) further questions are necessary to reduce output error, on the basis of the current preliminary symptoms. To illustrate this, Fig. 1 gives a general description of such a network for an unspecified number of inputs and outputs. Fig. 4 gives a description of this kind of network for a specific problem.

As mentioned above, the input units of the network are divided into:

  • Preliminary symptoms (P1…Pn).

  • Further

Training examples

We assume that all training examples have “complete” input information, including all values for both preliminary symptom inputs and further question inputs. This is crucial to learning the effects of having/not having specific further question data. While many training sets do not have complete attribute values for all examples, most do at least have a representative subset of examples which are complete.
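As a concrete illustration of this preprocessing step, the complete-information subset can be extracted with a simple filter (the dict-of-attributes record layout and the `None`-for-missing convention are assumptions of this sketch, not the paper's notation):

```python
# Keep only training examples with no missing attribute values.
# Records are dicts mapping attribute names to values; None marks
# a missing value (an assumed convention for this sketch).

def complete_examples(records):
    return [r for r in records if all(v is not None for v in r.values())]

data = [
    {"fever": 1, "sore_throat": 1, "strep_test": 1, "diagnosis": "strep"},
    {"fever": 1, "sore_throat": 0, "strep_test": None, "diagnosis": "flu"},
]
print(complete_examples(data))  # only the first, fully specified record survives
```

Training then proceeds on the filtered subset, which is why a representative set of complete examples suffices even when the full data set has gaps.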

This means that each training example will contain the following:

  • All values for
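The list above is cut off in the snippet, but from the surrounding text each complete training example pairs all preliminary-symptom values and all further-question values with the correct final diagnosis. A minimal sketch (field and attribute names are illustrative, not the paper's):

```python
from dataclasses import dataclass

@dataclass
class TrainingExample:
    preliminary: dict  # values for all preliminary-symptom inputs P1..Pn
    further: dict      # values for all further-question inputs
    diagnosis: str     # correct final diagnosis for this example

# One hypothetical complete example:
ex = TrainingExample(
    preliminary={"age": 1, "tear_production": 0},
    further={"astigmatism": 1},
    diagnosis="none",
)
```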

Applying the network to problems

Once the network has been trained, it can be used as part of a two-stage process for determining which further questions are necessary, based on the values of the preliminary symptoms, and then for making a final diagnosis based on answers to those questions.
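The two stages might be wired together as follows; the `question_scores` and `classify` callables stand in for the trained question-evaluation subnetworks and the diagnosis network, and the threshold value is an assumption of this sketch:

```python
def needed_questions(preliminary, question_scores, threshold=0.5):
    """Stage 1: return the further questions whose evaluation
    score exceeds the asking threshold (assumed decision rule)."""
    scores = question_scores(preliminary)
    return [q for q, s in scores.items() if s > threshold]

def final_diagnosis(preliminary, answers, classify):
    """Stage 2: rerun the network with the answers filled in."""
    return classify({**preliminary, **answers})

# Toy stand-ins for the trained network:
toy_scores = lambda p: {"strep_test": 0.9 if p["sore_throat"] else 0.1}
toy_classify = lambda x: "strep" if x.get("strep_test") else "viral"

asked = needed_questions({"fever": 1, "sore_throat": 1}, toy_scores)
print(asked)   # ['strep_test']
diag = final_diagnosis({"fever": 1, "sore_throat": 1},
                       {"strep_test": 1}, toy_classify)
print(diag)    # 'strep'
```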

Example: Contact lens prescription

This system was first tested with a database for fitting contact lenses [1], acquired from the UCI machine learning archives [8]. The database consisted of 24 training examples, 4 inputs, and 3 possible diagnoses, as shown in Fig. 5.

This database was chosen because the input features were primarily binary-valued, and because the relationships between those features were complex enough to pose a challenge to the system. The first two features, age and tear production, were chosen as preliminary

More complex experiments

An additional artificial problem was created to further test the ability of the algorithm to scale up to a larger number of inputs, as well as to learn from an incomplete set of training examples.

The problem consisted of 12 preliminary symptoms (labeled A–L), 8 further questions (labeled M–T), and a single output. The behavior to be learned consisted of the following complex set of rules:

1. A → out.
2. ¬B → out.
3. C ∧ M → out.
4. D ∧ ¬N → out.
5. (E ∨ ¬F) ∧ O → out.
6. (G ∨ ¬H) ∧ P → out.
7. (I ∨ ¬J) ∧ (Q ∨ ¬R) → out.
8. (K ∨ ¬L) ∧ (S ∨ ¬T) → out.
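Reading juxtaposition as conjunction, ⌝ as negation, and the parenthesized pairs as disjunctions (the last is an assumed reading where the notation is ambiguous), the rule set can be written as a Boolean target function for generating complete training examples:

```python
def out(v):
    """Target output for one assignment v of the 20 inputs A..T.
    Operator placement inside the parenthesized pairs is an
    assumed reading of the original notation."""
    return (v["A"]
            or not v["B"]
            or (v["C"] and v["M"])
            or (v["D"] and not v["N"])
            or ((v["E"] or not v["F"]) and v["O"])
            or ((v["G"] or not v["H"]) and v["P"])
            or ((v["I"] or not v["J"]) and (v["Q"] or not v["R"]))
            or ((v["K"] or not v["L"]) and (v["S"] or not v["T"])))

names = [chr(c) for c in range(ord("A"), ord("T") + 1)]
example = {n: False for n in names}
example["A"] = True
print(out(example))  # True — rule 1 fires
```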

Representing continuous-valued further questions

One major problem with the architecture described in Section 3 is the implicit restriction that all further questions must be binary-valued. This is due to the nature of the cutoff units, which are designed to pass information about the state of the corresponding further question to the rest of the network. That is, if Ci+ is active, then the network knows that Fi is active; if Ci− is active, then the network knows that Fi is inactive. Unfortunately, this does not give the cutoff units any way
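As described, each binary further question Fi drives a pair of cutoff units: Ci+ fires when the question is asked and Fi is active, Ci− fires when it is asked and Fi is inactive, and neither fires when the question is not asked. A toy encoding of that three-state behavior (the function name is illustrative):

```python
def cutoff_pair(asked, f_active):
    """Return (c_plus, c_minus) for one binary further question.
    When the question is not asked, neither cutoff unit fires,
    so the rest of the network receives no information about F."""
    if not asked:
        return (0, 0)
    return (1, 0) if f_active else (0, 1)

print(cutoff_pair(True, True))    # (1, 0): asked, F active
print(cutoff_pair(True, False))   # (0, 1): asked, F inactive
print(cutoff_pair(False, True))   # (0, 0): not asked
```

The binary restriction is visible here: the pair can encode only three states, which is exactly why a continuous-valued F needs a different representation.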

Example: Thyroid disease diagnosis

This system was tested with a thyroid disease diagnosis problem [9], also acquired from the UCI machine learning archives [8] (Fig. 7). The goal of this problem was to diagnose hypothyroid conditions from a number of binary and continuous-valued inputs. Binary-valued inputs included the following:

sex, psych, pregnant, on thyroxine, query on thyroxine, sick, lithium, I131 treatment, query hypothyroid, query hyperthyroid, tumor, goitre, hypopituitary, thyroid surgery, on antithyroid medicine,

Conclusions

We have described a neural network architecture for learning which (if any) further questions are necessary to make a correct diagnosis, given a set of known preliminary symptoms. Question evaluation subnetworks learn when further questions are necessary, through positive feedback from misdiagnoses due to lack of necessary information, and negative feedback representing the cost of obtaining that information. These activate cutoff units which correspond to those necessary further questions,

John R. Sullins received the B.S. degree in computer science from M.I.T. in 1983, M.S. degree in computer science from the University of Rochester in 1985, and Ph.D. in computer science from the University of Maryland in 1990. He is currently an Associate Professor of Computer Science and Information Systems at Youngstown State University, and his research interests include neural networks, expert systems, object-oriented programming, and curricular development.

References (15)

  • J. Cendrowska

    PRISM: An algorithm for inducing modular rules

    Int. J. Man-Mach. Stud.

    (1987)
  • L.-M. Fu

    Integration of neural heuristics into knowledge-based inference

    Connection Sci.

    (1989)
  • S.I. Gallant

    Connectionist expert systems

    Commun. ACM

    (1988)
  • S.I. Gallant

    Expert systems and decision systems using neural networks

  • Y. Hayashi

    A neural expert system with automated extraction of fuzzy if-then rules and its application to medical diagnosis

  • M.I. Jordan et al.

    Modular and hierarchical learning systems

  • R.I. Lacher et al.

    Back-propagation learning in expert networks

    IEEE Trans. Neural Networks

    (1992)
