Discrete Optimization
Selecting and weighting features using a genetic algorithm in a case-based reasoning approach to personnel rostering

https://doi.org/10.1016/j.ejor.2004.12.028

Abstract

Personnel rostering problems are highly constrained resource allocation problems. Human rostering experts have many years of experience in making rostering decisions which reflect their individual goals and objectives. We present a novel method for capturing nurse rostering decisions and adapting them to solve new problems using the Case-Based Reasoning (CBR) paradigm. This method stores examples of previously encountered constraint violations and the operations that were used to repair them. The violations are represented as vectors of feature values. We investigate the problem of selecting and weighting features so as to improve the performance of the case-based reasoning approach. A genetic algorithm is developed for off-line feature selection and weighting using the complex data types needed to represent real-world nurse rostering problems. This approach significantly improves the accuracy of the CBR method and reduces the number of features that need to be stored for each problem. The relative importance of different features is also determined, providing an insight into the nature of expert decision making in personnel rostering.

Introduction

Nurse rostering can be defined as the problem of placing resources (nurses), subject to constraints, into slots in a pattern, where the pattern denotes a set of legal shifts defined in terms of work that needs to be done [30]. A wide variety of constraints can be imposed on rosters depending on the legal, management, and staffing requirements of individual organisations. Definitions of roster quality and optimality are highly subjective and therefore difficult to represent systematically using utility functions or rule bases. Human rostering experts have many years of experience in making rostering decisions which reflect their individual goals and objectives.

Nurse rostering problems have been solved using a variety of different mathematical and artificial intelligence methods. They are usually modelled as optimisation problems but the objective functions used vary considerably between problems. Bailey [3], Beaumont [6], and Warner [28] use mathematical programming techniques to generate nurse rosters optimised with respect to staffing costs, under-staffing costs, and shift pattern penalties. Constraint satisfaction techniques have been developed by Abdennadher and Schlenker [1], Cheng et al. [11], and Meyer auf’m Hofe [20] which allow the definition of many different types of constraint. A number of meta-heuristic approaches have been explored including genetic algorithms [13], simulated annealing [4], tabu search [8], [12], and hyper-heuristics [10]. A CBR approach by Scott and Simpson [26] combined case-based reasoning with constraint logic programming by storing shift patterns used for the construction of nurse rosters.

Case-based repair generation (CBRG) is a technique, developed by the authors, that uses case-based reasoning (CBR) to solve nurse rostering problems [7]. CBR is a reasoning paradigm in which new problems are solved using the solutions to similar problems that have previously been encountered [18]. Previous problems and their corresponding solutions are stored as cases in a database called a case-base. New problems are compared to the cases in the case-base and the most similar is retrieved. The solution to the problem from the retrieved case is then adapted to the context of the new problem. If the new solution could be useful for future problem solving then it is stored in the case-base, thus increasing the total knowledge held.

The CBRG method considers each constraint violation in a roster as a separate problem. The case-base contains a history of previous constraint violations and the operations that were used to repair them. Cases are retrieved from the case-base using a two-stage retrieval process [23]. The first stage retrieves those cases containing violations of the same type as the current problem. The second stage calculates the similarity of these cases to the current problem using the weighted nearest neighbour method. The violations are represented by a set of characteristic features and can be interpreted as points in a feature space. Weights are assigned to the features representing their relative importance. The most similar case is then defined as the one with the smallest weighted distance from the feature vector representing the current problem. It is vital for the retrieval process that appropriate features are selected to represent the violations and that these features are carefully weighted.
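The two-stage retrieval can be illustrated with a minimal sketch. It assumes purely numeric feature vectors and a weighted Euclidean distance, whereas the paper works with more complex data types; the `Case` class, its field names, and the repair labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Case:
    violation_type: str        # used by the first retrieval stage
    features: Sequence[float]  # numeric feature vector (an assumption)
    repair: str                # hypothetical label of the stored repair


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5


def retrieve(case_base: List[Case], violation_type, features, weights) -> Optional[Case]:
    """Return the most similar stored case, or None if no type matches."""
    # Stage 1: keep only cases whose violation has the same type.
    candidates = [c for c in case_base if c.violation_type == violation_type]
    if not candidates:
        return None
    # Stage 2: weighted nearest neighbour among the remaining candidates.
    return min(candidates,
               key=lambda c: weighted_distance(c.features, features, weights))


case_base = [
    Case("cover", [0.9, 0.1], "swap"),
    Case("cover", [0.2, 0.8], "assign"),
    Case("hours", [0.2, 0.8], "remove"),
]
best = retrieve(case_base, "cover", [0.3, 0.7], weights=[1.0, 1.0])
print(best.repair)
```

Setting a weight to zero removes the corresponding feature from the distance altogether, which is why feature selection and feature weighting can be treated together.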

One of the most common ways to assess the quality of a case-base is to measure its classification accuracy. The CBRG method can be seen as a classifier which determines the type and parameters of a repair for a given violation. Its classification accuracy can be measured by repeatedly removing a case from the case-base, performing a retrieval to determine the nearest case to the removed case, and then comparing the repairs. In the literature, nearest neighbour classification algorithms [14] have been used successfully to solve a number of different classification problems. They allow complex relationships between input parameters to be captured without the need to model them explicitly. However, they can be sensitive to noise in the data sets and to erroneous or irrelevant features [2]. These effects can be reduced by selecting only relevant features from the feature set and assigning a weight to each feature representing its relative importance. A number of different feature weighting and selection methods have been developed, including Salzberg’s [25] feature weighting algorithm based on a heuristic approach for his EACH classification method, a random mutation hill climbing approach for feature selection by Skalak [27], and a genetic algorithm by Kuncheva and Jain [19]. Many more algorithms are described in a review by Wettschereck et al. [29]. We investigate an approach to automated feature weighting and selection based on GA-WKNN, a genetic algorithm based method developed by Kelly and Davis [17], and on a dimensionality reduction algorithm developed by Raymer et al. [24]. These approaches are adapted so that they can handle the types of data used in the CBRG method to model the nurse rostering problem.
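The leave-one-out measurement described above can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' implementation: features are numeric, the distance is weighted Euclidean, and two repairs are "the same" when their labels match, whereas the paper compares repair types and parameters. The toy data and labels are invented.

```python
def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5


def leave_one_out_accuracy(cases, weights):
    """cases: list of (features, repair_label) pairs; returns a value in [0, 1]."""
    if len(cases) < 2:
        return 0.0
    correct = 0
    for i, (features, repair) in enumerate(cases):
        # Remove case i, then retrieve its nearest neighbour from the rest.
        rest = cases[:i] + cases[i + 1:]
        _, nearest_repair = min(
            rest, key=lambda c: weighted_distance(c[0], features, weights))
        # Count a hit when the retrieved repair matches the removed one.
        correct += nearest_repair == repair
    return correct / len(cases)


# Toy data: the first feature separates the repairs, the second is noise.
cases = [
    ([0.0, 0.0], "swap"),
    ([0.1, 0.9], "swap"),
    ([1.0, 0.1], "assign"),
    ([0.9, 1.0], "assign"),
]
print(leave_one_out_accuracy(cases, [1.0, 1.0]))  # both features weighted equally
print(leave_one_out_accuracy(cases, [1.0, 0.0]))  # noisy feature switched off
```

On this toy data, equal weights classify half of the cases correctly, while giving the noisy second feature a weight of zero classifies all of them correctly. This is the effect that feature selection and weighting are meant to exploit.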

In this paper we present an adaptation of a feature weighting and selection algorithm to a complex real life nurse rostering problem. This algorithm allows us to learn which features are important when making rostering decisions and which features are irrelevant, thus increasing our understanding of the nurse rostering problem. The accuracy of the CBRG method is increased by weighting the features and the search time is decreased by reducing the number of features that it is necessary to store in each case. Furthermore, the flexibility and adaptability of the case-based approach are enhanced because its behaviour can be tuned more precisely to the decision making style of the expert who trained it. The data used for the experiments in this paper has been derived from rosters provided by the ophthalmology ward at the Queen’s Medical Centre University Hospital Trust (QMC) in Nottingham, United Kingdom.

The nurse rostering problem is introduced in Section 2 and the CBRG method is described in Section 3. Section 4 introduces the different types of features used to describe the violations. The modified genetic algorithm for feature weighting and selection is presented in Section 5. The results obtained by applying the algorithm to a case-base of real life rostering decisions are presented in Section 6. Section 7 concludes the paper.

Section snippets

The nurse rostering problem

The nurse rostering problem is represented by the ordered pair $R = \langle N, C \rangle$, where $N = \{nurse_i : 0 \le i < I\}$ is the set of $I$ nurses to be rostered, and $C = \{constraint_k : 0 \le k < K\}$ is the set of $K$ constraints. The set $N$ contains information about the nurses to be rostered, the shifts they have been assigned and the shifts that they would prefer to work over the rostering period. The set $C$ imposes constraints on the shift assignments in $N$.

Each nurse is denoted by a 4-tuple, $nurse_i = \langle NurseType_i, hours_i, NR_i, NP_i \rangle$, where NurseType

The case-based repair generation method

The case-based repair generation (CBRG) method was developed by the authors to capture examples of individual constraint violations and the repairs that were used by human experts to solve them [23]. The violations and repairs are stored as cases in a case-base and are used to solve new violations in new rosters. When a new violation is identified in a roster the case containing the most similar violation in the case-base is retrieved. The repair from the retrieved case is used to generate a

Violation features

The first stage of the retrieval process chooses cases that are structurally the same as the focus violation. A large number of such cases can exist within a case-base and therefore it is necessary to rank them according to their violation features. The violation features are statistical characteristics of the roster and the violation. They can be seen as a ‘snap-shot’ of the state of the roster at the time the violation was repaired. They are considered to be important when making rostering

Genetic algorithm for feature weighting and selection

The nearest neighbour distance function which is used in the retrieval process requires a good selection of features and an appropriate set of feature weights. The effect of an increase in the weight of a particular feature is an increase in the influence that the feature has on the selection process. By decreasing their weighting, irrelevant features exert less influence on the calculation of the distance between cases, thus increasing the accuracy of the system.
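The idea of searching for weights that raise classification accuracy can be sketched with a generic real-coded genetic algorithm. This is not the authors' modified algorithm, nor GA-WKNN itself: the selection threshold, tournament selection, uniform crossover, mutation scheme, and all parameter values are assumptions made for illustration, and the features are assumed to be numeric.

```python
import random


def weighted_distance(a, b, weights):
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)) ** 0.5


def loo_accuracy(cases, weights):
    """Leave-one-out nearest neighbour accuracy; used as the fitness."""
    correct = 0
    for i, (features, label) in enumerate(cases):
        rest = cases[:i] + cases[i + 1:]
        nearest = min(rest, key=lambda c: weighted_distance(c[0], features, weights))
        correct += nearest[1] == label
    return correct / len(cases)


def decode(chromosome, threshold=0.2):
    """Weights below the (assumed) threshold become 0, dropping the feature."""
    return [w if w >= threshold else 0.0 for w in chromosome]


def evolve(cases, n_features, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    rng = random.Random(seed)

    def fitness(chromosome):
        return loo_accuracy(cases, decode(chromosome))

    # Each chromosome holds one candidate weight in [0, 1) per feature.
    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]  # elitism: carry over the two best unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # Uniform crossover, then mutation by resampling single genes.
            child = [x if rng.random() < 0.5 else y for x, y in zip(p1, p2)]
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return decode(best), fitness(best)


# Toy data: the first feature separates the labels, the second is noise.
cases = [
    ([0.0, 0.0], "swap"),
    ([0.1, 0.9], "swap"),
    ([1.0, 0.1], "assign"),
    ([0.9, 1.0], "assign"),
]
weights, accuracy = evolve(cases, n_features=2)
print(weights, accuracy)
```

Decoding small weights to zero lets a single chromosome express both selection and weighting, so the number of stored features can shrink while the fitness is still driven by retrieval accuracy.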

It is not always the case that

Results

The algorithm was used to select features and feature weights based on a case-base trained using the expert rostering knowledge of nurses at the QMC. It was trained over two months on rosters involving 12 different constraints:

  1. Cover: EARLY shifts require 4 Qualified Nurses.
  2. Cover: EARLY shifts require 1 Registered Nurse.
  3. Cover: EARLY shifts require 1 Eye-Trained Nurse.
  4. Cover: LATE shifts require 3 Qualified Nurses.
  5. Cover: LATE shifts require 1 Registered Nurse.
  6. Cover: LATE shifts require 1

Conclusion

This paper has described a method for the automated selection and weighting of features for a case-based reasoning approach to nurse rostering. A genetic algorithm is used to find a subset of weighted features by searching for combinations of features and corresponding feature weights that increase the overall classification accuracy of the case-base retrieval method. The increase in classification accuracy improves the quality of the repairs that are generated by the CBRG method by ensuring

Acknowledgements

This research is supported by the Engineering and Physical Sciences Research Council (EPSRC) in the UK (grant number GR/N35205/01) and by the Queen’s Medical Centre University Hospital Trust, Nottingham.

References (30)

  • E.K. Burke, P. De Causmaecker, S. Petrovic, G. Vanden Berghe, Fitness evaluation for nurse scheduling problems, in:...
  • E.K. Burke et al., A tabu-search hyperheuristic for timetabling and rostering, Journal of Heuristics (2003)
  • B.M.W. Cheng, J.H.M. Lee, J.C.K. Wu, A constraint-based nurse rostering system using a redundant modeling approach....
  • A. Duenas, N. Mort, C. Reeves, D. Petrovic, Handling preferences using genetic algorithms for the nurse scheduling...