1 Introduction

Medical examination is an important part of hospital affairs, providing technical services for all medical treatments in the hospital (Wang et al. 2015; Qiu et al. 2017). Almost all departments in a hospital depend strongly on medical examination. As the numbers of outpatients and inpatients continue to increase, it is very important for the hospital to use medical examination equipment rationally, reduce patients’ waiting time, shorten the average length of hospitalization and provide better medical care. Improving the efficiency of medical resources is therefore a major challenge for medical examination.

The limited resources of medical examination make it important to allocate and schedule them effectively for both outpatients and inpatients. The goal of scheduling is to match demands and capacities properly, so as to reduce patients’ waiting time and use medical resources more efficiently. According to when the waiting occurs, appointment schedules are divided into pre-schedules made before the service day and schedules made on the service day; the corresponding delay for a patient is the sum of the direct waiting time and the indirect waiting time. The direct waiting time is the time from the service request to the scheduling time, and the indirect waiting time is the time from the scheduling time to the actual service time.

Medical examination covers almost all types of patients in the hospital system, so hospital administrators are very concerned about how to solve the allocation problem of medical examination. In the actual work process, however, due to the lack of effective information-based planning methods, hospital staff must arrange the examination times of patients manually according to their work experience. In recent years, research in mathematics, operations research and information science has developed rapidly in the related fields of allocating medical resources reasonably and effectively (Zhuang and Li 2010). Using queuing theory, a scheduling method for single medical equipment was provided (Green et al. 2006). Wang and Fung (2015) proposed an approximate dynamic programming model with multiple service priority levels for medical scheduling. A Markov decision process was used by Patrick et al. (2008) to establish a stochastic dynamic programming model for medical scheduling. These models gave corresponding research methods and theoretical foundations for different assumptions and practical problems in scheduling medical examination (Gupta and Wang 2008). However, the historical medical scheduling data saved in databases were not used effectively in these methods.

With the rapid development of machine learning technologies, how to use existing medical data and provide effective technical services for modern medicine has attracted much attention, and notable progress has been made (West et al. 2005; Su et al. 2017). Among these machine learning methods, the support vector machine (SVM) shows good prediction performance in the classification and prediction of small-sample medical data (Chen et al. 2011; Bai et al. 2015). Different from the above methods, this paper makes full use of the existing medical data of historical outpatients. The machine learning method, specifically SVM, is used to learn the working experience of medical technicians automatically and to establish a corresponding nonlinear scheduling prediction model. Taking the medical examination schedules of outpatients as the research object, an AdaBoost ensemble learning model using SVM is given. The learning model establishes a training data set by separating the data of the medical examination dates based on the historical scheduling information. Experimental results show that this model can predict the waiting time of outpatients accurately.

2 Main idea of SVM

SVM is a classification and prediction technique for small-sample data, which has been widely used in the fields of pattern recognition and data mining (Moore and Zuev 2005). By learning from sample data, SVM obtains a hyperplane with the maximum classification margin and then classifies the data optimally.

Given a training sample set \( (x_i, y_i) \), \( x_i \in R^l \), \( i=1,2,\ldots ,l \), where \( R^l \) is the l-dimensional data space, \( x_i \) is a data point and \( y_i \in \{-1, +1\} \) is the class label of \( x_i \). The goal of SVM classification is to find a classifier that predicts the correct class of new, unknown sample data. To achieve this, SVM constructs an optimal classification hyperplane as the classification decision surface. When the training set is linearly separable, the optimal hyperplane of SVM is written as:

$$\begin{aligned} ( w \cdot x_i ) + b = 0, \end{aligned}$$
(1)

so that the training points on either side of the margin satisfy:

$$\begin{aligned} ( w \cdot x_i ) + b \ge 1,\quad y_i = 1, \end{aligned}$$
(2)
$$\begin{aligned} ( w \cdot x_i ) + b \le -1,\quad y_i = -1, \end{aligned}$$
(3)

Equations (2) and (3) can be combined and written as follows:

$$\begin{aligned} 1-y_i[( w \cdot x_i ) + b] \le 0,\quad i = 1,2,\ldots ,l. \end{aligned}$$
(4)

For the separating hyperplane represented by Eq. (1), the parameter pair (w, b) is not determined uniquely. Therefore, the goal is to find a pair (w, b) that satisfies Eq. (1) under the above inequalities. Since the distance between the two planes \( w^Tx+b=1 \) and \( w^Tx+b=-1 \) (T denotes matrix transpose) is \( 2/\Vert w\Vert \), the parameters w and b that maximize this distance are obtained by solving the equivalent minimization problem:

$$\begin{aligned} \min _{w,b}\ \frac{1}{2}\Vert w\Vert ^2 = \min _{w,b}\ \frac{1}{2}( w \cdot w ). \end{aligned}$$
(5)

If the training data are linearly inseparable, the lower-dimensional sample space should be mapped to a higher-dimensional sample space in which the sample data become linearly separable, represented by \( x \rightarrow \phi (x) \), where \( \phi (x) \) is the mapped feature vector. Therefore, the model corresponding to the hyperplane in the nonlinear feature space can be expressed as:

$$\begin{aligned} f(x)=w \cdot \phi (x) + b. \end{aligned}$$
(6)

By introducing the penalty factor \( C\ (C \ge 0) \) and the non-negative slack variables \( \varepsilon _i\ ( \varepsilon _i \ge 0),\ i=1,2,\ldots ,l \), the minimization problem in Eq. (5) is converted to the soft-margin problem:

$$\begin{aligned} \min _{w,b,\varepsilon }\ \frac{1}{2}\Vert w\Vert ^2+C \sum ^l_{i=1}\varepsilon _i. \end{aligned}$$
(7)

The inequality constraints corresponding to Eq. (7) are expressed as follows:

$$\begin{aligned} y_i[(w \cdot \phi (x_i))+ b] \ge 1-\varepsilon _i,i=1,2,\ldots ,l. \end{aligned}$$
(8)

Equation (7) under the constraint of Eq. (8) is a convex optimization problem, for which the KKT conditions are necessary and sufficient for an optimal solution. Using the Lagrange multiplier method, the Lagrangian of Eqs. (7) and (8) (with the slack-variable terms omitted, since they vanish under the corresponding stationarity conditions and leave the box constraint \( 0 \le \alpha _i \le C \) of Eq. (12)) is:

$$\begin{aligned} L(w,b,\alpha )=\frac{1}{2}\Vert w\Vert ^2+\sum ^l_{i=1} \alpha _i(1-y_i(w \cdot \phi (x_i)+ b)), \end{aligned}$$
(9)

where \( \alpha =(\alpha _1,\alpha _2,\ldots ,\alpha _l) \) is the vector of Lagrange multipliers. To obtain the optimal solution and eliminate the parameters w and b, the partial derivatives of Eq. (9) with respect to w and b are set to zero, giving \( w=\sum ^l_{i=1}\alpha _i y_i \phi (x_i) \) and \( \sum ^l_{i=1}\alpha _i y_i=0 \); substituting them back into Eq. (9) yields the dual problem:

$$\begin{aligned} { min} _\alpha L(\alpha )=\frac{1}{2}\sum ^l_{i=1} \sum ^l_{j=1} \alpha _i \alpha _j y_i y_j(\phi (x_i) \cdot \phi (x_j))-\sum ^l_{i=1}\alpha _i, \end{aligned}$$
(10)

which is equivalent to:

$$\begin{aligned} { max}\left( \sum ^l_{i=1}\alpha _i- \frac{1}{2}\sum ^l_{i,j=1} \alpha _i \alpha _j y_i y_j(\phi (x_i) \cdot \phi (x_j))\right) , \end{aligned}$$
(11)

where

$$\begin{aligned} \left\{ \begin{array}{lr} \sum ^l_{i=1}\alpha _i y_i=0, & \\ 0 \le \alpha _i \le C, & i=1,2,\ldots ,l. \end{array} \right. \end{aligned}$$
(12)
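As a concrete illustration of the soft-margin kernel SVM of Eqs. (6)–(12), the following minimal sketch uses scikit-learn's SVC on synthetic data (an assumption of this example; the paper's experiments were implemented in Java). Here C is the penalty factor of Eq. (7) and the RBF kernel supplies the inner products \( \phi (x_i) \cdot \phi (x_j) \):

```python
# A minimal sketch of the soft-margin kernel SVM of Eqs. (6)-(12),
# using scikit-learn's SVC on synthetic data; names and parameter
# values are illustrative, not the paper's settings.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2))
# A circular decision boundary: not linearly separable in R^2.
y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0, 1, -1)

clf = SVC(C=10.0, kernel='rbf', gamma='scale')  # C as in Eq. (7)
clf.fit(X, y)                 # solves the dual of Eqs. (10)-(12)
print(clf.n_support_)         # support vectors per class (alpha_i > 0)
print(clf.score(X, y))        # training accuracy
```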

3 SVM ensemble learning

By combining several simple base classifiers, ensemble learning achieves high-precision estimation on new data samples, which can effectively reduce over-fitting and improve the accuracy of classification and prediction. In 1990, the first ensemble of neural networks was produced (Salamon and Hansen 1990). The method combines a set of neural networks by voting: by training multiple neural networks and simply combining their results, the generalization ability of the learning system can be significantly improved. The experimental results showed that the performance of the neural network ensemble is better than that of the best individual network, a phenomenon that attracted widespread attention among researchers. The Boosting algorithm (Schapire 1990), one of the ensemble learning methods, proved for the first time that a weak learning algorithm can be boosted into a strong learning algorithm, even if the weak learner performs only slightly better than random guessing. This work strongly promoted the development of ensemble learning, making it a research hotspot in machine learning (Freund 1995). However, the original Boosting algorithm has a major drawback in practice: a lower bound on the accuracy of the weak learner must be known in advance, which is very difficult to obtain for practical problems.

To address this problem, an improved adaptive Boosting algorithm named AdaBoost was proposed (Freund and Schapire 1997). An SVM ensemble algorithm based on AdaBoost was given by Kim et al. (2003), showing that the SVM ensemble can greatly improve classification accuracy over a single SVM on the IRIS data set, handwriting recognition and fraud detection data.

Suppose the training sample set is \( S=\{X,Y\}=\{(x_1,y_1),\ldots ,(x_i,y_i),\ldots ,(x_n,y_n)\} \), \( x_i\in X \), \( y_i\in Y \), where X represents the feature space and Y the corresponding labels. D is defined as a series of weight distributions on the input data, with the initial weight of each sample being \( D_1(i)=1/n \). In AdaBoost, T denotes the number of iterations. The base classifiers \( h_t\ (t=1,\ldots ,T) \) are produced by a series of loops: in every loop, the sample weights \( D_t(i) \) are adjusted according to the errors of the previous round, and then the next iteration is performed. The goal of \( h_t \) is to minimize the prediction error rate \( e_t \) on the sample set under the distribution \( D_t \). After T iterations, these base classifiers are combined into one strong classifier H by a weighted majority vote.

The error rate \( e_t \) of the base classifier \( h_t \) is calculated as:

$$\begin{aligned} e_t=\sum _{h_t(x_i)\ne y_i}D_t(i). \end{aligned}$$
(13)

The weighting coefficient \( \beta _t \) of the base classifier \( h_t \) is:

$$\begin{aligned} \beta _t=\frac{1}{2}\ln \left( \frac{1-e_t}{e_t}\right) . \end{aligned}$$
(14)

The sample weights are then updated using \( \beta _t \):

$$\begin{aligned} D_{t+1}(i)=\frac{D_t(i)}{Z_t}\times \left\{ \begin{array}{lr} e^{-\beta _t}, & h_t(x_i)=y_i\\ e^{\beta _t}, & h_t(x_i)\ne y_i \end{array} \right. =\frac{D_t(i)e^{-\beta _ty_ih_t(x_i)}}{Z_t}, \end{aligned}$$
(15)

where \( Z_t \) is the normalization factor:

$$\begin{aligned} Z_t=\sum ^n_{i=1}D_t(i)e^{-\beta _ty_ih_t(x_i)}. \end{aligned}$$
(16)
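The update loop of Eqs. (13)–(16) can be sketched as follows, using scikit-learn's SVC as the base classifier \( h_t \) (an illustrative assumption; the function and variable names are ours, not from the paper):

```python
# A sketch of the AdaBoost loop of Eqs. (13)-(16) with SVM base
# classifiers; illustrative only, not the paper's implementation.
import numpy as np
from sklearn.svm import SVC

def adaboost_svm(X, y, T=10):
    """X: (n, d) NumPy array; y: labels in {-1, +1}; T: rounds."""
    n = len(y)
    D = np.full(n, 1.0 / n)               # D_1(i) = 1/n
    classifiers, betas = [], []
    for t in range(T):
        h = SVC(kernel='rbf', gamma='scale')
        h.fit(X, y, sample_weight=D)      # train h_t under D_t
        pred = h.predict(X)
        e = D[pred != y].sum()            # error rate e_t, Eq. (13)
        if e == 0.0 or e >= 0.5:          # perfect, or no better than chance
            break
        beta = 0.5 * np.log((1 - e) / e)  # beta_t, Eq. (14)
        D *= np.exp(-beta * y * pred)     # reweight samples, Eq. (15)
        D /= D.sum()                      # normalize by Z_t, Eq. (16)
        classifiers.append(h)
        betas.append(beta)

    def H(X_new):                         # weighted majority vote
        votes = sum(b * h.predict(X_new) for h, b in zip(classifiers, betas))
        return np.sign(votes)
    return H
```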

The AdaBoost algorithm is illustrated in Fig. 1, which shows the basic idea of this classical method. The training data for ensemble learning are expressed as \(\{ (x_1, y_1, u_1),\ldots ,(x_i, y_i, u_i),\ldots ,(x_n, y_n, u_n)\}\), \( h_t(x)\ (t=1,\ldots ,T) \) denotes the t-th base classifier in the AdaBoost algorithm, and \( \alpha _t \) is the weight of each base classifier \( h_t \).

Fig. 1 The illustration of the AdaBoost algorithm

4 SVM AdaBoost integration in medical examination scheduling

In this section, the SVM AdaBoost algorithm is applied to the problem of predicting execution times when scheduling outpatient medical examination, and an ensemble regression prediction model is proposed.

4.1 SVM AdaBoost model construction

In the prediction process, the SVM algorithm has a high time complexity when the training sample set is very large, so fast approximation methods are needed to improve the regression prediction accuracy while reducing the training complexity and time consumption. Meanwhile, as the scale of the training data increases, it becomes harder to find the optimal combination of kernel parameters, which degrades the generalization ability of SVM. To avoid these problems when predicting medical examination schedules with SVM, this paper constructs an SVM ensemble predictor based on AdaBoost, exploiting the advantages of ensemble learning so that the regression prediction accuracy can be improved effectively.

By analyzing the influencing factors of outpatient appointments in medical examination, this paper randomly selects one week of appointment times as the feature input vectors of the AdaBoost ensemble; the corresponding schedule times are used as the output vector. To achieve better training and prediction results, the date data in the training set are converted to continuous values in [0, 1], which ensures the training and prediction quality of SVM.

The SVM ensemble algorithm model based on AdaBoost for scheduling medical examination is shown in Fig. 2. The base predictor in the model is SVM, and a grid search algorithm is applied to optimize the SVM parameters.

Through the weighting function, the training data collected from the original medical data set generate the t-th (t = 1, 2, ..., T) base prediction model. The weights of the training samples are adjusted for the base predictor \( h_t(x) \) after each training iteration: if the prediction error of a sample is large, its weight is increased; if the error is small, its weight is reduced. As the iterations progress, samples with large errors play a more important role in the next training round. Finally, all T base predictors are linearly combined into a strong SVM prediction model by the AdaBoost ensemble algorithm, as sketched below.
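A compact way to realize this model is sketched here, assuming scikit-learn's AdaBoostRegressor (an AdaBoost.R2-style ensemble) with a grid-searched RBF-kernel SVR as the base predictor; the parameter grids and synthetic data are illustrative stand-ins, not the paper's actual settings:

```python
# A sketch of the AdaBoost SVM ensemble regressor of Fig. 2, using
# scikit-learn; synthetic data stand in for the normalized
# appointment-time features of Sect. 5.2.
import numpy as np
from sklearn.svm import SVR
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X_train = rng.random((300, 3))            # stand-in: normalized (h, m, s)
y_train = X_train @ np.array([0.6, 0.3, 0.1]) + 0.05 * rng.standard_normal(300)

# Grid search over the SVR parameters (Sect. 4.1: grid search).
grid = GridSearchCV(
    SVR(kernel='rbf'),
    param_grid={'C': [1, 10, 100], 'gamma': [0.01, 0.1, 1.0]},
    cv=5,
)
grid.fit(X_train, y_train)

# T weighted SVR base predictors, linearly combined by AdaBoost.
ensemble = AdaBoostRegressor(
    estimator=grid.best_estimator_,  # 'base_estimator' in older scikit-learn
    n_estimators=20,
    loss='linear',
)
ensemble.fit(X_train, y_train)
print(ensemble.predict(X_train[:5]))
```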

Fig. 2 The SVM ensemble model of the prediction system for medical examination

4.2 The procedure of the proposed method

The procedure of scheduling medical examination is given in Table 1.

Table 1 The procedure of scheduling medical examination

5 Experimental results and analyses

5.1 Experimental data

The data set used in this experiment comes from the medical examination appointment database of a hospital medical department, which records the outpatients’ daily examination appointments. One week of data samples was randomly selected from the database, which is automatically collected and managed by computer. There are about 7300 data records, and each record contains 10 attributes, such as the patient card number, the project code and name, the inspection location, the opening time, the appointment time, the execution time and the executive department.

Some numerical properties of the sample data are given in Table 2. For outpatient medical examination scheduling, the prediction target is that the difference between the predicted execution time and the actual one should be as small as possible, subject to the rule that the predicted execution time is later than the outpatient appointment time.

As can be seen from the meaning of each attribute, the numerical attributes such as the project time, the appointment time and the execution time are related to the prediction target. Therefore, in the process of algorithm modeling, character attributes such as patient card numbers, project codes, project names and execution items are ignored, which reduces the dimensionality of the data.

Table 2 Some numerical properties of medical examination data samples

5.2 Data preprocessing

As shown in Table 2, the project time, the appointment time and the execution time of the outpatients are date data with four levels: dates, hours, minutes and seconds. From the given data set, it is not difficult to find that the execution time and the appointment time of an outpatient medical examination are the same at the date level, i.e., the outpatients take the medical examination on the same day as the appointment. Therefore, according to the prediction target, the influence of dates is ignored in the SVM ensemble modeling during data preprocessing, and only the difference between the appointment time and the execution time is calculated.

To improve the accuracy of the final prediction, the sample data set is normalized during preprocessing. In this paper, min-max normalization is used to scale the hour, minute and second attributes to [0, 1]. The calculation is:

$$\begin{aligned} x_{ norm}=\frac{x_i-x_{ min }}{x_{ max }-x_{ min }}. \end{aligned}$$
(17)

According to Eq. (17), the attribute values of hours, minutes and seconds can be normalized to [0, 1], e.g., if the hour in the appointment time for a sample is 12, then: \( x_{ norm}= \frac{12-0}{24-0}=0.5\).
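For instance, Eq. (17) applied to the hour, minute and second fields of an appointment timestamp can be written as the following small helper (the clock-bound ranges 0–24 and 0–60 are the natural choices implied by the example above; the concrete date is hypothetical):

```python
# Min-max normalization of Eq. (17) for the hour/minute/second fields
# of an appointment timestamp; ranges follow the 24-h clock example.
from datetime import datetime

def normalize_time(ts: datetime) -> tuple:
    """Map (hour, minute, second) to values in [0, 1]."""
    return (ts.hour / 24.0, ts.minute / 60.0, ts.second / 60.0)

print(normalize_time(datetime(2017, 5, 8, 12, 30, 0)))  # (0.5, 0.5, 0.0)
```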

5.3 Prediction using the SVM ensemble algorithm

The following experiments were performed on an Intel i7-5600U CPU with 12 GB memory, Windows 7 (64 bit) and Java. The training data are 7258 records from one week, and the testing data are 32 records randomly drawn from the hospital database. SVM was used with the radial basis function (RBF) kernel. The predictions on the testing data by the single SVM predictor and by the SVM ensemble predictor, each built with its optimal parameters, are compared in Fig. 3, in which the abscissa indexes the predicted data records and the ordinate indicates the predicted execution time (24-h clock). For better observation, the predicted times are post-processed after modeling and converted to a uniform time format. It can be seen that the proposed SVM ensemble method predicts the continuous data well even with few training data, and its prediction is better than that of the single SVM, since its results are closer to the curve of the testing data.
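The conversion back to a uniform time format mentioned above might, under the assumption that a predicted value encodes the time of day as a fraction in [0, 1] (our assumption for illustration, not a detail stated in the paper), look like:

```python
# A hypothetical post-processing helper: convert a normalized
# prediction back to a 24-h clock string, reversing Eq. (17) under
# the fraction-of-day encoding assumed above.
def denormalize_time(y_norm: float) -> str:
    total_seconds = int(round(y_norm * 24 * 3600))
    h, rem = divmod(total_seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"

print(denormalize_time(0.5))  # '12:00:00'
```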

Fig. 3 Prediction comparisons of the testing data, the SVM and the SVM ensemble

6 Conclusions

For the outpatient appointment problem in medical examination scheduling, this paper gives a new SVM ensemble algorithm based on AdaBoost. In this algorithm, a small part of the training data is first randomly sampled from the database into subsets of the training set; then several SVM predictors are produced by reweighting the data subsets over a series of cycles; finally these SVM predictors are linearly combined to form the ensemble output. The ensemble algorithm reduces the time complexity of SVM on large data samples and ensures the validity of the regression prediction even when the data samples are small. Experiments on real data collected from the hospital medical examination scheduling database show that the regression prediction of the ensemble learning method is better than that of single-SVM regression. This research helps to improve the efficiency of hospital management and promotes the application of SVM in real medical fields.