1 Introduction

It is commonly recognized that a well-fitting shoe is obtained by matching the shape of the shoe to the shape of the foot [2]. Studies of gender differences in foot shape are therefore essential to the proper design of men’s and women’s shoes. Traditionally, women’s sport shoes have been made on a scaled-down version of a men’s shoe last, with all dimensions scaled proportionally to foot length [2]. However, if women’s feet differ in shape from men’s feet, such a last is an inappropriate model for a women’s shoe and could lead to poorly fitting shoes for women [3]. In 1993, the American Orthopaedic Foot and Ankle Society’s Women’s Shoe Survey [4] reported that 88% of the healthy women surveyed were wearing shoes smaller than their feet (1.2 cm on average in length), 80% of the women surveyed said they had foot pain while wearing shoes, and 76% had some sort of foot deformity [5]. These statistics call for a greater focus on the shape and fit of women’s shoes. So far, most attention has focused on the design and fit of dress shoes and the negative impact of high heels. Although women are increasingly involved in recreational activities and are increasingly aware that sports have the potential to cause particular harm to women, little attention has been paid to matching women’s sports shoes to women’s feet. For other research on foot anthropometry, see [1, 4, 6]. Here, we give some suggestions on the correct design of women’s sports shoes [7] based on a study of gender differences in foot shape.

The samples used in this study come from the Chinese adult body size database, which contains 4000 human body samples, 50% male and 50% female. The subjects’ ages range from 18 to 60 years. This paper uses five variables related to foot shape: foot length, foot breadth, ankle circumference, medial malleolus height and foot oblique width.

2 Basic Statistical Analysis

2.1 Numerical Analysis

Table 1 lists the overall mean of each variable for all women, together with the mean of each variable in each age group.

Table 1. Mean values of women’s foot shape variables

Table 2 lists the overall mean of each variable for all men, together with the mean of each variable in each age group.

Table 2. Mean values of men’s foot shape variables

From Tables 1 and 2 we can conclude that: (1) For each variable, for both men and women, the mean of each age group fluctuates around the overall mean. (2) Overall, the mean values of all five variables are larger for men than for women.
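The layout of Tables 1 and 2 (an overall mean per gender plus a mean per age group) can be sketched as follows. This is a minimal illustration, not the procedure used to build the tables: the record layout, the function names `age_band` and `group_means`, and the five toy records are assumptions, and the numbers are made up rather than taken from the database. Only the age bands follow the grouping used in this paper.

```python
from collections import defaultdict
from statistics import mean

# Age bands as used in this paper; everything else below is illustrative.
AGE_BANDS = [(18, 20), (21, 30), (31, 40), (41, 50), (51, 60)]


def age_band(age):
    """Return the 1-based index of the age band that contains `age`."""
    for index, (low, high) in enumerate(AGE_BANDS, start=1):
        if low <= age <= high:
            return index
    raise ValueError(f"age {age} is outside the 18-60 range")


def group_means(records, variable):
    """Mean of `variable` per (gender, age band) and per gender overall."""
    by_band = defaultdict(list)
    by_gender = defaultdict(list)
    for record in records:
        key = (record["gender"], age_band(record["age"]))
        by_band[key].append(record[variable])
        by_gender[record["gender"]].append(record[variable])
    band_means = {key: mean(values) for key, values in by_band.items()}
    overall_means = {key: mean(values) for key, values in by_gender.items()}
    return band_means, overall_means


# Hypothetical records (foot length in mm); NOT taken from the database.
records = [
    {"gender": "F", "age": 19, "foot_length": 230.0},
    {"gender": "F", "age": 25, "foot_length": 232.0},
    {"gender": "F", "age": 28, "foot_length": 228.0},
    {"gender": "M", "age": 19, "foot_length": 250.0},
    {"gender": "M", "age": 45, "foot_length": 248.0},
]
band_means, overall_means = group_means(records, "foot_length")
```

Comparing each entry of `band_means` with the matching entry of `overall_means` is the kind of check behind conclusion (1) above.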

2.2 Graphical Analysis

Line Graph.

Figure 1 shows the change of the mean in each variable for males and females of all age groups. A line graph can visually show the trend of the average of each variable in each age group.

Fig. 1. Line chart of the mean of each variable for men and women in each age group. Note: (1) Women are represented by red lines and men by blue lines. (2) On the x-axis, 1 represents the 18 to 20 age group, 2 the 21 to 30 age group, 3 the 31 to 40 age group, 4 the 41 to 50 age group, and 5 the 51 to 60 age group (Color figure online).

From Fig. 1, we observe the following:

  (1) Foot length.

    Men. As age increases, male foot length first increases and then decreases. It reaches its maximum between the ages of 20 and 30.

    Women. Female foot length remains stable until the age of 40, decreases between the ages of 40 and 50, and then increases. It reaches its minimum between the ages of 40 and 50.

  (2) Foot breadth.

    Men. Male foot breadth increases before the age of 30 and changes little after that.

    Women. Female foot breadth rises before the age of 50, peaks at around 50, and falls after 50.

  (3) Ankle circumference.

    Men. As age increases, male ankle circumference first decreases, then increases, and then decreases again, with large fluctuations.

    Women. As age increases, female ankle circumference first decreases, then increases, and finally stabilizes. The drop occurs between the ages of 20 and 30.

  (4) Medial malleolus height.

    Men. As age increases, male medial malleolus height first rises, then falls, and finally rises again. The overall change is relatively flat.

    Women. Female medial malleolus height falls between the ages of 40 and 50 and remains constant in the other age groups.

  (5) Foot oblique width.

    Men. As age increases, male foot oblique width first rises and then remains flat after the age of 30.

    Women. As age increases, female foot oblique width first rises and then decreases, reaching its highest point between the ages of 30 and 40.

Conclusion: (1) There are significant differences in foot characteristics between men and women. (2) Foot shape data also differ between age groups within the same gender.

Histogram.

Figures 2 and 3 show histograms of each variable for men and women. The histogram can show the distribution of each variable.

Fig. 2. Histogram of each variable for men

Fig. 3. Histogram of each variable for women

From Figs. 2 and 3 we can conclude that:

  1. For both men and women, each of the five foot variables approximately follows a normal distribution.

  2. Mean and standard deviation. For all five variables, the male mean is higher than the female mean; the male standard deviation is also higher, except for ankle circumference. This indicates that female foot shape varies relatively less than male foot shape.

Empirical CDF (Cumulative Distribution Function).

Figures 4, 5, 6, 7 and 8 show the empirical CDF plots of each variable by gender. The plots show that the empirical CDFs of all five variables differ between women and men. For foot length, foot breadth, medial malleolus height and foot oblique width, the 80% quantiles of the two genders are clearly different, whereas the difference in the 80% quantile of ankle circumference between men and women is relatively small. Moreover, for every variable the 80% quantile is higher for men than for women.
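The empirical CDF and the 80% quantile discussed above can be computed directly from a sample. The sketch below is illustrative only: the function names `ecdf` and `empirical_quantile` and the two five-value toy samples are assumptions, not data from the study. The quantile uses the inverse-ECDF definition (the smallest observation whose ECDF value is at least p); statistical packages may use interpolated definitions that give slightly different values.

```python
import math
from bisect import bisect_right


def ecdf(sample):
    """Return F, where F(x) is the fraction of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F


def empirical_quantile(sample, p):
    """Smallest observation x such that the ECDF at x is at least p."""
    ordered = sorted(sample)
    k = max(math.ceil(p * len(ordered)), 1)
    return ordered[k - 1]


# Hypothetical foot lengths in mm; NOT taken from the database.
women = [225.0, 228.0, 230.0, 233.0, 236.0]
men = [243.0, 246.0, 249.0, 252.0, 256.0]

F_women = ecdf(women)
q80_women = empirical_quantile(women, 0.8)
q80_men = empirical_quantile(men, 0.8)
```

Plotting `F(x)` for both samples on the same axes gives a figure of the same kind as Figs. 4 to 8, and `q80_men - q80_women` is the horizontal gap between the two curves at the 0.8 level.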

Fig. 4. Empirical CDF of foot length

Fig. 5. Empirical CDF of foot breadth

Fig. 6. Empirical CDF of ankle circumference

Fig. 7. Empirical CDF of medial malleolus height

Fig. 8. Empirical CDF of foot oblique width

2.3 Correlation Matrix

Correlations may exist among these five variables, so we next study the correlation structure of the foot shape data. The correlation matrix of the five variables is given in Table 3.

From Table 3, we find that all five variables of the foot data are positively correlated with one another. There is a strong positive correlation between foot breadth and foot oblique width. Foot length is also positively correlated with foot breadth and with foot oblique width, but these correlations are weaker.

Table 3. Correlation matrix of the human foot variables
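A correlation matrix such as Table 3 contains the pairwise Pearson correlation coefficients of the variables. The following minimal sketch shows the computation; the helper names `pearson` and `correlation_matrix` and the toy measurements are assumptions for illustration and do not reproduce the values in Table 3.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def correlation_matrix(columns):
    """Nested dict with the correlation of every pair of named columns."""
    names = list(columns)
    return {
        row: {col: pearson(columns[row], columns[col]) for col in names}
        for row in names
    }


# Hypothetical measurements in mm; NOT taken from the database.
columns = {
    "foot_length": [240.0, 238.0, 246.0, 250.0, 247.0],
    "foot_breadth": [90.0, 92.0, 95.0, 97.0, 100.0],
    "foot_oblique_width": [93.0, 94.0, 98.0, 99.0, 103.0],
}
matrix = correlation_matrix(columns)
```

The matrix is symmetric with ones on the diagonal, so only the entries above (or below) the diagonal need to be reported.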

3 Univariate Analysis

We performed a two-sample t-test on each of the five variables to compare men and women. The results are shown in Table 4, which reports, for each variable, the estimated difference between men and women, the 95% confidence interval for the difference, the P-value of the test, and related statistics.

Table 4. Two-sample T-test and CI between men and women

Not surprisingly, each of the five variables differs significantly between men and women. The difference between men and women is largest for foot length, followed by medial malleolus height, foot oblique width and foot breadth; it is smallest for ankle circumference.
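The quantities reported in Table 4 (difference estimate, confidence interval, test statistic and P-value) can be sketched as below. Two assumptions are made here and may not match the settings used for Table 4: the variances of the two groups are not pooled (Welch form), and the confidence interval and P-value use a normal approximation in place of the t distribution, which is reasonable only for large groups such as the 2000 men and 2000 women in this study. The function name `welch_summary` and the toy samples are illustrative and are not data from the study.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_summary(sample_1, sample_2, level=0.95):
    """Difference in means, unpooled t statistic, approximate CI and P-value."""
    n1, n2 = len(sample_1), len(sample_2)
    difference = mean(sample_1) - mean(sample_2)
    # Standard error without pooling the two sample variances
    std_error = sqrt(variance(sample_1) / n1 + variance(sample_2) / n2)
    t_stat = difference / std_error
    # Normal approximation: an assumption that suits large samples only
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return {
        "difference": difference,
        "t": t_stat,
        "ci": (difference - z * std_error, difference + z * std_error),
        "p_value": p_value,
    }


# Hypothetical foot lengths in mm; NOT taken from the database.
men = [250.0, 252.0, 248.0, 255.0, 245.0]
women = [232.0, 230.0, 234.0, 228.0, 236.0]
summary = welch_summary(men, women)
```

The five-value samples only exercise the code; with groups this small, the t distribution should be used instead of the normal approximation.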

Furthermore, we compared the density distribution of foot length between men and women. The density distributions are shown in Fig. 9.

Fig. 9. Density distribution of foot length for men and women

Figure 9 shows that, on average, men’s feet are about 18.5 mm longer than women’s. It also shows that the foot length data are more concentrated for men.

4 Linear Discriminant Analysis (LDA)

Here we use linear discriminant analysis (LDA) [8], a generalization of Fisher’s linear discriminant. It is used in statistics, pattern recognition and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events. The basic idea is projection. The resulting combination may be used as a linear classifier or, more commonly, for dimensionality reduction before later classification. We briefly introduce LDA for two groups. Assume that random samples of \( N_{1} \) and \( N_{2} \) observation vectors are drawn independently from multivariate normal populations with mean vectors \( \mu_{1} \) and \( \mu_{2} \) and a common covariance matrix \( \Sigma \). We wish to construct a linear compound, or index, that summarizes the observations from the two groups on a one-dimensional scale and discriminates between the populations by some measure of maximum separation. If \( \bar{x}_{1} \) and \( \bar{x}_{2} \) are the sample mean vectors and \( S \) is the pooled estimate of \( \Sigma \), we determine the coefficient vector \( a \) of the index \( a^{\prime}x \) as the one that gives the greatest squared critical ratio

$$ t^{2} (a) = \frac{{\left[ {a^{{\prime }} \left( {\bar{x}_{1} - \bar{x}_{2} } \right)} \right]^{2} N_{1} N_{2} /\left( {N_{1} + N_{2} } \right)}}{{a^{{\prime }} Sa}} $$
(1)

or, equivalently, which maximizes the absolute difference \( \left| {a^{{\prime }} \left( {\bar{x}_{1} - \bar{x}_{2} } \right)} \right| \) in the average values of the index for two groups subject to the constraint \( a^{{\prime }} Sa = 1 \).

Solving this constrained optimization problem gives, up to a scale factor,

$$ a = S^{ - 1} \left( {\bar{x}_{1} - \bar{x}_{2} } \right) $$
(2)

and the linear discriminant function is

$$ Y = \left( {\bar{x}_{1} - \bar{x}_{2} } \right)^{\prime} S^{ - 1} x . $$
(3)
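Equations (2) and (3) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the Minitab procedure used in this paper: the function names, the two-variable toy data and the midpoint cutoff (which corresponds to equal prior probabilities and equal misclassification costs) are all assumptions. The coefficient vector is obtained by solving \( Sa = \bar{x}_{1} - \bar{x}_{2} \) instead of inverting \( S \) explicitly.

```python
def mean_vector(rows):
    """Column-wise mean of a list of observation vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def scatter(rows, center):
    """Sum of outer products of deviations from `center`."""
    p = len(center)
    total = [[0.0] * p for _ in range(p)]
    for row in rows:
        dev = [v - c for v, c in zip(row, center)]
        for i in range(p):
            for j in range(p):
                total[i][j] += dev[i] * dev[j]
    return total


def pooled_covariance(group_1, group_2):
    """Pooled estimate S of the common covariance matrix."""
    s1 = scatter(group_1, mean_vector(group_1))
    s2 = scatter(group_2, mean_vector(group_2))
    dof = len(group_1) + len(group_2) - 2
    return [[(a + b) / dof for a, b in zip(r1, r2)] for r1, r2 in zip(s1, s2)]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def discriminant_coefficients(group_1, group_2):
    """Coefficient vector a = S^{-1} (xbar_1 - xbar_2), as in Eq. (2)."""
    m1, m2 = mean_vector(group_1), mean_vector(group_2)
    difference = [u - v for u, v in zip(m1, m2)]
    return solve(pooled_covariance(group_1, group_2), difference)


def score(a, x):
    """Discriminant score Y = a'x, as in Eq. (3)."""
    return sum(ai * xi for ai, xi in zip(a, x))


# Hypothetical (foot length, foot breadth) pairs in mm; NOT from the database.
men = [[250.0, 98.0], [255.0, 101.0], [248.0, 97.0], [252.0, 102.0]]
women = [[232.0, 90.0], [236.0, 93.0], [230.0, 89.0], [234.0, 92.0]]

a = discriminant_coefficients(men, women)
# Assumed rule: assign to group 1 (men) when the score exceeds the midpoint
cutoff = (score(a, mean_vector(men)) + score(a, mean_vector(women))) / 2
predicted = ["M" if score(a, x) > cutoff else "F" for x in men + women]
```

With the full data set, the same steps would be applied to all five variables at once; the two-variable toy example only keeps the arithmetic easy to follow.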

Nowadays, many statistical analysis packages, such as R, SPSS and Minitab, contain programs for discriminant analysis. Here we use Minitab to conduct the discriminant analysis. When the foot data are used to predict gender, the classifier is correct 91.5% of the time (90.7% for men and 92.3% for women) (Table 5).

Table 5. Summary of classification
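The per-group and overall correct rates summarized in Table 5 are simple proportions of matching labels. With equally sized groups, the overall rate is the average of the two per-group rates, which agrees with the figures quoted above: (90.7% + 92.3%)/2 = 91.5%. The sketch below shows the calculation; the function name `correct_rates` and the eight toy labels are assumptions for illustration and are not the study's classification results.

```python
from collections import Counter


def correct_rates(true_labels, predicted_labels):
    """Proportion classified correctly, per true group and overall."""
    totals = Counter(true_labels)
    hits = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    per_group = {group: hits[group] / totals[group] for group in totals}
    overall = sum(hits.values()) / len(true_labels)
    return per_group, overall


# Hypothetical labels; NOT the classification results of this study.
true_labels = ["M", "M", "M", "M", "F", "F", "F", "F"]
predicted_labels = ["M", "M", "M", "F", "F", "F", "F", "F"]
per_group, overall = correct_rates(true_labels, predicted_labels)
```

Because these rates are computed on the same observations used to fit the discriminant function, they tend to be optimistic; a cross-validated rate is a common complement.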

We also established a discriminant function; the results are shown in Table 6. Foot breadth, foot length and medial malleolus height have a greater influence on gender discrimination, whereas ankle circumference and foot oblique width have less influence.

Table 6. Linear discriminant function for groups

5 Discussion and Conclusion

The results of this study suggest that foot measurements can be used to distinguish men from women reliably. This finding offers guidance for the design of men’s and women’s sports shoes. Our analysis of the foot data shows that there are significant differences in foot shape between men and women. These findings suggest that women’s shoes should not simply be a smaller version of men’s shoes. At the same time, foot shape differs slightly between age groups. This paper also has limitations. One is that the variables used may not fully represent the entire set of three-dimensional shape differences that exist between men and women. In some cases, other measurements, such as foot thickness, could be considered in order to make sports shoes fit the foot better. Future studies should expand the set of measurements, focusing on areas identified as functionally important in this and other studies.