1 Introduction

Surface reflection, also known as specular reflection, leads to highlights in images and is a widely studied topic in research areas such as computer vision, image processing, and pattern recognition. Many algorithms have been developed over the past decades to detect and remove surface reflection. As early as the 1990s, researchers tried to remove surface reflection using polarizing filters [1, 2]. Based on the Dichromatic Reflection Model (DRM) proposed by Shafer [3], which defines reflection as a linear combination of a surface reflection component and a body reflection component, Tan et al. [4] developed a diffuse-to-specular mechanism to separate surface reflection from a single image. Unfortunately, all these algorithms perform poorly on skin images due to the multi-layer structure of human skin and the complex interaction of light beneath the skin.

1.1 Related Work

Considered the largest organ of the human body, skin is actually a turbid medium consisting of several layers. The outermost skin layer is non-flat and cannot be treated as a Lambertian surface. To simulate and measure the surface reflection of skin, many models and measurement methods have been developed.

Surface Reflection Models:

As the most popular reflection models, the BRDF (Bidirectional Reflectance Distribution Function) and the BSSRDF (Bidirectional Surface Scattering Reflectance Distribution Function) describe how light travels when it hits a surface. The BSSRDF has proven to be the more reasonable choice for translucent human skin, which involves significant subsurface scattering [5, 6]. In addition, many multi-layer models [7,8,9,10,11], including Monte Carlo simulations [9], the K-M theory [10], and the dipole diffusion theory [11], have been developed to accurately simulate possible skin reflections and the interaction between incident light and skin.

Surface Reflection Measurements:

Generally, it is difficult to measure the parameters of each skin layer despite the solid theoretical foundation provided by the BSSRDF. Therefore, researchers have set up 3D environments to measure surface reflection [12, 13]. More specifically, Angelopoulou [14] used a high-resolution, high-accuracy spectrograph to reveal that skin reflectance exhibits a local “W” pattern regardless of race and gender. We refer to these measurement methods as 3D measurement and 1D measurement, respectively. In the cosmetic industry, dedicated equipment has been developed to carry out image-difference-based 2D measurement to quantitatively evaluate the effect of products [15]. In summary, measuring skin surface reflection is difficult without the help of special instruments.

In this paper, a method that requires no special hardware, segmentation, or prior information is proposed to globally estimate the skin surface reflection component from a single color image, as shown in Fig. 1. We construct a complete skin imaging model by combining the Lambert-Beer law with the DRM, and then measure independent pigment concentration distributions to separate the skin surface reflection, which can further be decomposed into its RGB components, as shown at the bottom of Fig. 1.

Fig. 1. The surface reflection component is separated based on a single image. (Top): flowchart of the proposed algorithm. (Bottom, from left to right): input image, surface reflection component obtained by the proposed algorithm, resultant image after removing surface reflection, and the R-channel, G-channel and B-channel components of the surface reflection. (Color figure online)

2 Skin Reflection Model

2.1 Subsurface Reflection Model

The traditional Lambert-Beer law is usually written as:

$$ A = -\log(I/I_{0}) = \varepsilon\, l\, c $$
(1)

where A is the absorbance, which can be written as the product of the extinction coefficient ε, the path length l, and the concentration c of the absorbing species; I and I0 are the power of the transmitted and incident illumination, respectively. Dawson's experiments [16] show that skin absorbance can be represented as a linear combination of the absorbances of the dominant pigments, melanin and hemoglobin. Incorporating this into the Lambert-Beer law, and taking the skin baseline absorption and the residual pigment contribution A0 into account, the skin absorbance can be written as:

$$ A = \varepsilon_{\mathrm{m}} l_{\mathrm{m}} c_{\mathrm{m}} + \varepsilon_{\mathrm{h}} l_{\mathrm{h}} c_{\mathrm{h}} + A_{0} $$
(2)

where A is the skin absorbance, the subscripts m and h indicate melanin and hemoglobin, respectively, and c and l denote the pigment concentration and the skin layer thickness, respectively. Equation (2) is consistent with the additivity of the Lambert-Beer law. Inverting the logarithm in Eq. (1) then gives:

$$ I = \exp\{-(\varepsilon_{\mathrm{m}} l_{\mathrm{m}} c_{\mathrm{m}} + \varepsilon_{\mathrm{h}} l_{\mathrm{h}} c_{\mathrm{h}} + A_{0})\} \cdot I_{0} $$
(3)

Given the camera sensor spectral response function \( \psi(\lambda) \), the pixel value Pb at position (x, y) in the digital image is given by

$$ P_{\mathrm{b}}(x,y) = k\int I\,\psi(\lambda)\,\mathrm{d}\lambda = k\int \exp\{-(\varepsilon_{\mathrm{m}} l_{\mathrm{m}} c_{\mathrm{m}}(x,y) + \varepsilon_{\mathrm{h}} l_{\mathrm{h}} c_{\mathrm{h}}(x,y) + A_{0}(\lambda))\}\, I_{0}\,\psi(\lambda)\,\mathrm{d}\lambda $$
(4)

Here k is a camera gain constant. Under a reasonable assumption, the camera sensors can be considered narrowband and their responses treated as delta functions [17, 18]. Therefore, we have (x, y and λ are omitted for simplicity):

$$ P_{\mathrm{b}} = k\,\exp\{-(v_{\mathrm{m}} c_{\mathrm{m}} + v_{\mathrm{h}} c_{\mathrm{h}} + A_{0})\} \cdot I_{0} $$
(5)

where v = l · ε. Light in the real world can generally be decomposed into Planckian-type illuminants, and the Planckian SPD (Spectral Power Distribution) provides a good approximation of common illumination such as sunset light, halogen lamps, and tungsten lamps. The incident illumination is therefore written as a simplified Planckian radiator:

$$ I_{0} \approx g_{1} \lambda^{-5} \exp(-g_{2}/(T\lambda)) $$
(6)

where g1 and g2 are constants and T is the correlated color temperature. Taking the logarithm after combining (5) and (6) gives:

$$ -\log P_{\mathrm{b}} = K + \{v_{\mathrm{m}} c_{\mathrm{m}} + v_{\mathrm{h}} c_{\mathrm{h}} + A_{0}\} $$
(7)

where K(λ) = −log k − log g1 + 5 log λ + g2/(Tλ) collects the camera and illumination terms. More specifically, the subsurface model for every pixel in channel i (i = R, G, B) of a skin image is given by:

$$ -\log P_{\mathrm{b}}(x,y,i) = K(i) + \{v_{\mathrm{m}}(i) c_{\mathrm{m}}(x,y) + v_{\mathrm{h}}(i) c_{\mathrm{h}}(x,y) + A_{0}(i)\} $$
(8)
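To make the forward model concrete, the following minimal Python sketch evaluates Eq. (8) on a small synthetic patch. All coefficient values (v_m, v_h, A0, K) are hypothetical placeholders chosen only for illustration; real values depend on the pigment extinction spectra, layer thicknesses, camera, and illuminant, and are not taken from this paper.

```python
import numpy as np

# Hypothetical per-channel coefficients (R, G, B); for illustration only.
v_m = np.array([0.30, 0.60, 0.90])   # melanin, v = l * epsilon
v_h = np.array([0.10, 0.45, 0.55])   # hemoglobin
A0  = np.array([0.02, 0.02, 0.02])   # baseline / residual absorption, assumed small
K   = np.array([0.00, 0.00, 0.00])   # camera/illumination term (removed later by zero-mean)

# Synthetic pigment concentration maps on a small grid.
rng = np.random.default_rng(0)
c_m = rng.uniform(0.2, 0.8, size=(4, 4))
c_h = rng.uniform(0.1, 0.6, size=(4, 4))

# Eq. (8): -log P_b(x, y, i) = K(i) + v_m(i) c_m(x, y) + v_h(i) c_h(x, y) + A0(i)
neg_log_Pb = K + c_m[..., None] * v_m + c_h[..., None] * v_h + A0
P_b = np.exp(-neg_log_Pb)            # body-reflection pixel values, up to the gain k
print(P_b.shape)                     # (4, 4, 3)
```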

2.2 Complete Reflection Model

A small portion of the incident light is reflected at the surface, and the rest penetrates into the subsurface. After experiencing absorption, scattering, and transmission in the epidermis and dermis, most of the visible light that reaches the subcutis is reflected back to the upper layers due to the presence of fat.

As illustrated in Fig. 2, the complete skin reflection is composed of surface reflection and body reflection. Considered a quasi-transparent medium, skin can be well modeled by the Dichromatic Reflection Model. Taking the surface reflection pixel Ps into account, we define the pixel P in a skin image as follows:

Fig. 2. The complete reflection model of human skin.

$$ P(x,y,\lambda) = m_{\mathrm{b}}(x,y) P_{\mathrm{b}}(\lambda) + m_{\mathrm{s}}(x,y) P_{\mathrm{s}}(\lambda) $$
(9)

where mb and ms are geometric factors that encode information about the incident angle, shadowing, etc. Plugging Eq. (9) into Eq. (8) gives:

$$ -\log P = K + \{v_{\mathrm{m}} c_{\mathrm{m}} + v_{\mathrm{h}} c_{\mathrm{h}} + A_{0}\} + \log\{(1 - f m_{\mathrm{s}})/m_{\mathrm{b}}\} $$
(10)

where f(x, y, i) = Ps(x, y, i)/P(x, y, i) is defined as the surface reflectivity. K in the above equation can be removed by normalizing each channel to zero mean. Therefore, defining Z(x, y, i) = log P(x, y, i) and letting \( \overline{Z} \) denote its zero-mean version, we can rewrite Eq. (10) as:

$$ -\overline{Z} = \{v_{\mathrm{m}} c_{\mathrm{m}} + v_{\mathrm{h}} c_{\mathrm{h}} + A_{0}\} + \log\{(1 - f m_{\mathrm{s}})/m_{\mathrm{b}}\} $$
(11)
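The zero-mean normalization that removes K can be written as a short, self-contained Python sketch. The array `img` is an assumed float RGB skin ROI scaled to (0, 1]; nothing here is prescribed by the paper beyond taking the per-channel logarithm and subtracting the channel mean.

```python
import numpy as np

def zero_mean_log(img, eps=1e-6):
    """Compute Z_bar: the per-channel, zero-mean log image used in Eq. (11).

    `img` is assumed to be a float RGB skin ROI in (0, 1]; eps avoids log(0).
    """
    Z = np.log(np.clip(img, eps, None))               # Z = log P
    return Z - Z.mean(axis=(0, 1), keepdims=True)     # per-channel zero mean removes K

# Example with a random stand-in image (a real skin ROI would be used in practice).
img = np.random.default_rng(1).uniform(0.05, 1.0, size=(64, 64, 3))
Z_bar = zero_mean_log(img)
```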

3 Extracting Pigment Concentration Distribution

Potentially inconsistent illumination may lead to unexpected errors. We therefore use the channel differences (B − R) and (G − R) to minimize the log term in Eq. (11) and build 2-D mixture signals with noise:

$$ \Delta \mathbf{Z} = [\Delta \mathbf{V}_{\mathrm{m}}\;\; \Delta \mathbf{V}_{\mathrm{h}}\;\; \Delta \mathbf{N}]\,[\mathbf{c}_{\mathrm{m}}\;\; \mathbf{c}_{\mathrm{h}}\;\; \mathbf{1}]^{\mathrm{T}} $$
(12)

where

$$ \begin{aligned} \Delta \mathbf{Z} &= [\overline{Z}(\mathrm{R})/\overline{Z}(\mathrm{B}),\; \overline{Z}(\mathrm{R})/\overline{Z}(\mathrm{G})]^{\mathrm{T}} \\ \Delta \mathbf{V}_{\mathrm{m}} &= [v_{\mathrm{m}}(\mathrm{B}) - v_{\mathrm{m}}(\mathrm{R}),\; v_{\mathrm{m}}(\mathrm{G}) - v_{\mathrm{m}}(\mathrm{R})]^{\mathrm{T}} \\ \Delta \mathbf{V}_{\mathrm{h}} &= [v_{\mathrm{h}}(\mathrm{B}) - v_{\mathrm{h}}(\mathrm{R}),\; v_{\mathrm{h}}(\mathrm{G}) - v_{\mathrm{h}}(\mathrm{R})]^{\mathrm{T}} \\ \Delta \mathbf{N} &= \left[\log\frac{1 - f(\mathrm{R}) m_{\mathrm{s}}}{1 - f(\mathrm{B}) m_{\mathrm{s}}},\; \log\frac{1 - f(\mathrm{R}) m_{\mathrm{s}}}{1 - f(\mathrm{G}) m_{\mathrm{s}}}\right]^{\mathrm{T}} \end{aligned} $$

∆A0 is negligible and is eliminated, since research shows that A0 makes only a marginal contribution to skin absorption [16, 19]. The Pigment Concentration Distribution (PCD) c is then obtained by applying an ICA algorithm to Eq. (12), based on the knowledge that the presence of melanin in the epidermis and of hemoglobin in the dermis are mutually independent [18, 20]. Recalling the definition of v in Sect. 2.1, every member of the mixture vectors should be positive, because the extinction coefficients ε of both melanin and hemoglobin are relatively small in the red band of the visible spectrum; this is supported by numerous in-vivo studies [19, 21, 22]. Experiments show that skin images dominated by non-skin regions (background, hair, etc.) always lead to invalid mixture vectors. The visualized PCD maps are shown in Fig. 3.
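The paper does not specify a particular ICA implementation; the sketch below uses scikit-learn's FastICA as one plausible choice, builds the two mixture signals from the (B − R) and (G − R) log-channel differences described above (one reading of Eq. (12)), and resolves ICA's inherent sign ambiguity with the positivity constraint on the mixture vectors. The random `Z_bar` stand-in only keeps the snippet runnable; in practice it would come from the zero-mean log image of a skin ROI.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Stand-in for the zero-mean log image of a skin ROI (channels R, G, B).
rng = np.random.default_rng(0)
Z_bar = np.log(rng.uniform(0.05, 1.0, size=(64, 64, 3)))
Z_bar -= Z_bar.mean(axis=(0, 1), keepdims=True)
H, W, _ = Z_bar.shape

# Two mixture signals per pixel: the (B - R) and (G - R) log-channel differences.
dZ = np.stack([Z_bar[..., 2] - Z_bar[..., 0],
               Z_bar[..., 1] - Z_bar[..., 0]], axis=-1).reshape(-1, 2)

# ICA exploits the mutual independence of melanin and hemoglobin concentrations.
ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
sources = ica.fit_transform(dZ)        # candidate c_m, c_h (unordered, up to sign/scale)
mixing = ica.mixing_                   # candidate mixture vectors as columns

# Heuristic sign fix: flip components whose mixture vector is not positive overall.
for j in range(2):
    if mixing[:, j].sum() < 0:
        mixing[:, j] *= -1
        sources[:, j] *= -1

pcd_maps = sources.reshape(H, W, 2)    # melanin / hemoglobin PCD maps (assignment ambiguous)
```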

Fig. 3. Visualized PCD maps. Left: input image. Center: melanin PCD map. Right: hemoglobin PCD map. Notice that the freckles on the cheek appear only in the melanin PCD map, while the lip region shows the highest values in the hemoglobin PCD map. Refer to Sect. 3 in [23] for alternative verification methods. (Color figure online)

4 Surface Reflection Component

4.1 Solution of Mixture Vector

Considering that the terms A0 and log{(1 − f ms)/mb} in Eq. (11) are wavelength dependent, and possibly location dependent as well, we use a neighboring-pixel difference within a single channel to minimize the effects of wavelength and location when solving for v:

$$ \Delta P(x,y) = \overline{Z}(x,y) - \overline{Z}(x+1,y) = v_{\mathrm{m}} \Delta c_{\mathrm{m}} + v_{\mathrm{h}} \Delta c_{\mathrm{h}} + RES $$
(13)

where ∆c = c(x + 1, y) − c(x, y) and RES represents the combined residual difference of A0, f, etc. Its value is close to zero under the relaxed assumption that the incident illumination at neighboring pixels is very similar at the same wavelength. v can then be obtained by standard linear regression, followed by vector normalization. Our experiments support this assumption: RES is always close or equal to zero.
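Continuing from the previous sketches (reusing `Z_bar` and the hypothetical PCD maps), the mixture vectors can be estimated with ordinary least squares on horizontal neighboring differences, one regression per channel, followed by the vector normalization mentioned above. This is only one straightforward reading of Eq. (13), not necessarily the paper's exact implementation.

```python
import numpy as np

# Assumed assignment of the ICA outputs (melanin first, hemoglobin second).
c_m, c_h = pcd_maps[..., 0], pcd_maps[..., 1]

dZ_nb = Z_bar[:, :-1, :] - Z_bar[:, 1:, :]    # Z(x, y) - Z(x + 1, y), per channel
dcm   = c_m[:, 1:] - c_m[:, :-1]              # delta c_m = c_m(x + 1, y) - c_m(x, y)
dch   = c_h[:, 1:] - c_h[:, :-1]

X = np.stack([dcm.ravel(), dch.ravel()], axis=1)   # regressors [delta c_m, delta c_h]
v = np.empty((3, 2))
for i in range(3):                                  # ordinary least squares per channel
    v[i], *_ = np.linalg.lstsq(X, dZ_nb[..., i].ravel(), rcond=None)

v /= np.linalg.norm(v, axis=0, keepdims=True)       # normalize each mixture vector
v_m_est, v_h_est = v[:, 0], v[:, 1]                 # per-channel v_m and v_h estimates
```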

4.2 Reflectivity Calculation

Equation 11 can be rewritten as follows:

$$ F = -\overline{Z} - \{v_{\mathrm{m}} c_{\mathrm{m}} + v_{\mathrm{h}} c_{\mathrm{h}}\} $$
(14)

where F(i) = A0(i) + log{(1 − f(i) ms)/mb}. We then have:

$$ f = (1 - m_{\mathrm{b}}\,\exp(F - A_{0}(i)))/m_{\mathrm{s}} $$
(15)

A0 is negligible compared with the absorption of the dominant pigments; it can reasonably be treated as a constant in each channel and removed by zero-mean normalization. Using \( \overline{F} \) to represent the normalized F, the surface reflectivity f can then be estimated by:

$$ f \propto -\frac{m_{\mathrm{b}}}{m_{\mathrm{s}}}\,\exp(\overline{F}) $$
(16)

We take the extreme situation and approximately eliminate the ratio mb/ms by unit-standard-deviation normalization. It is then straightforward to obtain f by combining Eqs. (14) and (16), since P, v and c are already known. The following flowchart demonstrates the detailed steps to calculate the reflectivity f and the final SRC.

Figure a. Flowchart of the detailed steps to calculate the reflectivity f and the final SRC.
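As a complement to the flowchart, here is a minimal end-to-end sketch of Sect. 4.2 under the stated normalization assumptions. The function name `estimate_src` and the final rescaling of f to [0, 1] for visualization are our own choices, not part of the paper; the sign and scale of f are only defined up to the zero-mean and unit-standard-deviation normalizations.

```python
import numpy as np

def estimate_src(img, c_m, c_h, v_m_est, v_h_est, eps=1e-6):
    """Sketch of Eqs. (14)-(16): per-channel reflectivity f and the SRC.

    img:              float RGB skin ROI in (0, 1]
    c_m, c_h:         pigment concentration maps from the ICA step
    v_m_est, v_h_est: per-channel mixture vectors from the regression step
    The returned maps are relative quantities, valid up to the normalizations
    described in Sect. 4.2, not absolute reflectivities.
    """
    Z = np.log(np.clip(img, eps, None))
    Z_bar = Z - Z.mean(axis=(0, 1), keepdims=True)

    # Eq. (14): F = -Z_bar - (v_m c_m + v_h c_h), evaluated per channel
    F = -Z_bar - (c_m[..., None] * v_m_est + c_h[..., None] * v_h_est)

    # Zero-mean, unit-std normalization absorbs A0 and the m_b/m_s ratio (Sect. 4.2)
    F_bar = (F - F.mean(axis=(0, 1), keepdims=True)) / (F.std(axis=(0, 1), keepdims=True) + eps)

    f = np.exp(F_bar)                               # Eq. (16), up to sign and scale
    f = (f - f.min()) / (f.max() - f.min() + eps)   # rescale to [0, 1] for visualization
    src = f * img                                   # SRC: P_s = f * P, per channel
    return f, src
```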

5 Experiments and Discussion

5.1 Experimental Setup

According to Sect. 2.2, the Surface Reflection Component (SRC) can be obtained if both the overall reflection and the Body Reflection Component (BRC) are known. It is impossible to capture the BRC with a consumer digital camera. However, a polarizer is a good option for separating surface reflection under polarized illumination, with a pair of orthogonally polarized filters placed over the light source and the camera lens, respectively. Cross-polarization technology is widely used in industry to capture specular-free skin images. We hereby denote the cross-polarized image as XPOL and the parallel-polarized image as PPOL, and define SURF as the difference between PPOL and XPOL [14, 23], which is then treated as the ground truth of the surface reflection. TRUVU®, a skin-imaging device for cosmetic analysis developed by Johnson & Johnson, is used to capture PPOL and XPOL. Figure 4 shows the final experimental setup.

Fig. 4. Schematic diagram of defining the ground truth of the SRC. Two light sources are placed at both sides of the object at a fixed angle. The camera captures PPOL and XPOL under parallel and perpendicular polarization, respectively, by changing the polarization orientation.
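For reference, computing the SURF ground truth from a polarized pair is a simple per-pixel difference; the sketch below assumes the PPOL and XPOL captures are float RGB arrays that are spatially registered and taken under identical illumination.

```python
import numpy as np

def surf_ground_truth(ppol, xpol):
    """SURF = PPOL - XPOL: ground-truth surface reflection from a polarized pair.

    ppol, xpol: registered float RGB images captured under parallel and cross
    polarization; negative values from noise or slight misalignment are clipped.
    """
    return np.clip(ppol.astype(np.float64) - xpol.astype(np.float64), 0.0, None)
```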

5.2 Algorithm Verification

The original images captured by our system are 24-bit color images with a resolution of 2592 × 2888. A skin-dominant region is selected as the ROI for analysis, since non-skin regions (background, eyebrows, eyelashes, hair, etc.) are not subject to the Lambert-Beer law and the DRM. The method can easily be extended to take the original image as input once skin regions are located. The SRC in Fig. 5 shows that the estimated reflectance distribution is consistent with the ground truth except for non-skin regions. In Fig. 5, columns (d–f) show the three channel components of the SRC. The perceivable differences of the SRC across the RGB channels indicate that surface reflection is wavelength dependent. However, the proposed algorithm still cannot estimate the chromaticity of the illumination well, as shown by the obvious color difference between the SRC and the ground truth.

Fig. 5. Algorithm verification by comparison with the ground truth. (a) Input image. (b) SRC obtained by our algorithm. (c) Ground-truth SURF, the difference between PPOL and XPOL. (d–f) R, G and B channel components of the SRC. Notice that the SRC differs across the R, G and B channels. (Color figure online)
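The paper does not state which metric it uses to compare the estimated SRC with the ground truth; the following sketch shows one plausible choice, the Pearson correlation between the gray-scale SRC and SURF inside a given boolean skin mask.

```python
import numpy as np

def src_correlation(src_est, surf_gt, skin_mask):
    """Pearson correlation between the estimated SRC and the SURF ground truth.

    src_est, surf_gt: (H, W, 3) float images; skin_mask: (H, W) boolean ROI.
    Both images are reduced to their channel mean (gray scale) before comparison.
    """
    a = src_est.mean(axis=-1)[skin_mask]
    b = surf_gt.mean(axis=-1)[skin_mask]
    return np.corrcoef(a, b)[0, 1]
```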

Figure 6 demonstrates another example application in the cosmetic industry. To test how the SRC varies after applying a brightening cosmetic product, two images were taken before and after applying the skin care product under identical illumination. An obvious increase in surface reflection can be observed on the cheek. We can therefore design image-based tools to quantitatively evaluate cosmetic products and, more specifically, accurately locate the skin regions where the product has a significant effect.

Fig. 6. Experiment showing how the SRC varies before and after applying a cosmetic product. (a) Image taken before applying the skin care product. (b) Image taken after applying the skin care product for 8 weeks. (c–d) SRC of (a) and (b), respectively. (Color figure online)

6 Conclusions

We have introduced a method to extract the skin surface reflection using only a single image. This is achieved by first introducing an inter-channel quotient in log space to estimate the PCD maps. Next, we use neighboring-pixel differences to solve for the mixture vectors. Finally, we determine the surface reflectivity in every color channel to obtain the overall surface reflection component. The experiments in Sect. 5 show a high correlation between the estimated results and the ground truth. However, the proposed algorithm cannot yet estimate the illumination chromaticity well. In addition, shadow and non-skin regions in the estimated SRC remain inconsistent with the real image. We plan to address these problems in future work.