
1 Background

Human observers can recognize material properties at a glance through the eyes alone. Without touching materials, we can tell whether they would feel hard or soft, rough or smooth, wet or dry.

Material perception is a perceptual phenomenon in which our brain derives feelings or sensations from the optical image projected onto the retina. However, it is hard to untangle what information in the retinal image stimulates the visual cortex and how it induces the feeling of material in our brain. The mechanism of this INNER VISION in the brain is still a black box at present [1].

As a framework for material perception, Tsumura pioneered work on skin color appearance and proposed the concept of an appearance-delivering system [2].

In the Brain Information Science research on SHITSUKAN by MEXT in Japan, the first stage (2010–2014, led by Dr. H. Komatsu) has just finished, and the second stage (2015–2019, led by Dr. S. Nishida) stepped forward into "multi-dimensional" material perception and is now approaching its final goal.

In spite of the complexity of the material appearance mechanism, human sensations such as "glossy/matte", "transparent/translucent", and "metal/cloth" can be controlled by an intuitive yet smart technique.

For instance, Motoyoshi, Nishida, et al. [3] noticed that the perception of "gloss" appears when the luminance histogram is skewed: if the histogram is stretched smoothly toward higher luminance, the object looks "glossy"; if it is compressed toward lower luminance, it looks "matte".

Sawayama and Nishida [4] developed a "wet" filter from a combination of an exponent-shaped TRC (tone reproduction curve) and boosted color saturation. It is intriguing that such a "skew" in the image features induces a sensational material perception. The discovery of the "skew" effect seems heuristic and intuitive. However, the mechanism by which sensations such as "gloss" or "wet" are activated by the "skew" effect in INNER VISION has not been untangled yet.
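
To make the "skew" effect concrete, the following is a minimal NumPy sketch of a skew-style manipulation: an exponent-shaped TRC applied to luminance, which pushes the histogram mass toward dark values and leaves a sparse highlight tail. It is only an illustration in the spirit of [3, 4], not the authors' published filters; the function name and the simple luma weights are this sketch's own choices.

```python
import numpy as np

def skew_luminance(rgb, gamma=2.2):
    """Skew the luminance histogram with a power-law TRC.

    gamma > 1 yields a positively skewed histogram (a "glossier" look);
    gamma < 1 compresses toward high luminance (a more "matte" look).
    Illustrative only -- not the exact manipulation of [3] or [4].
    rgb: float array in [0, 1], shape (H, W, 3).
    """
    lum = rgb @ np.array([0.299, 0.587, 0.114])   # simple luma
    lum_new = lum ** gamma                        # exponent-shaped TRC
    scale = lum_new / np.maximum(lum, 1e-6)       # per-pixel gain
    return np.clip(rgb * scale[..., None], 0.0, 1.0)
```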

On the other hand, much R&D for practical applications is making steady progress in private enterprises. As a typical successful example, a specular reflection control algorithm based on the BRDF (Bidirectional Reflectance Distribution Function) was implemented in an LSI chip and mounted on the next-generation 4K HD TV "REGZA" [5].

2 Color Transfer Model Between Images

Since material perceptions such as gloss or clarity are related to a variety of factors [6], it is hard to attribute the perceptual feeling to a single cause. Nevertheless, trials on transferring material or texture appearance between CG images [7] or 3D objects [8] have been reported. Color appearance, in particular, plays an important role in material perception. The color transfer model [9] tried to change the color atmosphere of a source scene A into that of a target scene B, where the clustered color distribution of A is roughly matched to that of B. There, the use of the vision-based lαβ color space [10] attracted interest.

2.1 lαβ Color Transfer Model

lαβ is known as an orthogonal luminance-chrominance color space, simply transformed from RGB by Steps 1 and 2 below. The color distribution of the source image is matched to that of the target (reference) image by the scaling in Step 3, and the color atmosphere of the target is transferred to the source via the inverse transform in Step 4, as follows.

  • Step 1: RGB to LMS cone response transform

    $$ \begin{bmatrix} L \\ M \\ S \end{bmatrix} = \begin{bmatrix} 0.381 & 0.578 & 0.040 \\ 0.197 & 0.724 & 0.078 \\ 0.024 & 0.129 & 0.844 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} $$
    (1)
  • Step 2: LMS to lαβ transform with orthogonal luminance l and chrominances α, β

    $$ \begin{bmatrix} l \\ \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} 1/\sqrt{3} & 0 & 0 \\ 0 & 1/\sqrt{6} & 0 \\ 0 & 0 & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & -2 \\ 1 & -1 & 0 \end{bmatrix} \begin{bmatrix} \log L \\ \log M \\ \log S \end{bmatrix} $$
    (2)
  • Step 3: Scaling of lαβ around the mean values \( \{\bar{l}, \bar{\alpha}, \bar{\beta}\} \) by the ratios of standard deviations, to match the color distributions of the source and target images.

    $$ \begin{aligned} l' &= (\sigma_{DST}^{l}/\sigma_{ORG}^{l})\,(l - \bar{l}) \\ \alpha' &= (\sigma_{DST}^{\alpha}/\sigma_{ORG}^{\alpha})\,(\alpha - \bar{\alpha}) \\ \beta' &= (\sigma_{DST}^{\beta}/\sigma_{ORG}^{\beta})\,(\beta - \bar{\beta}) \end{aligned} $$
    (3)

    where \( \sigma_{ORG}^{l} \) denotes the standard deviation of luminance l for the source image, \( \sigma_{DST}^{\alpha} \) that of chrominance α for the target image, and so on.

  • Step 4: Inverse transform \( [l'\,\alpha'\,\beta'] \Rightarrow [L'\,M'\,S'] \Rightarrow [R'\,G'\,B'] \).

    Finally, the scaled \( l'\alpha'\beta' \) source image, with its color distribution matched to that of the target, is displayed on an sRGB monitor.
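
The four steps above amount to the Reinhard-style transfer of [9]. A minimal NumPy sketch follows, assuming float RGB images in [0, 1]. Eq. (3) scales around the means; the target means are added back before the inverse transform, as in [9], and a small floor on LMS avoids log(0).

```python
import numpy as np

RGB2LMS = np.array([[0.381, 0.578, 0.040],
                    [0.197, 0.724, 0.078],
                    [0.024, 0.129, 0.844]])                    # Eq. (1)
LMS2LAB = (np.diag([1 / np.sqrt(3), 1 / np.sqrt(6), 1 / np.sqrt(2)])
           @ np.array([[1, 1, 1], [1, 1, -2], [1, -1, 0]]))    # Eq. (2)

def rgb_to_lab(rgb):
    """Steps 1 and 2: RGB -> LMS -> log -> l-alpha-beta."""
    lms = np.maximum(rgb.reshape(-1, 3) @ RGB2LMS.T, 1e-6)
    return np.log10(lms) @ LMS2LAB.T

def lab_to_rgb(lab, shape):
    """Step 4: inverse transform back to RGB."""
    lms = 10.0 ** (lab @ np.linalg.inv(LMS2LAB).T)
    return (lms @ np.linalg.inv(RGB2LMS).T).reshape(shape)

def lab_transfer(src, dst):
    """Step 3: match per-channel mean/std (Eq. 3), then invert."""
    a, b = rgb_to_lab(src), rgb_to_lab(dst)
    a = (a - a.mean(0)) * (b.std(0) / a.std(0)) + b.mean(0)
    return np.clip(lab_to_rgb(a, src.shape), 0.0, 1.0)
```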

2.2 PCM Color Transfer Model

Prior to the lαβ model, the author et al. developed the PCM (Principal Component Matching) method [11, 12] for transferring the color atmosphere of one scene to another, as illustrated in Fig. 1. The lαβ model works well between scenes with color similarity, but it often fails for scenes with color dissimilarity. In contrast, the PCM model works almost stably even between scenes with color dissimilarities, and it was advanced toward automatic scene color interchange [13,14,15].

Fig. 1. Concept of PCM color transfer model between images

In our basic object-to-object PCM model, a color vector X in a color cluster is projected onto a vector Y in PC space by the Hotelling transform as

$$ \boldsymbol{Y} = \boldsymbol{A}\left(\boldsymbol{X} - \boldsymbol{\mu}\right) $$
(4)

where \( \boldsymbol{\mu} \) denotes the mean vector, and the matrix A is formed from the set of eigenvectors \( \{\boldsymbol{e}_1, \boldsymbol{e}_2, \boldsymbol{e}_3\} \) of the covariance matrix \( \Sigma_X \) as

$$ \boldsymbol{A} = \left[\boldsymbol{e}_1\ \boldsymbol{e}_2\ \boldsymbol{e}_3\right] $$
(5)

The covariance matrix \( \Sigma_Y \) of {Y} is diagonalized in terms of A and \( \Sigma_X \), with its diagonal elements given by the eigenvalues \( \{\lambda_1, \lambda_2, \lambda_3\} \) of \( \Sigma_X \):

$$ \Sigma_Y = \boldsymbol{A}\,\Sigma_X\,\boldsymbol{A}^{t} = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix} $$
(6)

Thus the color vectors of the source and target images are mapped into the same PC space, and the following equations are formed to match a source vector \( \boldsymbol{Y}_{ORG} \) to a target vector \( \boldsymbol{Y}_{DST} \) through the scaling matrix S.

$$ \boldsymbol{Y}_{DST} = \boldsymbol{A}_{DST}\left(\boldsymbol{X}_{DST} - \boldsymbol{\mu}_{DST}\right) \quad\text{and}\quad \boldsymbol{Y}_{ORG} = \boldsymbol{A}_{ORG}\left(\boldsymbol{X}_{ORG} - \boldsymbol{\mu}_{ORG}\right) $$
(7)
$$ \boldsymbol{Y}_{DST} = \boldsymbol{S}\,\boldsymbol{Y}_{ORG} $$
(8)
$$ \boldsymbol{S} = \begin{bmatrix} \sqrt{\lambda_{1DST}/\lambda_{1ORG}} & 0 & 0 \\ 0 & \sqrt{\lambda_{2DST}/\lambda_{2ORG}} & 0 \\ 0 & 0 & \sqrt{\lambda_{3DST}/\lambda_{3ORG}} \end{bmatrix} $$
(9)

Solving (7) and (8), we obtain the following relation between a source color \( \boldsymbol{X}_{ORG} \) and the matched target color \( \boldsymbol{X}_{DST} \).

$$ \boldsymbol{X}_{DST} - \boldsymbol{\mu}_{DST} = \boldsymbol{M}_{PCM}\left(\boldsymbol{X}_{ORG} - \boldsymbol{\mu}_{ORG}\right) $$
(10)

The matching matrix \( \boldsymbol{M}_{PCM} \) is given by

$$ \boldsymbol{M}_{PCM} = \left(\boldsymbol{A}_{DST}^{-1}\right)\left(\boldsymbol{S}\right)\left(\boldsymbol{A}_{ORG}\right) $$
(11)

where \( \boldsymbol{A}_{ORG} \) and \( \boldsymbol{A}_{DST} \) denote the eigenvector matrices for the source and target color clusters. In the scaling matrix S, \( \lambda_{1ORG} \) denotes the 1st eigenvalue of the source, \( \lambda_{2DST} \) the 2nd eigenvalue of the target, and so on; these are obtained from the respective covariance matrices.

In general, the PCM model works better than lαβ even for scenes with color dissimilarities, because it uses the statistical characteristics of the covariance matrix.
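
Eqs. (4)-(11) translate directly into NumPy, as sketched below for float RGB pixels in [0, 1] flattened to (N, 3) clusters. Note that np.linalg.eigh returns eigenvalues in ascending order for both images, so the eigenvalue ratios of Eq. (9) stay consistently paired; eigenvector sign and ordering remain ambiguous in practice, which is one source of the PC-axis mismatches discussed in Sect. 5. The function names are this sketch's own.

```python
import numpy as np

def pcm_matrix(src_pix, dst_pix):
    """M_PCM of Eq. (11) from two (N, 3) pixel clusters."""
    mu_o, mu_d = src_pix.mean(0), dst_pix.mean(0)
    lam_o, E_o = np.linalg.eigh(np.cov(src_pix.T))  # Sigma_X eigenpairs
    lam_d, E_d = np.linalg.eigh(np.cov(dst_pix.T))
    A_o, A_d = E_o.T, E_d.T                         # eigenvectors as rows
    S = np.diag(np.sqrt(lam_d / np.maximum(lam_o, 1e-12)))   # Eq. (9)
    return np.linalg.inv(A_d) @ S @ A_o, mu_o, mu_d          # Eq. (11)

def pcm_transfer(src, dst):
    """Apply Eq. (10): X_DST = M_PCM (X_ORG - mu_ORG) + mu_DST."""
    M, mu_o, mu_d = pcm_matrix(src.reshape(-1, 3), dst.reshape(-1, 3))
    out = (src.reshape(-1, 3) - mu_o) @ M.T + mu_d
    return np.clip(out.reshape(src.shape), 0.0, 1.0)
```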

Figure 2 shows an example in which both the lαβ and PCM models succeed for images with color similarity. In the case of Fig. 3, however, lαβ fails to change the color atmosphere of A into that of B due to their color dissimilarity, whereas PCM works well.

Fig. 2. Successful example of color transfer between images with color similarity

Fig. 3. Comparison of lαβ vs. PCM models for images with color dissimilarity

3 Color Transfer by Spectral Decomposition of Covariance

Following the lαβ model, a variety of improved or alternative color transfer models have been reported. As a basic drawback of the lαβ model, Pitié et al. [16] pointed out that it is based not on the full statistical covariance but only on the means and variances along the lαβ axes. Hence the PCM model is better than lαβ because it uses the full covariance matrix \( \Sigma_X \) with the Hotelling transform onto PC space. Pitié also suggested making use of an orthogonal spectral decomposition, noting the Hermitian (self-adjoint) property of the symmetric matrix \( \Sigma_X \), whose eigenvalues are real.

3.1 Eigenvalue Decomposition (EVD) of Covariance

In general, the covariance matrix Σ of a clustered color distribution in an image is a real symmetric matrix. The square roots of Σ for the source and target images are decomposed by their eigenvalues as

$$ \Sigma_{ORG}^{1/2} = \boldsymbol{A}_{ORG}^{-1}\boldsymbol{D}_{ORG}^{1/2}\boldsymbol{A}_{ORG} \quad\text{and}\quad \Sigma_{DST}^{1/2} = \boldsymbol{A}_{DST}^{-1}\boldsymbol{D}_{DST}^{1/2}\boldsymbol{A}_{DST} $$
(12)

\( \boldsymbol{A}_{ORG} \) and \( \boldsymbol{A}_{DST} \) denote the eigenvector matrices for the source and target images, and \( \boldsymbol{D}_{ORG} \) and \( \boldsymbol{D}_{DST} \) are the diagonal matrices whose entries are the respective eigenvalues.

$$ \boldsymbol{D}_{ORG} = \begin{bmatrix} \lambda_{1ORG} & 0 & 0 \\ 0 & \lambda_{2ORG} & 0 \\ 0 & 0 & \lambda_{3ORG} \end{bmatrix}, \quad \boldsymbol{D}_{DST} = \begin{bmatrix} \lambda_{1DST} & 0 & 0 \\ 0 & \lambda_{2DST} & 0 \\ 0 & 0 & \lambda_{3DST} \end{bmatrix} $$
(13)

Now, the color matching matrix \( \boldsymbol{M}_{Eigen} \) corresponding to Eq. (11) is given by

$$ \begin{aligned} \boldsymbol{M}_{Eigen} &= \Sigma_{DST}^{1/2}\,\Sigma_{ORG}^{-1/2} \\ &= \left(\boldsymbol{A}_{DST}^{-1}\boldsymbol{D}_{DST}^{1/2}\boldsymbol{A}_{DST}\right)\left(\boldsymbol{A}_{ORG}^{-1}\boldsymbol{D}_{ORG}^{1/2}\boldsymbol{A}_{ORG}\right)^{-1} \\ &= \left(\boldsymbol{A}_{DST}^{-1}\boldsymbol{D}_{DST}^{1/2}\boldsymbol{A}_{DST}\right)\left(\boldsymbol{A}_{ORG}^{-1}\boldsymbol{D}_{ORG}^{-1/2}\boldsymbol{A}_{ORG}\right) \end{aligned} $$
(14)
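
Since the covariance is symmetric, its eigenvector matrix is orthogonal and Eq. (14) reduces to a product of two matrix square roots. A minimal sketch via np.linalg.eigh follows (the function names are this sketch's own):

```python
import numpy as np

def sqrtm_psd(sigma, inverse=False):
    """(Inverse) square root of a PSD covariance by EVD, Eq. (12)."""
    lam, E = np.linalg.eigh(sigma)         # sigma = E diag(lam) E^T
    lam = np.maximum(lam, 1e-12)           # guard tiny negative roundoff
    return E @ np.diag(lam ** (-0.5 if inverse else 0.5)) @ E.T

def m_eigen(sigma_org, sigma_dst):
    """Color matching matrix of Eq. (14)."""
    return sqrtm_psd(sigma_dst) @ sqrtm_psd(sigma_org, inverse=True)
```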

3.2 Singular Value Decomposition (SVD)

An m × n matrix Σ is decomposed by SVD into the product of matrices U, W, and V:

$$ \Sigma = \boldsymbol{U}\boldsymbol{W}\boldsymbol{V} $$
(15)

where U and V are m × m and n × n orthogonal matrices, respectively. If Σ is an m × n rectangular matrix of rank r, W is composed of an r × r diagonal matrix whose entries are the singular values, padded with null matrices.

Because the covariance Σ is a 3 × 3 real symmetric (positive semi-definite) matrix, the singular values equal the eigenvalues, and the SVD coincides with the EVD of Eq. (12).
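
This equivalence is easy to verify numerically. The snippet below builds an arbitrary 3 × 3 covariance and checks that its singular values match its eigenvalues sorted in descending order:

```python
import numpy as np

sigma = np.cov(np.random.default_rng(0).random((500, 3)).T)  # PSD 3x3
w = np.linalg.svd(sigma, compute_uv=False)        # singular values
lam = np.linalg.eigvalsh(sigma)[::-1]             # eigenvalues, descending
assert np.allclose(w, lam)
```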

3.3 Cholesky Decomposition

The Cholesky decomposition, a compact matrix factorization method, decomposes the covariance Σ into the product of a lower triangular matrix and its transpose as follows.

$$ \begin{aligned} \Sigma_{ORG} &= \boldsymbol{L}_{ORG}\boldsymbol{L}_{ORG}^{T} \quad \text{for } \boldsymbol{L}_{ORG} = \mathrm{Chol}\left[\Sigma_{ORG}\right]^{T} \\ \Sigma_{DST} &= \boldsymbol{L}_{DST}\boldsymbol{L}_{DST}^{T} \quad \text{for } \boldsymbol{L}_{DST} = \mathrm{Chol}\left[\Sigma_{DST}\right]^{T} \end{aligned} $$
(16)

where Chol[·] denotes the Cholesky decomposition and the superscript T the transpose. The lower triangular matrix L is obtained by an iteration similar to the Gaussian elimination method (details omitted).

The color matching matrix \( \boldsymbol{M}_{Chol} \) that transfers the color atmosphere of the target image to the source is given by

$$ \boldsymbol{M}_{Chol} = \boldsymbol{L}_{DST}\left(\boldsymbol{L}_{ORG}\right)^{-1} $$
(17)
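
In NumPy, np.linalg.cholesky already returns the lower triangular factor with \( \Sigma = \boldsymbol{L}\boldsymbol{L}^{T} \), so Eqs. (16)-(17) become a few lines; the transpose in Eq. (16) reflects conventions such as MATLAB's chol, which returns the upper factor. A minimal sketch:

```python
import numpy as np

def m_chol(sigma_org, sigma_dst):
    """Color matching matrix of Eq. (17)."""
    L_org = np.linalg.cholesky(sigma_org)   # sigma = L @ L.T, L lower
    L_dst = np.linalg.cholesky(sigma_dst)
    return L_dst @ np.linalg.inv(L_org)
```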

4 Color Transfer by PCM After Mapping to Visual Cortex

4.1 Retina to Visual Cortex Mapping by Log Polar Transform

The PCM model works well for transferring the color atmosphere between images even with color dissimilarities. However, it does not take any human visual characteristics into account. In this paper, a striking feature of the spatial color distribution in the visual cortex image is introduced to improve the performance of PCM.

The mapping from retina to visual cortex is mathematically described by Schwartz's complex Logarithmic Polar Transform (LPT) [17].

The complex vector z pointing to a pixel located at (x, y) on the retina is transformed to a new vector log(z) by LPT as follows.

$$ \begin{aligned} z &= x + jy = \rho e^{j\theta}\,; \quad \rho = |z|,\ \theta = \tan^{-1}(y/x) \\ \log(z) &= u + jv = \log(\rho) + j\theta\,; \quad j = \sqrt{-1} \end{aligned} $$
(18)
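
As a sketch, Eq. (18) is essentially a one-liner on complex coordinates; the CBS regulation of Sect. 4.2 is what keeps ρ away from zero in practice:

```python
import numpy as np

def log_polar_coords(x, y):
    """Map retinal (x, y) to cortical (u, v) by Eq. (18)."""
    z = x + 1j * y                           # z = rho * e^{j theta}
    return np.log(np.abs(z)), np.angle(z)    # (u, v) = (log rho, theta)
```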

The retinal image is sampled at a spatially-variant resolution on the polar coordinates (ρ, θ): in the radial direction, finely in the fovea but more coarsely toward the periphery according to the logarithm of ρ; in the angular direction, at a constant pitch Δθ. The samples are stored at the coordinates (u, v) in the striate cortex V1. Figure 4 sketches how the retinal image is sampled, stored in the striate cortex, and played back to the retina.

Fig. 4. Outline of spatially-variant mapping from retina to visual cortex

4.2 Discrete Log Polar Transform

In the discrete LPT system, (ρ, θ) is digitized into R rings and S sectors. The striate cortex image is stored in the new Cartesian coordinates (u, v) as

$$ \begin{aligned} (u, v) &\triangleq \{\rho(u), \theta(v)\} \\ \rho(u) &= \rho_0\, a^{u} \quad \text{for } \rho \ge \rho_0,\ u = 1, 2, \cdots, R \\ a &= \exp\left[\log\left(\rho_{max}/\rho_0\right)/R\right] \\ \theta(v) &= v\,\Delta\theta = (2\pi/S)\, v \quad \text{for } v = 1, 2, \cdots, S \end{aligned} $$
(19)

ρ0 denotes the radius of the blind spot, and the condition ρ ≥ ρ0 prevents points near the origin from being mapped to the negative infinite point. This regulation is called the CBS (Central Blind Spot) model; a sampling sketch follows below. Figure 5 illustrates how the image "sunflower" is sampled on the LPT lattice, transformed to the striate cortex image, and stored at the coordinates (u, v).
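
A minimal nearest-neighbour implementation of this sampling is sketched below; placing the viewpoint at the image centre and setting ρmax to the inscribed radius are this sketch's assumptions, as is the function name.

```python
import numpy as np

def lpt_sample(img, R=64, S=128, rho0=2.0):
    """Sample img on the LPT lattice of Eq. (19) (CBS model).

    Returns the (R, S, 3) striate-cortex image: ring u holds the colors
    at radius rho0 * a**u, sector v those at angle 2*pi*v/S.
    """
    h, w = img.shape[:2]
    cy, cx = h / 2.0, w / 2.0                      # assumed viewpoint
    a = np.exp(np.log(min(cx, cy) / rho0) / R)     # so that rho(R) = rho_max
    rho = rho0 * a ** np.arange(1, R + 1)          # (R,) ring radii
    theta = 2.0 * np.pi * np.arange(1, S + 1) / S  # (S,) sector angles
    xs = cx + rho[:, None] * np.cos(theta)         # (R, S) sample grid
    ys = cy + rho[:, None] * np.sin(theta)
    xi = np.clip(np.round(xs).astype(int), 0, w - 1)
    yi = np.clip(np.round(ys).astype(int), 0, h - 1)
    return img[yi, xi]
```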

Fig. 5. Image "sunflower" sampled on the LPT lattice, transformed, and stored in the striate cortex

The height h(u) and width w(u) of a unit cell between rings u and u + 1 are given by the following equations; hence the area α(u) of the unit cell increases exponentially with u.

$$ \begin{aligned} h(u) &= \rho(u+1) - \rho(u) = \rho_0 (a - 1)\, a^{u} \\ w(u) &= \tfrac{1}{2}(2\pi/S)\left\{\rho(u+1) + \rho(u)\right\} = (\pi/S)(1 + a)\, a^{u} \rho_0 \\ \alpha(u) &= h(u)\, w(u) = \pi \rho_0^{2} \left(a^{2} - 1\right) a^{2u} S^{-1} \end{aligned} $$
(20)
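
The third line of Eq. (20) follows from the first two by simple algebra, which a few lines of NumPy confirm for arbitrary test parameters:

```python
import numpy as np

rho0, a, S = 2.0, 1.08, 128                      # arbitrary test values
u = np.arange(1, 10)
h = rho0 * (a - 1) * a ** u                      # cell height, Eq. (20)
w = (np.pi / S) * (1 + a) * a ** u * rho0        # cell width, Eq. (20)
assert np.allclose(h * w, np.pi * rho0**2 * (a**2 - 1) * a**(2 * u) / S)
```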

As seen in Fig. 5, color is sampled finely at the center but more coarsely toward the periphery; the pixels in the yellow petals occupy a larger area than the peripheral ones. This spatially-variant characteristic, which concentrates the color information around the viewpoint, is reflected in the population density of the color distribution of the striate cortex image.

Figure 6 is another example, for a pink rose "cherry shell". It shows how the color distribution is concentrated on the pinkish petal area around the central viewpoint in the striate cortex image. Hence it is better to apply PCM not to the original image but to the striate cortex image after LPT, so that the color matching works more effectively for the object of attention.

Fig. 6. Spatially-variant color concentration effect in the striate cortex image by LPT

Now the basic PCM matrix \( \boldsymbol{M}_{PCM} \) of Eq. (11) is applied to the covariance after LPT, and we obtain the following new color transfer matrix.

$$ \begin{aligned} \boldsymbol{M}_{LPTPCM} &= \left({}_{LPT}\boldsymbol{A}_{DST}^{-1}\right)\left({}_{LPT}\boldsymbol{S}\right)\left({}_{LPT}\boldsymbol{A}_{ORG}\right) \\ {}_{LPT}\boldsymbol{S} &= \begin{bmatrix} \sqrt{{}_{LPT}\lambda_{1DST}/{}_{LPT}\lambda_{1ORG}} & 0 & 0 \\ 0 & \sqrt{{}_{LPT}\lambda_{2DST}/{}_{LPT}\lambda_{2ORG}} & 0 \\ 0 & 0 & \sqrt{{}_{LPT}\lambda_{3DST}/{}_{LPT}\lambda_{3ORG}} \end{bmatrix} \end{aligned} $$
(21)
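
Putting the pieces together, the sketch below estimates the PCM statistics of Eq. (21) on the striate-cortex (LPT) images, where colors near the viewpoint dominate, and then applies the resulting matrix to the source pixels. It reuses lpt_sample() and pcm_matrix() from the earlier sketches; applying the matrix to the full-resolution source, rather than inverting the LPT, is this sketch's assumption and not a detail given in the text.

```python
import numpy as np

def lptpcm_transfer(src, dst, R=64, S=128, rho0=2.0):
    """LPTPCM sketch: PCM statistics from LPT images, Eq. (21)."""
    cortex_src = lpt_sample(src, R, S, rho0).reshape(-1, 3)
    cortex_dst = lpt_sample(dst, R, S, rho0).reshape(-1, 3)
    M, mu_o, mu_d = pcm_matrix(cortex_src, cortex_dst)
    out = (src.reshape(-1, 3) - mu_o) @ M.T + mu_d   # Eq. (10), LPT stats
    return np.clip(out.reshape(src.shape), 0.0, 1.0)
```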

Figure 7 illustrates the color transfer process of the LPTPCM model. In this example, both the source image A and the target image B are first transformed to visual cortex images by LPT; then the clustered color distribution of cortex image A is transformed by PCM to match that of cortex image B. As a result, the material appearance of the greenish transparent wine glass B appears to be transferred to the gold mask image A.

Fig. 7. Improved LPTPCM color transfer model

Since the original images A and B are color-dissimilar, it is hard to achieve the color matching by the basic PCM alone. Yet, simply by placing LPT before PCM, the feeling of the greenish wine glass B is well conveyed to the gold mask A.

5 Experimental Results and Discussions

The performance of the proposed LPTPCM model is compared with the other methods described in Sect. 3. Figure 8 shows the results for the same images as in Fig. 7. The lαβ model fails for such color-dissimilar images: the source image colors remain almost unchanged. The eigenvalue and Cholesky decomposition methods reflect the greenish target colors a little, but the results look unnatural. In the basic PCM model, the black of the eyes and the green of the mask face appear to have been swapped unnaturally; mismatches in the directions of the PC axes may have occurred. In contrast, the LPTPCM model succeeded in transferring the color atmosphere of the wine glass to the gold mask.

Fig. 8. Performance of the LPTPCM model in comparison with other methods

Figure 9 shows another example, color transfer among three glass vases with different patterns. Again, lαβ hardly functioned, leaving the source colors almost unchanged. Although the eigenvalue and Cholesky decomposition methods showed some effect, partial color mixing occurred between the source B and the target A, as shown in the B-to-A color matching. PCM and LPTPCM look neck and neck, but on close inspection LPTPCM gives a slightly better impression than PCM because it conveys the clean textures of the target.

Fig. 9. Example of color transfer between three glass vases with different patterns

Figure 10 compares PCM and LPTPCM for handcraft pots. Both achieved the expected results, and it is hard to tell which is better. How to make a quantitative evaluation remains a future challenge.

Fig. 10. Nearly neck-and-neck results of PCM vs. LPTPCM for handcraft pots

On the other hand, Fig. 11 shows a result of color transfer between images with heterogeneous textures: (a) attempts to transfer the color atmosphere of the "greenish wine glass" to the "reddish Porsche", where only LPTPCM was successful.

Fig. 11. Result of color transfer between images with heterogeneous textures

Figure 12 compares PCM and LPTPCM when the target image B is changed to the gold mask or the handcraft pot. In the upper case, with the gold mask as target, LPTPCM clearly reflects the feeling of the target; in the lower case, with the green pot as target, it is hard to tell which is better, perhaps depending on personal preference.

Fig. 12. Comparison of PCM vs. LPTPCM when changing the target images

For the sake of simplicity, the basic PCM is applied assuming a single-cluster image. For a multi-clustered image, segmentation is needed to separate the colored objects into clusters before performing object-to-object PCM; however, it is hard to find corresponding pairs of objects, particularly in the case of dissimilar color images [12,13,14]. Hence the proposed model is not universal but limited to images that can be handled as a single cluster. The margin of the image background should also be noted. Figure 13 shows how the PCM results differ with the margin of the background, because the white margins influence the image color clusters. As is clearly seen, LPTPCM is insensitive to the margins and more robust than PCM. The reason is that LPT mimics the retina-to/from-cortex imaging known as foveation.

Fig. 13. Comparison of PCM vs. LPTPCM for different margins of the image background

6 Conclusions

This paper challenged applying scene color transfer methods to material appearance transfer. The proposed LPTPCM model is a joint LPT-PCM algorithm: prior to PCM (Principal Component Matching), the retinal images of source A and target B are transformed to striate cortex images by the LPT (Log-Polar Transform). The key is to exploit the concentration of color information around the central viewpoint of the striate cortex produced by LPT. The performance of the conventional PCM is significantly enhanced by this cooperation with LPT. The proposed model transfers the color atmosphere of the target image B to the source image A without any a priori information or optical measurement of the material properties. The remaining question is how to evaluate whether the transformed image is perceptually acceptable; the development of a quantitative quality measure is left as future work.