1 Introduction

Physically correct and visually convincing virtual models require object surfaces covered with realistic, nature-like material textures. The primary purpose of any synthetic texture approach is to reproduce and enlarge a given measured material texture so that, ideally, the natural and synthetic textures are visually indiscernible. The appearance of real materials changes dramatically with illumination and viewing variations, and their most advanced current representation is the seven-dimensional Bidirectional Texture Function (BTF) [9]. Unfortunately, measured texture data are nearly always too limited to reliably estimate such complex seven-dimensional models, so their modeling requires some simplifying factorization [9], such as the presented compound random field models, which serve as the three-dimensional factor model in the complex overall BTF material model [9].

Compound random field (CRF) models consist of several sub-models, each having different characteristics, along with an underlying structure model which controls transitions between these sub-models [11]. Compound Markov field models were successfully applied to image restoration [2, 3, 11, 12], segmentation [13], and modeling [6, 7, 10]. However, these models always require demanding numerical solutions with all their well-known drawbacks. Our exceptional CMRF model [6] allows analytical synthesis at the cost of a slightly compromised compression rate.

We propose two textural models, CRF\(^{BM-3CAR}\) and CRF\(^{GM-3CAR}\), based on complex spatial probabilistic mixture models. Their control field models are either probabilistic Bernoulli or Gaussian mixture models.

2 Compound Random Field Texture Models

Let us denote a multiindex \( r= (r_1,r_2),\, r\in I,\) where I is a discrete 2-dimensional rectangular lattice and \(r_1\) is the row and \(r_2\) the column index, respectively. \(X_r \in \lbrace 1,2,\ldots ,K \rbrace \) is a random variable with a natural number value (a positive integer), \(Y_r\) is a multispectral pixel at location r, and \(Y_{r,j} \in \mathcal{R}\) is its j-th spectral plane component. Both random fields (X, Y) are indexed on the same lattice I. Let us assume that each multispectral or BTF observed texture \(\tilde{Y}\) (composed of d spectral planes) can be modelled by a compound random field model, where the principal random field X controls switching to a regional local model \(Y = \bigcup _{i=1}^K\,{^iY}\). The K individual regional sub-models \(^iY\) are defined on their corresponding lattice subsets \(^iI, \ ^iI\cap \, ^jI= \emptyset \ \ \forall i\not = j\), and they are of the same RF type. They differ only in their contextual support sets \(^iI_r\) and corresponding parameter sets \(^i\theta \). The CRF model has the posterior probability \( P(X,Y \,|\, \tilde{Y}) = P(Y \,|\, X,\tilde{Y}) P(X\,|\, \tilde{Y}) \) and the corresponding optimal MAP solution is \( (\hat{X},\hat{Y}) = \arg \max _{X\in \varOmega _X, Y\in \varOmega _Y} P(Y \,|\, X,\tilde{Y})\, P(X\,|\, \tilde{Y}),\) where \(\varOmega _X, \varOmega _Y\) are the corresponding configuration spaces for the random fields (X, Y). To avoid an iterative MCMC MAP solution, we propose the following two-step approximation [6]:

$$\begin{aligned} \breve{X} = \arg \max _{X\in \varOmega _X} P(X\,|\, \tilde{Y}), \end{aligned}$$
(1)
$$\begin{aligned} \breve{Y} = \arg \max _{Y\in \varOmega _Y} P(Y \,|\, \breve{X},\tilde{Y}). \end{aligned}$$
(2)

This approximation significantly simplifies the CRF\(^{BM-3CAR}\) and CRF\(^{GM-3CAR}\) estimation because it allows us to take advantage of a straightforward analytical estimation of the regional RF models \({^iY}\) in (2).

2.1 Region Switching Model

The control RF \((P(X\,|\, \tilde{Y}))\) is assumed to be represented by a two-dimensional Bernoulli or Gaussian mixture model, respectively. The mixture distribution \(P( X_{\{r\}} )\) has the form:

$$\begin{aligned} P( X_{\{r\}} ) = \sum _{m\in \mathcal{M}} P(X_{\{r\}} \,|\,m) \, p(m) = \sum _{m\in \mathcal{M}} \prod _{s\in I_r } p_{s}( X_{s} \,|\, m) \, p(m) \end{aligned}$$
(3)

where \( X_{\{r\}} \in \mathcal{K}^{\eta }\), \(\mathcal{M} = \{1, 2, \dots , M\}\), \(I_r \subset I\) is an index set, \(\eta = cardinality\{I_r\}\), and p(m) are probability weights satisfying \( \sum _{m\in \mathcal{M}} p(m) =1\). The maximum-likelihood estimates of the parameters p(m) (probability weights), \(\mu _{m s}, \sigma _{m s}\) (Gaussian mixture component means and standard deviations), and \(\theta _{m,s}\) (Bernoulli mixture component parameters) are computed using the EM algorithm [1, 4]. Its iterations consist of the model-specific component update \(p^{(t+1)}_{s}(\cdot \,|\,m)\) given below and

$$\begin{aligned} q^{(t)}(m \,|\, X_{\{r\}}) = \frac{p^{(t)}(m)\, P^{(t)}(X_{\{r\}}\,|\,m)}{\sum _{j\in \mathcal {M}} p^{(t)}(j)\, P^{(t)}(X_{\{r\}}\,|\, j)}, \end{aligned}$$
(4)
$$\begin{aligned} p^{(t+1)}(m) = \frac{1}{|\mathcal {S}|} \sum _{X_{\{r\}} \in \mathcal {S}} q^{(t)}(m\,|\,X_{\{r\}}). \end{aligned}$$
(5)
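For concreteness, the shared steps (4) and (5) can be sketched in a few lines of NumPy. This is a minimal illustration only: it assumes the component log-likelihoods \(\log P(X_{\{r\}}\,|\,m)\) have already been evaluated via (6) or (9) below, and the helper name em_weights_step is ours, not from [1, 4].

```python
import numpy as np

def em_weights_step(log_comp, p):
    """One E-step (Eq. 4) and weight update (Eq. 5).

    log_comp[k, m] = log P(X_{r}^(k) | m) for training window k and
    component m; p[m] are the current weights. The same step serves
    both the Bernoulli and the Gaussian mixtures below."""
    logq = np.log(p)[None, :] + log_comp       # unnormalised log posterior
    logq -= logq.max(axis=1, keepdims=True)    # numerical stabilisation
    q = np.exp(logq)
    q /= q.sum(axis=1, keepdims=True)          # Eq. (4): responsibilities
    return q, q.mean(axis=0)                   # Eq. (5): new weights p(m)
```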

Bernoulli Distribution Mixture Model. We assume that the control field pixel \(X_r \in \mathcal {K}\), where \( \mathcal {K}\) is the index set of the K distinguished sub-models. The distribution \(P( X_{\{r\}} )\) is assumed to be a multivariable Bernoulli mixture (BM), and the control field is further decomposed into separate binary bit planes of variables \(\xi \in \mathcal {B}\), \(\mathcal {B} = \{0,1\}\), which are modeled separately and can be learned from a much smaller training texture than a multi-level discrete mixture model. We suppose that a bit factor of the control field can be fully characterised by the marginal probability distribution of binary levels on pixels within the scope of a window centered around location r and specified by the index set \(I_r \subset I\), i.e. \( X_{\{r\} } \in \mathcal {B}^{\eta }\) and \(P(X_{\{r\} } ) \) is the corresponding marginal distribution of \(P(X\,|\, \tilde{Y})\). The component distributions \(P(\cdot \,|\,m)\) are factorisable and multivariable Bernoulli:

$$\begin{aligned} P(X_{\{r\} } \,|\,m) = \prod _{s\in I_r} \theta _{m,s}^{X_s}(1-\theta _{m,s})^{1-X_s} \qquad X_{s} \in X_{\{r\}}. \end{aligned}$$
(6)

The mixture model parameters (6) comprise the component weights p(m) and the univariate discrete distributions of binary levels, each simply defined by a single parameter \(\theta _{m,s}\) as a vector of probabilities:

$$\begin{aligned} p_{s}(\cdot \,|\,m) = (\theta _{m,s}, 1-\theta _{m,s}). \end{aligned}$$
(7)

The EM solution is given by (4), (5) and

$$\begin{aligned} p^{(t+1)}_{s}(\xi \,|\,m) = \frac{1}{|\mathcal {S}|\,p^{(t+1)}(m)} \sum _{X_{\{r\}} \in \mathcal {S}} \delta (\xi ,X_{s})\, q^{(t)}(m\,|\, X_{\{r\}} ), \ \ \ \xi \in \mathcal {B}. \end{aligned}$$
(8)

The total number of parameters of the mixture (3), (7) is thus \(M (1+\eta )\), confined to the appropriate norming conditions. The advantage of the multivariable Bernoulli model (7) is a simple switch-over to any marginal distribution by deleting superfluous terms in the products \(P( X_{\{r\}} \,|\,m)\).
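A compact EM sketch for this Bernoulli mixture, implementing (4), (5) and (8) and reusing em_weights_step from the sketch above; the random initialisation and the fixed iteration count are illustrative choices, not prescribed by the model.

```python
import numpy as np

def fit_bernoulli_mixture(S, M, iters=50, seed=0):
    """EM for the multivariable Bernoulli mixture, Eqs. (4)-(8).

    S: |S| x eta binary matrix of vectorised windows X_{r};
    returns the weights p(m) and the parameters theta[m, s]."""
    rng = np.random.default_rng(seed)
    N, eta = S.shape
    p = np.full(M, 1.0 / M)                    # component weights p(m)
    theta = rng.uniform(0.25, 0.75, (M, eta))  # Bernoulli parameters

    for _ in range(iters):
        # log P(X_{r} | m) of Eq. (6) for every training window
        log_comp = S @ np.log(theta).T + (1 - S) @ np.log(1 - theta).T
        q, p = em_weights_step(log_comp, p)    # Eqs. (4) and (5)
        # Eq. (8): responsibility-weighted frequency of X_s = 1
        theta = (q.T @ S) / (N * p)[:, None]
        theta = theta.clip(1e-6, 1 - 1e-6)     # keep the logarithms finite
    return p, theta
```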

Gaussian Mixture Model. The discrete control field can alternatively be modeled by a continuous RF if we map the discrete indices into continuous random variables with uniformly separated mean values and small variance. The synthesis results are subsequently mapped back into the corresponding synthetic discrete control field. We assume the joint probability distribution \(P(X_{\{r\}})\), \(X_{\{r\}} \in \mathcal {K}^{\eta }\), in the form of a normal mixture, with the mixture components defined as products of univariate Gaussian densities

$$\begin{aligned} P(X_{\{r\}} \,|\, \mu _m, \sigma _m) &= \prod _{s\in I_{r}} p_s (X_s \,|\, \mu _{m s},\sigma _{m s}), \\ p_s(X_s \,|\, \mu _{m s},\sigma _{m s}) &= \frac{1}{\sqrt{2 \pi }\,\sigma _{m s}} \exp \left\{ - \frac{(X_s-\mu _{m s})^2}{2 \sigma _{m s}^2} \right\} , \end{aligned}$$
(9)

i.e., the components are multivariate Gaussian densities with diagonal covariance matrices. The maximum-likelihood estimates of the parameters \(p(m), \mu _{m s}, \sigma _{m s}\) can be computed by the EM algorithm [1, 4]. As before, we use a data set \(\mathcal {S}\) obtained by pixel-wise shifting the observation window within the original texture image, \( \mathcal {S}= \{X_{\{r\}}^{(1)},\dots ,X_{\{r\}}^{(K)}\}, \ \ X_{\{r\}}^{(k)}\subset X \). The corresponding log-likelihood function is maximized by the EM algorithm \((m\in \mathcal {M}, n\in \mathcal {N}, X_{\{r\}}\in \mathcal {S})\), and the iterations are (4), (5) and

$$\begin{aligned} \mu ^{(t+1)}_{m,n} = \frac{\sum _{X_{\{r\}} \in \mathcal {S}} X_n\, q^{(t)}(m \,|\, X_{\{r\}})}{\sum _{X_{\{r\}} \in \mathcal {S}} q^{(t)}(m \,|\, X_{\{r\}})}, \end{aligned}$$
(10)
$$\begin{aligned} (\sigma ^{(t+1)}_{m,n})^2 = \frac{\sum _{X_{\{r\}} \in \mathcal {S}} X_n^2 \, q^{(t)}(m \,|\, X_{\{r\}})}{\sum _{X_{\{r\}} \in \mathcal {S}} q^{(t)}(m \,|\, X_{\{r\}})} - (\mu ^{(t+1)}_{m,n})^2. \end{aligned}$$
(11)
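The Gaussian counterpart differs from the Bernoulli sketch only in the component densities (9) and the updates (10), (11); again a sketch under the same assumptions, reusing em_weights_step.

```python
import numpy as np

def fit_gaussian_mixture(S, M, iters=50, seed=0):
    """EM for the Gaussian mixture, Eqs. (4), (5), (9)-(11).

    S: |S| x eta real matrix of vectorised windows X_{r}."""
    rng = np.random.default_rng(seed)
    N, eta = S.shape
    p = np.full(M, 1.0 / M)
    mu = S[rng.choice(N, size=M, replace=False)].astype(float)  # means
    sigma = np.full((M, eta), S.std() + 1e-6)                   # std devs

    for _ in range(iters):
        # log of the product of univariate densities in Eq. (9)
        z = (S[:, None, :] - mu[None, :, :]) / sigma[None, :, :]
        log_comp = (-0.5 * z ** 2 - np.log(sigma)[None, :, :]).sum(-1) \
                   - 0.5 * eta * np.log(2.0 * np.pi)
        q, p = em_weights_step(log_comp, p)   # Eqs. (4) and (5)
        w = q / q.sum(axis=0)                 # per-component normalisation
        mu = w.T @ S                          # Eq. (10)
        var = w.T @ S ** 2 - mu ** 2          # Eq. (11)
        sigma = np.sqrt(var.clip(1e-12))
    return p, mu, sigma
```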

Control Field Synthesis. At a given position r of the contextual neighbourhood \(I_r\), we can assume that some part of the pixel-wise synthesised control field \(X_{\{r\}}\) has already been specified. If \(X_{\{ \rho \}} \) is the sub-vector of all \(X_{\{r\}}\) pixels previously specified within this window and \(I_{\rho } \subset I_r\) is the corresponding index subset, then the statistical properties of the remaining unspecified variables are fully described by the corresponding conditional distribution:

$$\begin{aligned} p_{n\,|\, \rho }(X_{n}\,|\, X_{\{ \rho \}}) = \sum _{m=1}^{M} W_{m}(X_{\{ \rho \}})\, p_{n}(X_{n}\,|\, m), \end{aligned}$$
(12)

where \(W_{m}(X_{\{\rho \}})\) are the a posteriori component weights corresponding to the given sub-vector \(X_{\{\rho \}}\):

$$\begin{aligned} W_{m}(X_{\{\rho \}} ) &= \frac{p(m)\, P_{\rho }(X_{\{\rho \}} \,|\,m)}{\sum _{j=1}^{M} p(j)\, P_{\rho }(X_{\{\rho \}} \,|\,j)}, \\ P_{\rho }(X_{\{\rho \}}\,|\,m) &= \prod _{n\in \rho } p_{n}(X_{n}\,|\, m). \end{aligned}$$
(13)

\(X_n\) can be randomly generated from the conditional distribution \( p_{n \,|\, \rho }(X_{n}\,|\, X_{\{\rho \}} )\), whereby Eq. (12) is applied sequentially to each of the \(\eta - \text{ card }\{ \rho \}\) unspecified variables at a fixed position of the control field. Each newly generated \(X_n\) is then used to upgrade the conditional weights \(W_{m}(X_{\{\rho \}} )\).
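A sketch of this sequential sampler for the discrete case; comp_probs is a hypothetical \(M \times \eta \times |\mathcal{K}|\) table holding the univariate component distributions \(p_s(\cdot \,|\, m)\) (e.g. the two-element distributions (7) for a bit plane), and the weight update follows (12), (13).

```python
import numpy as np

def sample_window(p, comp_probs, x, known, rng=None):
    """Sequentially generate the unspecified pixels of one window X_{r}
    from the conditional mixture (12), upgrading the a posteriori
    weights (13) after each newly generated value.

    comp_probs[m, s, v]: p_s(v | m); x: length-eta vector holding the
    already specified pixels; known: boolean mask of specified pixels."""
    rng = rng or np.random.default_rng()
    logW = np.log(p)
    for s in np.flatnonzero(known):             # P_rho(X_rho | m), Eq. (13)
        logW = logW + np.log(comp_probs[:, s, x[s]])
    for s in np.flatnonzero(~known):
        W = np.exp(logW - logW.max())
        W /= W.sum()                            # weights W_m of Eq. (13)
        probs = W @ comp_probs[:, s, :]         # mixture p_{n|rho}, Eq. (12)
        probs /= probs.sum()
        x[s] = rng.choice(comp_probs.shape[2], p=probs)
        logW = logW + np.log(comp_probs[:, s, x[s]])  # upgrade the weights
    return x
```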

2.2 Local Markov Models

The i-th local texture region (not necessarily contiguous) is represented by the adaptive 3D causal auto-regressive (3DCAR) random field model [5, 8]. This model can be analytically estimated as well as easily synthesised. The model is defined by the following matrix equation (the i-th model index is further omitted to simplify notation):

$$\begin{aligned} Y_{r} = \gamma \, Z_{r} + \epsilon _{r}, \end{aligned}$$
(14)

where \( Z_{r}=[Y_{r-s}^T: \forall s\in \, I_{r} ]^T \) is the \( \eta d \times 1\) data vector with multiindices \(r, s, t\), and \(\gamma = [A_1,\ldots ,\, A_{\eta }]\) is the \(d \times d\eta \) unknown parameter matrix with parametric sub-matrices \(A_s\). The model's functional contextual neighbour index shift set is denoted \( I_{r} \) and \(\eta = cardinality( I_{r })\). All CAR model statistics can be efficiently estimated analytically [8]. Given the known 3DCAR process history \( Y^{(t-1)} = \lbrace Y_{t-1},Y_{t-2},\ldots ,Y_1, Z_t,Z_{t-1},\ldots , Z_1\rbrace \), the parameter estimate \(\hat{\gamma }\) can be accomplished using fast, numerically robust and recursive statistics [8]:

$$\begin{aligned} \hat{\gamma }_{t-1}^T = V_{zz(t-1)}^{-1}\, V_{zy(t-1)}, \end{aligned}$$

$$\begin{aligned} V_{t-1} = \left( \begin{array}{cc} \sum _{u=1}^{t-1} Y_u Y_u^T & \sum _{u=1}^{t-1} Y_u Z_u^T \\ \sum _{u=1}^{t-1} Z_u Y_u^T & \sum _{u=1}^{t-1} Z_u Z_u^T \end{array} \right) + V_0 = \left( \begin{array}{cc} V_{yy(t-1)} & V_{zy(t-1)}^T \\ V_{zy(t-1)} & V_{zz(t-1)} \end{array} \right) , \end{aligned}$$

Fig. 1. Cloth and frosted planks (left column) synthesis and enlargement (right column) using the \(CRF^{BM-3CAR}\) model.

Fig. 2. Three measured textile and two cobra skin textures (left column) and their synthesis using the \(CRF^{GM-3CAR}\) model.

where \(V_0\) is a positive definite matrix (see [8]). Although an optimal causal functional contextual neighbourhood \( I_{r} \) can be found analytically by a straightforward generalisation of the Bayesian estimate in [8], we use a faster approximation which does not need to evaluate statistics for all possible \( I_{r} \) configurations. This approximation is based on spatial correlations: starting from the causal part of a hierarchical non-causal neighbourhood, the neighbour locations whose spatial correlations exceed a specified threshold \(({>}0.6)\) are selected. The i-th model pixel-wise synthesis is a simple direct application of (14) for all 3DCAR models. 3DCAR models provide better spectral modelling quality than the alternative spectrally decorrelated 2D models for motley textures, at the cost of a small increase in the number of parameters to be stored.
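A minimal sketch of the analytical estimate and of the pixel-wise synthesis (14). The small user-chosen causal neighbourhood, the white-noise driving term with a per-plane standard deviation, and the ridge term eps standing in for the prior matrix \(V_0\) are our simplifying assumptions, not the full treatment of [8].

```python
import numpy as np

def fit_3dcar(Y, shifts, eps=1e-3):
    """Least-squares estimate of gamma in Eq. (14) for one region.

    Y: H x W x d texture; shifts: causal neighbour offsets I_r,
    e.g. [(0, 1), (1, 0), (1, 1)]."""
    H, W, d = Y.shape
    t = max(s[0] for s in shifts)
    l = max(abs(s[1]) for s in shifts)
    Z, T = [], []
    for r1 in range(t, H):
        for r2 in range(l, W - l):
            Z.append(np.concatenate([Y[r1 - s1, r2 - s2] for s1, s2 in shifts]))
            T.append(Y[r1, r2])
    Z, T = np.asarray(Z), np.asarray(T)
    # gamma^T = (Z^T Z + eps I)^{-1} Z^T Y, cf. the statistics above
    gamma_T = np.linalg.solve(Z.T @ Z + eps * np.eye(Z.shape[1]), Z.T @ T)
    return gamma_T.T, (T - Z @ gamma_T).std(axis=0)  # gamma, noise std

def synthesize_3dcar(gamma, sigma, shifts, H, W, rng=None):
    """Pixel-wise direct application of Eq. (14) in the causal order."""
    rng = rng or np.random.default_rng()
    d = gamma.shape[0]
    Y = np.zeros((H, W, d))
    t = max(s[0] for s in shifts)
    l = max(abs(s[1]) for s in shifts)
    for r1 in range(t, H):
        for r2 in range(l, W - l):
            Z = np.concatenate([Y[r1 - s1, r2 - s2] for s1, s2 in shifts])
            Y[r1, r2] = gamma @ Z + rng.normal(0.0, sigma)
    return Y
```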

3 Experiments

Both presented compound random field models (\(CRF^{BM-3CAR}\), \(CRF^{GM-3CAR}\)) are well suited for near-regular textures such as textile materials, which are notoriously difficult for Markov random field type textural models [6, 9]. The dimension of the estimated control field model distribution is not too high (\(\eta \approx 10^1 - 10^2\)) and the number of training data vectors is relatively large (\(|\mathcal {S}| \approx 10^4 - 10^5\)). Nevertheless, the window should always be kept reasonably small and the sample size as large as possible. In our experiments we have used a regular left-to-right and top-to-bottom shifting of the generating window. Figure 1 illustrates the \(CRF^{BM-3CAR}\) model applied to the synthesis and enlargement of frosted planks and textile textures, while Fig. 2 shows three textile materials and two skin samples synthesized using the \(CRF^{GM-3CAR}\) model.
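The training set \(\mathcal {S}\) used throughout these experiments can be built directly with NumPy's sliding-window view; win denotes the window side, so \(\eta = \texttt{win}^2\).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def training_set(X, win):
    """Data set S: all win x win windows obtained by a regular
    left-to-right, top-to-bottom shift of the observation window
    over the field X, vectorised into eta-dimensional samples."""
    return sliding_window_view(X, (win, win)).reshape(-1, win * win)
```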

4 Conclusion

Both presented CRF methods \((CRF^{GM-3CAR}, CRF^{BM-3CAR})\) show good visual performance on selected real-world measured materials. They are best suited to materials whose appearance consists of several types of relatively small regions with a fine-granular inner structure, such as fabric, skin, or wood. The models offer a large data compression ratio (only tens of parameters per BTF), easy simulation, and fast seamless synthesis of any required texture size. The methods can easily be generalised for colour or BTF texture editing by estimating some local models from different target materials. The models do not compromise spectral correlation and can thus reliably model motley textures. A drawback of the presented CRF models is the need for sufficiently large learning data for both mixture sub-models.