1 Introduction

At the beginning of the 20th century, Munsell [1] established a system for specifying colors more precisely and showing the relationships among them. The Munsell color order system is based on the color-perception attributes of hue, value and chroma. Munsell defined numerical scales with visually uniform steps for each of these attributes. Hue is the attribute of a color by which we distinguish blue from red, yellow from green, and so on. Hues are naturally ordered on this scale: red (R), yellow-red (YR), yellow (Y), green-yellow (GY), green (G), blue-green (BG), blue (B), purple-blue (PB), purple (P) and red-purple (RP). Black, white and the grays between them are called neutral colors (N). Value indicates the lightness of a color on a scale that ranges from 0 (pure black) to 10 (pure white). Chroma is the degree of departure of a color from the neutral color of the same value. The chroma scale starts from 0, for neutral colors, but there is no arbitrary end to the scale, as new pigments gradually become available. However, limits for representable chroma values have been defined by the so-called MacAdam limits [2]. Specifying color by the Munsell system is a practice limited to opaque objects, such as soils or painted surfaces. This practice provides a simple visual method as an alternative to the more complex and precise method based on the CIE system and on spectrophotometry. For this reason, the Munsell system is adopted in contexts in which the recording or identification of colors of specimens (e.g., flowers, minerals, soils) is required [3]. The Munsell charts are appropriate for almost all jobs requiring color specification by visual means, as supported by neurobiological research showing that the system has successfully standardized color by matching the reflectance spectra of Munsell color chips with the sensitivity of the cells in the lateral geniculate nucleus (LGN cells), which are responsible for color specification [4].
In archaeology, Munsell charts are widely used as the standard for color specification of organic materials, colored glass, soil profiles, rock materials, textiles, metals, paintings and, principally, pottery. Archaeologists are used to employing Munsell Soil Charts directly on a cultural heritage or excavation site to identify the colors of the soils and of the artifacts retrieved. Indeed, color specification is very useful in the examination, classification and genesis analysis of soils [5,6,7]. As regards the interpretation of pottery, the precise color specification of parts such as treated surfaces, clay body, core, and outer layers like painting and slip is fundamental for defining its stylistic and technical features. Color specification might be exploited to bind the artifacts to a specific culture, society or civilization, or even to a certain period of time [8].

As previously mentioned, the standard practice of Munsell estimation with the Soil Charts is by visual means. First, the two adjacent constant-hue charts or chips between which the hue of the specimen lies have to be chosen. Then, by moving the masks from chip to chip to find the one most similar to the specimen, one can estimate its value, chroma and hue [3]. This procedure is error-prone, time-consuming and very subjective. In order to obtain a more accurate estimation, the process described above should be repeated more than once and possibly also by other users, since colors might not be perceived uniformly by different people [9]. Hence, an objective and automatic Munsell estimation method would be a valuable improvement to the field of archaeology.

Digital cameras have been used before to acquire pictures of soil specimens in a laboratory with controlled lighting conditions. The Munsell notation has then been exploited to estimate the mineral and organic composition of the specimens [10,11,12,13,14,15]. However, all of these works still require a strictly controlled environment for the digital acquisition of the images. Suggested controls for a perfect estimation are related to artificial and natural lighting conditions, specimen and camera positions, angle of view, and setting of the working plane and background with proper opaque and black materials to avoid light reflection [3]. Preparing a perfectly controlled environment is difficult, time-consuming and potentially expensive. With the spread of smartphones with ever more sensors onboard, particularly high-resolution cameras, new methods exploiting the Munsell system have been developed. In [16] a mobile phone application for Munsell estimation under strictly controlled illumination conditions is presented. In [17] a similar setting is discussed, but focused on a Complementary Metal Oxide Semiconductor (CMOS) sensor assembled on a smartphone. In these cases too, a controlled environment is required. In the past we have performed several experiments on this topic in such an environment [12, 14, 15, 18, 19]. However, to the best of our knowledge, a method using an uncontrolled setting for image acquisition is still missing. In this work, we present ARCA: Automatic Recognition of Color for Archaeology, a desktop application for Munsell estimation. ARCA is the core of the pipeline of our proposed method, which consists of the image acquisition of specimens, manual sampling of the image in a way that is user-friendly for archaeologists, Munsell estimation of the sampled points and creation of a sampling report (Fig. 1).
We focused on the need of archaeologists to have a practical and tested application that might help them in the color specification task during an excavation. Through the proposed pipeline, archaeologists do not need expensive tools (e.g., spectrophotometer, Munsell Soil Charts, color checker) or a laboratory with a controlled environment for the acquisition in order to perform color estimation. They just need to take a picture of the specimen, and no strict constraints need to be applied in advance. Then, from the ARCA application, they are able to select multiple samples at once and the system will estimate the Munsell notation for them in an objective and deterministic way.

Fig. 1. Pipeline of the proposed ARCA application.

A dataset of 108 images, called for this reason ARCA108 and comprising a total of 22,848 samples, has been gathered in order to evaluate which configuration is best for image acquisition in an uncontrolled environment. This dataset represents a new valuable asset for color specification research purposes. The Munsell system is usually exploited to establish and evaluate the color and gloss tolerance of specimens [20, 21]. We compared all the samples with Munsell reference values exploiting the CIEDE2000 (\(\varDelta E_{00}\)) color difference definition [20]. Several accuracy problems have been reported for the color specification task [22], so to be comparable with other Munsell estimation methods we will consider mean values and standard deviations from the evaluation phase.

The rest of the paper is structured as follows: in Sect. 2 the acquisition phase, validation phase and ARCA desktop application will be described. The experimental results are given in Sect. 3 and then final remarks and considerations conclude the paper in Sect. 4.

2 Material and Methods

Two main phases can be distinguished in our experiments: acquisition and validation. In the former, we wanted to simulate the most common situation of Munsell field estimation as closely as possible, while in the latter the main aim was to validate the proposed system, in order to prove its reliability in the Munsell estimation process. In the following subsections, the acquisition phase, ARCA desktop application and validation phase are detailed. The order in which they are presented is coherent with the proposed pipeline: acquire, sample and estimate.

2.1 Acquisition Phase

No strict constraints have been added in the acquisition phase, in order to allow an easy replicability of the process shown in this work. Two kinds of devices have been employed in our experiments: a professional reflex camera and a common smartphone. The reflex model was a Canon EOS 1200D (mounting an EF-S 18–55 mm zoom lens) with a resolution of 18 megapixels, while the smartphone model was a Nexus 5X with a main camera resolution of 12.2 megapixels. The subjects of the pictures were the following Munsell Soil Color Charts (Year 2000 Revised Washable Edition): GLEY1, GLEY2, 10R, 2.5YR, 5YR, 7.5YR, 10YR, 2.5Y, 5Y. A Gretag-Macbeth color checker has also been employed, in order to evaluate the gain of having reference colors during photo acquisition.

Our acquisition was set in Tampa, Florida (US), at GPS coordinates 28\(^\circ \)03’47.9”N 82\(^\circ \)24’40.9”W, on March 8, which was a mostly sunny day with some cloud cover (Fig. 2(a)). It was performed from 10:30 am to 12:30 pm with an unguided approach (Fig. 2(b)), that is, without any fixed positions or angles of view for the camera or subjects. We acquired the 9 charts of the Munsell Soil Color Charts, with the following possible settings:

  • 2 kinds of devices: professional DSLR (Digital Single-Lens Reflex) camera and common smartphone;

  • 3 white balancing algorithms (executed by the devices in the image capture phase): automatic, sunny (corresponding to standard illuminant D65: \({\sim }6{,}500\) K) and cloudy (corresponding to standard illuminant D75: \({\sim }7{,}500\) K);

  • 1 fluorescence presetting: direct sunlight;

  • 1 ISO setting: 400 ISO;

  • 1 focus setting: autofocus;

  • 2 kinds of subject: the chart itself and the chart with a Gretag-Macbeth color checker nearby.

Fig. 2. (a) The day on which the acquisition phase was performed was a sunny day with minor cloud cover. (b) Photos were taken with an unguided approach.

In this way, we obtained a total of 12 configurations (2 devices \(\times \) 3 white balancing algorithms \(\times \) 2 kinds of subject) for each Munsell chart, for a total of 108 images over the 9 charts. The resolution of the images is \(5184 \times 3456\) pixels and \(3840 \times 2160\) pixels for pictures taken by the DSLR camera and the smartphone, respectively. All the images were saved in the standard JPG format at the highest quality setting available.
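
The counts above follow directly from the Cartesian product of the varied settings. The short sketch below only illustrates that enumeration; the label strings are borrowed from the configuration naming used in Sect. 3, and the tuple layout is an assumption made for the example, not the file naming of the released dataset.

```python
from itertools import product

# Settings that vary across acquisitions (fluorescence preset, ISO and
# focus are fixed, so they do not multiply the number of configurations).
DEVICES = ["Reflex", "Smartphone"]
WHITE_BALANCING = ["Auto", "Sunny", "Cloudy"]
SUBJECTS = ["Solo_Chart", "With_Macbeth"]
CHARTS = ["GLEY1", "GLEY2", "10R", "2.5YR", "5YR",
          "7.5YR", "10YR", "2.5Y", "5Y"]

# One configuration = one (device, white balancing, subject) triple.
configurations = list(product(DEVICES, WHITE_BALANCING, SUBJECTS))

# One image per (chart, configuration) pair; hypothetical tuple layout.
images = [(chart, *config) for chart in CHARTS for config in configurations]

print(len(configurations), len(images))
```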

The gathered dataset has been publicly released with the name ARCA108 and it is freely available at http://iplab.dmi.unict.it/ARCA108/.

2.2 ARCA Desktop Application

The current version of the ARCA desktop application has been developed in the Matlab Graphical User Interface Design Environment (GUIDE). From the GUI the user is able to perform several actions: open an image, zoom in/out to focus on a detail of an image, sample the image (by picking a point or drawing a freehand region), remove the current sample, estimate the Munsell notation of the sample and save the report. The ARCA application GUI is shown in Fig. 3. Multiple samples can be selected through the pick-point tool; their Munsell notations are then estimated all at once when the estimation is launched. After every estimation, indexed markers are added on the image, so the user can track all the samples with their own Munsell estimation. Samples are also highlighted with a red border (this is particularly useful when the draw-a-region tool is used). Munsell conversion and \(\varDelta E_{00}\) computations for the validation phase are performed with the publicly available Matlab toolbox by Centore [23, 24], which has been shown to be comparable with other, non-open-source conversion methods [25,26,27,28]. Finally, when the report of the estimation is created, the user must provide a name for the report and a directory will be created with that name. The report is made up of three elements: the starting image with the indexed markers on it, a Matlab file and a textual report containing the list of Munsell estimations.

Fig. 3. Screenshot of the ARCA desktop application GUI. Three samples have been taken on the current image; marks are visible on the image so the user can visually track the estimated Munsell values. Now the user can keep sampling the image (adding new Munsell estimations) or save a report of the current estimation.

2.3 Validation Phase

We evaluated the system by comparing the expected Munsell value of each chip in the Munsell charts with its observed one. We performed the sampling from the charts by importing the images in our ARCA desktop application and manually picking points which were visually near to the centroid of each chip. We considered a patch of \(49 \times 49\) pixels around the picked centroid, for a total of 2,401 pixels per chip. As done in [16], the Munsell charts labeled as GLEY1 and GLEY2 have not been evaluated, since they contain neutral colors very similar to one another and with very low chroma values. We sampled 238 chips (for each one of the 12 configurations), and for each sampled chip we computed the mean, median and mode of the extracted patch. So, by also taking into account the RGB value in the centroid, we obtained 4 RGB values for each sampled chip. Using the Munsell toolbox by Centore [24] the sampled RGB values have been converted to the Munsell color space. We have also considered a discretized version of the converted values, computed by rounding them to the closest Munsell reference values in the Munsell charts. In this way, we obtained a total of 22,848 observed Munsell values (238 chips \(\times \) 12 configurations \(\times \) 4 statistics \(\times \) 2 versions, continuous and discretized) to be compared with the 238 expected ones.
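
The patch sampling described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the Matlab code of ARCA: the function name `patch_statistics` is hypothetical, the picked point is assumed to lie at least 24 pixels from the image border, and the mode is taken here as the most frequent RGB triplet (a per-channel mode would be an equally plausible reading of the text).

```python
import numpy as np


def patch_statistics(image, row, col, half=24):
    """Centroid, mean, median and mode RGB of a (2*half+1)^2 patch.

    `image` is an H x W x 3 array; (row, col) is the picked point, assumed
    to be at least `half` pixels away from every border.
    """
    patch = image[row - half:row + half + 1, col - half:col + half + 1]
    pixels = patch.reshape(-1, 3)
    # Mode as the most frequent RGB triplet in the patch (assumption).
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    return {
        "centroid": image[row, col].astype(float),
        "mean": pixels.mean(axis=0),
        "median": np.median(pixels, axis=0),
        "mode": colors[counts.argmax()].astype(float),
        "n_pixels": pixels.shape[0],
    }


# Synthetic image standing in for a photographed chart.
rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(200, 300, 3), dtype=np.uint8)
stats = patch_statistics(demo, row=100, col=150)
print(stats["n_pixels"])      # pixels in a 49 x 49 patch
print(238 * 12 * 4 * 2)       # observed values in the whole evaluation
```

Each of the four RGB values would then be passed to an RGB-to-Munsell conversion, which is outside the scope of this sketch.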

3 Results

In the experimental setting, 12 possible configurations were defined (Sect. 2.1). We recall that a “configuration” is one of the possible combinations of the following settings: Device:[Reflex/Smartphone] + WhiteBalancing:[Auto/Sunny/Cloudy] + Subject:[Solo_Chart/With_Macbeth]. Moreover, for each sampled chip, 4 order statistics were investigated: mean, median, mode and centroid value have been exploited in the Munsell computation (Sect. 2.3). Since Munsell references are a discrete set of values, it is also possible to apply a discretization to the continuous Munsell values obtained after the conversion, so the order statistics to be taken into account become 8. Hence, several questions can be raised, and will be answered in the following subsections:

  1. What is the best configuration, among the 12 defined?

  2. What is the best order statistic, among the 8 defined?

  3. How worthwhile is the application of the discretization?

  4. Is the error in the Munsell notation estimation acceptable?

3.1 Best Configuration

For each one of the 12 possible configurations, 7 Munsell charts were evaluated. The average value of the \(\varDelta E_{00}\) between the Munsell reference chips and the 8 order statistics from every chip in the acquired charts has been computed. Results are shown in Fig. 4(a). From this chart it is possible to assess that the best configuration is [Reflex, Auto White Balancing, Solo Chart]. Among the configurations that exploit the smartphone as device, instead, the best one is [Smartphone, Sunny White Balancing, With Macbeth]. It is interesting, and almost surprising, to notice how the use of a color checker together with a professional reflex camera increases the \(\varDelta E_{00}\) distance, while together with a general-purpose smartphone it has a positive influence, decreasing the distance. Hence, in our best configuration no expensive color checker is needed.
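
For readers who want to reproduce this kind of comparison, the sketch below implements the standard CIEDE2000 formula on CIELAB triplets and averages it over reference/estimate pairs. It is an illustrative Python version, not the Centore Matlab toolbox used in this work; the helper `mean_delta_e` and the assumption that Munsell values have already been converted to CIELAB are ours.

```python
from math import atan2, cos, degrees, exp, hypot, radians, sin, sqrt


def ciede2000(lab1, lab2, k_l=1.0, k_c=1.0, k_h=1.0):
    """CIEDE2000 difference between two CIELAB colors (L*, a*, b*)."""
    l1, a1, b1 = lab1
    l2, a2, b2 = lab2

    # Step 1: rescale a* and compute primed chroma and hue (degrees).
    c_bar = (hypot(a1, b1) + hypot(a2, b2)) / 2.0
    g = 0.5 * (1.0 - sqrt(c_bar**7 / (c_bar**7 + 25.0**7)))
    a1p, a2p = (1.0 + g) * a1, (1.0 + g) * a2
    c1p, c2p = hypot(a1p, b1), hypot(a2p, b2)
    h1p = degrees(atan2(b1, a1p)) % 360.0 if c1p else 0.0
    h2p = degrees(atan2(b2, a2p)) % 360.0 if c2p else 0.0

    # Step 2: lightness, chroma and hue differences.
    d_l = l2 - l1
    d_c = c2p - c1p
    if c1p * c2p == 0.0:
        d_h_deg = 0.0
    else:
        d_h_deg = h2p - h1p
        if d_h_deg > 180.0:
            d_h_deg -= 360.0
        elif d_h_deg < -180.0:
            d_h_deg += 360.0
    d_h = 2.0 * sqrt(c1p * c2p) * sin(radians(d_h_deg / 2.0))

    # Step 3: means, weighting functions and rotation term.
    l_bar = (l1 + l2) / 2.0
    c_bar_p = (c1p + c2p) / 2.0
    if c1p * c2p == 0.0:
        h_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180.0:
        h_bar = (h1p + h2p) / 2.0
    elif h1p + h2p < 360.0:
        h_bar = (h1p + h2p + 360.0) / 2.0
    else:
        h_bar = (h1p + h2p - 360.0) / 2.0
    t = (1.0 - 0.17 * cos(radians(h_bar - 30.0))
         + 0.24 * cos(radians(2.0 * h_bar))
         + 0.32 * cos(radians(3.0 * h_bar + 6.0))
         - 0.20 * cos(radians(4.0 * h_bar - 63.0)))
    s_l = 1.0 + 0.015 * (l_bar - 50.0)**2 / sqrt(20.0 + (l_bar - 50.0)**2)
    s_c = 1.0 + 0.045 * c_bar_p
    s_h = 1.0 + 0.015 * c_bar_p * t
    d_theta = 30.0 * exp(-(((h_bar - 275.0) / 25.0)**2))
    r_c = 2.0 * sqrt(c_bar_p**7 / (c_bar_p**7 + 25.0**7))
    r_t = -sin(radians(2.0 * d_theta)) * r_c

    term_l = d_l / (k_l * s_l)
    term_c = d_c / (k_c * s_c)
    term_h = d_h / (k_h * s_h)
    return sqrt(term_l**2 + term_c**2 + term_h**2 + r_t * term_c * term_h)


def mean_delta_e(pairs):
    """Average CIEDE2000 over (reference_lab, estimated_lab) pairs."""
    values = [ciede2000(ref, est) for ref, est in pairs]
    return sum(values) / len(values)


# A pair from the widely used published CIEDE2000 test data.
print(round(ciede2000((50.0, 2.6772, -79.7751), (50.0, 0.0, -82.7485)), 4))
```

Grouping the pairs by configuration before calling `mean_delta_e` would give one average per configuration, in the spirit of Fig. 4(a).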

3.2 Best Order Statistic

The average value of the \(\varDelta E_{00}\) between the Munsell reference chips and the 8 order statistics from every chip in the whole dataset has been computed. Results are shown in Fig. 4(b). The order statistics yield very similar values, both with and without quantization; for the 2.5Y and 5Y Munsell charts the values with and without quantization are almost the same. The mean slightly outperforms the other order statistics. Additional evidence coming from this chart is that quantization decreases the \(\varDelta E_{00}\) distance in almost all cases. This observation leads directly to the next question.

Fig. 4. Validation plots. (a) Investigation of the best configuration among the 12 tested. (b) Investigation of the best order statistic to be used on the patch during the Munsell estimation. Note how the discretization decreases the \(\varDelta E_{00}\) in almost all cases, as expected.

3.3 Discretization

Munsell Soil Charts contain a discrete set of reference values, but conversion from RGB to the Munsell system generates values in a continuous space. Archaeologists are used to employing only the discrete values, not the continuous ones, so a discretization is needed. Since we consider all the values near a reference one as the same, the discretization to the closest Munsell reference value affects the \(\varDelta E_{00}\) computation. We counted how many times discretization actually decreased or increased the initial \(\varDelta E_{00}\). In \(59.42\%\) of the cases a positive gain (a lower \(\varDelta E_{00}\)) was obtained. Moreover, the negative gain is usually obtained with low chroma values, which are the most ambiguous to classify. From the result shown in Fig. 4(b) and this further cue it is possible to assess that it is worthwhile to apply discretization, as expected.
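
A rough sketch of the two operations discussed here, snapping a continuous Munsell estimate to a reference grid and counting how often the snap lowers \(\varDelta E_{00}\), is given below. Several things are assumptions made only for illustration: hue is encoded as a number on the 0 to 100 Munsell hue circle (10R = 10, 5YR = 15, and so on), each attribute is snapped independently, and the value and chroma grids are placeholders rather than the exact chip inventory of each chart.

```python
import numpy as np

# Illustrative reference grids (assumptions, not the exact chip inventory).
HUE_NAMES = ["10R", "2.5YR", "5YR", "7.5YR", "10YR", "2.5Y", "5Y"]
HUE_STEPS = np.array([10.0, 12.5, 15.0, 17.5, 20.0, 22.5, 25.0])
VALUE_STEPS = np.array([2.5, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
CHROMA_STEPS = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 8.0])


def nearest_index(x, grid):
    """Index of the grid entry closest to x."""
    return int(np.abs(grid - x).argmin())


def discretize(hue, value, chroma):
    """Snap a continuous (hue, value, chroma) triple to the reference grid."""
    name = HUE_NAMES[nearest_index(hue, HUE_STEPS)]
    v = float(VALUE_STEPS[nearest_index(value, VALUE_STEPS)])
    c = float(CHROMA_STEPS[nearest_index(chroma, CHROMA_STEPS)])
    return f"{name} {v:g}/{c:g}"


def positive_gain_rate(de_continuous, de_discrete):
    """Fraction of samples whose error decreased after discretization."""
    before = np.asarray(de_continuous, dtype=float)
    after = np.asarray(de_discrete, dtype=float)
    return float(np.mean(after < before))


print(discretize(16.1, 4.4, 3.3))
print(positive_gain_rate([3.0, 4.0, 5.0], [2.0, 5.0, 4.0]))
```

A nearest-neighbor search over the actual list of chips in a perceptual space would be a more faithful alternative to the per-attribute snap used here.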

3.4 Color Tolerance

The issue related to the amount of acceptable error on a Munsell estimation is known as color tolerance. The tolerance ranges change with respect to the application for which the estimation is made. The standard definition is that colors considered the same should have \(\varDelta E_{00}=1\) [20]. In the industrial field two colors can be considered the same (imperceptible differences) only if the \(\varDelta E_{00}\) is less than 2. However, these strict criteria are usually relaxed by introducing “tolerable” ranges: colors up to 3–4 CIELAB units apart can be considered the same, colors up to 5–6 CIELAB units apart are hardly distinguishable, and above 6 CIELAB units classification performance starts to decrease [16]. Moreover, the colors printed in the Munsell reference Soil Charts are usually affected by an intrinsic error from \({\sim }\)1 to \({\sim }\)4 CIELAB units, where higher error is found in older charts [22]. Related works employing smartphones during the acquisition phase in a controlled environment have reported an estimation error of \(3.75\,\pm \,1.8\) CIELAB units [16, 17]. Table 1 reports the mean and standard deviation values of \(\varDelta E_{00}\) computed during the validation phase. As anticipated by Fig. 4(a), the best configuration is [Reflex, Auto White Balancing, Solo Chart], which has \(4.95\,\pm \,2.89\) CIELAB units of error. Performance drops drastically with other configurations. The best configuration for the smartphone, that is [Smartphone, Sunny White Balancing, With Macbeth], has \(8.20\,\pm \,2.71\) CIELAB units of error. To summarize, taking into account all the previous considerations about the intrinsic error of Munsell Charts and the unconstrained experimental setting, the error obtained with our best configuration (employing the reflex) seems reasonable.
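
The tolerance ranges quoted above can be turned into a simple lookup, shown below as a hedged sketch. The text gives the band edges only approximately ("3–4", "5–6"), so the cut-offs at 2, 4 and 6 CIELAB units and the band labels are assumptions chosen for illustration.

```python
def tolerance_band(delta_e):
    """Map a color difference (CIELAB units) to a qualitative band.

    Cut-offs of 2, 4 and 6 units are illustrative upper bounds of the
    ranges quoted in the text, not normative thresholds.
    """
    if delta_e < 2.0:
        return "imperceptible"
    if delta_e <= 4.0:
        return "tolerable"
    if delta_e <= 6.0:
        return "hardly distinguishable"
    return "distinguishable"


# Mean errors reported in the text.
reported = [
    ("Reflex, Auto White Balancing, Solo Chart", 4.95),
    ("Smartphone, Sunny White Balancing, With Macbeth", 8.20),
    ("Controlled-environment smartphone methods [16, 17]", 3.75),
]
for label, mean_error in reported:
    print(f"{label}: {mean_error} -> {tolerance_band(mean_error)}")
```

Such a lookup classifies mean errors only; the reported standard deviations mean that individual chips can fall in a different band than the mean suggests.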

Table 1. Mean and standard deviation of Munsell estimation for each one of the 12 defined configurations.

4 Conclusions

In this work, ARCA: Automatic Recognition of Color for Archaeology, a desktop application for Munsell estimation, has been presented. We focused on the need of archaeologists to have a practical and tested application that might help them in the color specification task during an excavation. The following pipeline for Munsell notation estimation, aimed at archaeologists, has been proposed: image acquisition of specimens, manual sampling of the image in the ARCA desktop application, automatic Munsell estimation of the sampled points and creation of a sampling report. Differently from our previous works [12, 14, 15], we performed all the experiments in an uncontrolled environment. A dataset of 22,848 samples has been gathered under the uncontrolled environment assumption and evaluated with respect to the Munsell reference Soil Charts. This dataset has been called ARCA108 and it represents a new valuable asset for color specification research purposes. We defined 8 possible order statistics to characterize the samples and 12 possible configurations during the acquisition phase. Experimental results showed that the defined order statistics reach very similar results, and that discretization of the converted Munsell notation decreases the error by \({\sim }\)1 CIELAB unit. The best configuration among the tested ones is [Reflex, Auto White Balancing, Solo Chart], with \(4.95\,\pm \,2.89\) CIELAB units of error. Compared to other related works, and taking into account the intrinsic error of Munsell reference Soil Charts and the uncontrolled experimental setting, this result is encouraging and reasonable. These results indicate that ARCA can be a valid tool for archaeologists in color specification.
ARCA allows archaeologists to select multiple samples and estimate the corresponding Munsell notation at once, in a fast, objective and deterministic way, avoiding the error-prone and time-consuming procedure of Munsell estimation by visual means and without any expensive tool like a spectrophotometer, Munsell Soil Charts or a Gretag-Macbeth color checker. For future work, we are planning to improve the ARCA application (e.g., image processing algorithms for noise reduction, deployment of a mobile version), to expand the validation phase by acquiring other Munsell Soil Charts from the Tropical Soils edition and, most of all, to conduct a color specification test case on archaeological soils and pottery.