1 Introduction

From time immemorial, people in different communities and environments have made and remade objects or artefacts. Because these objects are unique to a people or an environment, they can usually be referred to as cultural heritage artefacts. Making these artefacts as intended requires adherence to certain methods and processes, including planning, material gathering and production, which have been streamlined from generation to generation. Through use, among other causes, these artefacts tend to degrade, wear and even break, and over time, with the loss of civilisations, some artefacts of value are lost. One such artefact, and the focus of this paper, is pottery.

These artefacts, and pottery in particular, are of significant interest to professionals such as archaeologists. Since archaeologists study historic and prehistoric human activities by documenting discovered ancient artefacts to gain a grounded understanding of prehistoric culture, the ability to carry out accurate restorations within a reasonable time is pertinent. The restored artefacts can then be used for further studies.

Of all the artefacts recovered from archaeological excavations, pottery is one of the most significant because of its resistance to decay compared with artefacts of other materials [1]. Depending on how the pottery is preserved, it may be recovered on site as a complete pot or as fragments known as potsherds. Most potsherds carry information on their inner and/or outer surfaces, such as the shape, decoration and colour of the original pottery [2]. This information helps the archaeologist to carry out studies such as dating and to understand the culture of the potter. However, for a meaningful study, the pottery needs to be restored as fully as possible so that what is being studied can be understood comprehensively. Thus arises the reconstruction problem.

Archaeologists have employed various time-consuming ways of studying and preserving historical relics, including physical examination, classification, illustration and reconstruction of pottery through drawing. These methods are carried out manually, requiring the archaeologist to spend time and effort observing and extrapolating from the observations [3]. As computing techniques and technologies become prevalent and accessible, applying them to critical issues in the archaeological field becomes desirable. Such applications are carried out in two-dimensional (2D) or three-dimensional (3D) space.

The 3D reconstruction of pottery, in the form of virtual digitisation, has been the subject of similar and related studies, with room for improvement. Successful reconstruction involves several stages: data acquisition, data pre-processing, data processing and data post-processing. The data acquisition stage involves the use of tools and instruments to acquire the data. Data pre-processing combines processes that refine and analyse the acquired data to produce clean data for the study. Data processing is the stage where the core of the study is carried out on the clean data. The data post-processing stage is not compulsory but is necessary for the archaeologist to understand what the researcher has done. How image analysis techniques can be useful in the archaeological context, and how they can be put to good use in pottery reconstruction, is the focus of this study.

This paper approaches this challenge by presenting a path towards reconstructing archaeological pottery using point cloud data. Multiple views of the pottery were acquired as point clouds and aligned into a whole through the registration process. To achieve this, feature detection, feature selection and feature extraction techniques were used. Since an optimal registration process begins with the quality of the data and how it is collected, care was taken to capture the point clouds with the utmost accuracy. Three different objects of archaeological significance were virtualised and studied; however, this paper focuses on the virtualisation of one of them, a fractured oil lamp and its sherd, as shown in Fig. 1. This pottery was excavated at the Claternae archaeological site in the metropolitan city of Bologna.

Fig. 1.

(a) Oil lamp fractured pottery (b) Oil lamp sherd

The remainder of the paper is structured as follows: Sect. 2 discusses previous works related to pottery reconstruction while Sect. 3 describes the data acquisition procedure. Section 4 presents the reconstruction process and the method applied. Section 5 discusses highlights of this study while Sect. 6 concludes the discussion.

2 Related Work

Many studies have attempted to solve the reconstruction problem. An accurately reconstructed or digitised pottery paves the way for further successful studies. For example, Kampel and Sablatnig [4] developed a system that can process both complete and broken vessels, using two reconstruction strategies: a "shape from silhouette based method for complete vessels and a profile based method for fragments" [4]. While these strategies improved performance with acceptable accuracy, they depend on certain conditions being met, thus requiring further investigation to improve accuracy.

Likewise, Smith et al. [5] investigated methods for classifying and reconstructing images of excavated archaeological ceramic fragments based on their colour and texture information, using the Scale Invariant Feature Transform (SIFT) and a feature descriptor based on Total Variation Geometry (TVG) [5]. However, the performance of these descriptors was not satisfactory in terms of effectiveness and accuracy. Puglisi et al. [6] also proposed a system that uses SIFT. Their system applies image processing techniques to automatically identify and analyse images of the structural components of pottery obtained through "optical microscopy", so that suitable features can be computed and analysed for classification purposes. The system segments the acquired images and extracts their features for pottery classification; for better accuracy, the authors chose the SIFT feature-point method, in which feature points are extracted and matched in related pairs. While this system improves on the ones before it, it falls short of providing an optimal solution with high accuracy, hence opening the path to further studies [6].

Furthermore, Makridis and Daras [7] focused on a technique for accurate and automatic classification of pottery, implemented in four steps: feature extraction, feature fusion, feature selection and classification. The approach attempted to reduce computational complexity with a "bag of words (BoW)" method that uses the features extracted from the images to create a "global descriptor" vector for each image [7]. However, while the technique proved quite efficient, how and whether it reduces computational complexity remains an open issue.

Finally, Roman-Rangel et al. [8] proposed the Histogram of Spherical Orientations (HoSO), a method for automatic analysis of pottery using computer vision techniques. The method analyses only the external frontal view of the pottery, processes 3D data using only the point coordinates, without colour, texture or faces, and efficiently encodes the information from those coordinates. The authors posit that its advantages include a substantial reduction in the time needed to organise pottery and a simple image acquisition method [8]. This shows that the application of computer vision techniques carries inherent advantages.

In this paper, computer vision techniques were applied with a focus on working with 3D point clouds. The point clouds use the point coordinates as well as their colour and normal information. The Principal Component Analysis (PCA) technique was applied to analyse and process the point clouds.

3 Data Acquisition

The data acquisition stage is critical to all subsequent stages of a study. Virtualising objects so that they reflect the majority of their original information is important for carrying out studies; as noted by [9], when objects are virtualised without key information, data may be lost.

For the virtualisation in this study, a low-cost off-the-shelf acquisition setup was used, consisting of a line laser, an RGB camera, an Arduino UNO and other embedded devices and components. The line laser was set up perpendicular to the sliding table, while the camera was set up at an angle, facing the object mount or sliding table. A representation of the acquisition setup is shown in Fig. 2. A custom setup was used because it allows easy tuning and adjustment where necessary to obtain the desired result, helping to ensure that the pottery virtualisation is ideal. But the system had to be calibrated first.

Fig. 2.

Data acquisition setup

3.1 Calibration

To calibrate the system, a 1920 × 1200 pixel Daheng Mercury U3V camera, a circular chessboard pattern and a sawtooth image were used to estimate the camera intrinsic, extrinsic and lens distortion parameters with Brown's distortion model [10]. Twenty chessboard PNG images and four sawtooth PNG images were captured and saved, ensuring that they were within the working distance of the camera and not blurry. The images were captured at different positions and orientations within the view of the camera, which was oriented at an angle of 40.5°. The baseline is about 221 mm, the height of the camera above the sliding table is about 189 mm, and the working distance, i.e. the distance between the lens and the object mount, is about 291 mm.

A simple custom calibrator adapted from the approach of [11,12,13,14] was used for calibration. To compute the intrinsic parameters, the chessboard images were loaded into the calibrator, and the calibration parameters (width and height of the pattern and image) and distortion coefficient types (two radial and two tangential coefficients) were selected. The calibrator detects the centre of each circular pattern in all the images (see Fig. 3) and computes the reprojection errors. Three images were discarded for having high reprojection errors, while the remaining 17 were used to compute the intrinsic parameters.

Fig. 3.

Projected points on the chessboard and sawtooth images

With the intrinsic parameters computed, the sawtooth images were added to the calibrator to compute the extrinsic parameters, after inserting the pose estimation parameters (width and height of the sawtooth, Gaussian values, threshold values). The calibrator detects the tangents of the sawtooth images (see Fig. 3) and computes the extrinsic parameters. The radial and tangential distortions can be computed using Eqs. (1) and (2) respectively [11, 12]:

$$ \begin{bmatrix} \delta u_{i}^{(r)} \\ \delta v_{i}^{(r)} \end{bmatrix} = \begin{bmatrix} \tilde{u}_{i} r_{i}^{2} & \tilde{u}_{i} r_{i}^{4} \\ \tilde{v}_{i} r_{i}^{2} & \tilde{v}_{i} r_{i}^{4} \end{bmatrix}\begin{bmatrix} k_{1} \\ k_{2} \end{bmatrix} $$
(1)
$$ \begin{bmatrix} \delta u_{i}^{(t)} \\ \delta v_{i}^{(t)} \end{bmatrix} = \begin{bmatrix} 2\tilde{u}_{i} \tilde{v}_{i} & r_{i}^{2} + 2\tilde{u}_{i}^{2} \\ r_{i}^{2} + 2\tilde{v}_{i}^{2} & 2\tilde{u}_{i} \tilde{v}_{i} \end{bmatrix}\begin{bmatrix} p_{1} \\ p_{2} \end{bmatrix} $$
(2)

where k1, k2 are the coefficients of radial distortion; p1, p2 the coefficients of tangential distortion; u, v the image coordinates; ũ = u − u0 and ṽ = v − v0 the image coordinates relative to the principal point (u0, v0); δu, δv the distortion corrections; \( r = \sqrt {\tilde{u}^{2} + \tilde{v}^{2} } \); and i the index of the image (i.e. 1, 2, 3, …).
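To make the distortion model concrete, the corrections of Eqs. (1) and (2) can be sketched in NumPy as follows. This is an illustrative implementation, not the calibrator used in this study; the function name and the assumption that the input points are expressed about the principal point are our own.

```python
import numpy as np

def distort(points, k1, k2, p1, p2, centre=(0.0, 0.0)):
    """Apply Brown's radial (k1, k2) and tangential (p1, p2) distortion,
    following Eqs. (1) and (2), to an (N, 2) array of image points."""
    u0, v0 = centre
    u = points[:, 0] - u0            # u-tilde: coordinate relative to the principal point
    v = points[:, 1] - v0            # v-tilde
    r2 = u**2 + v**2                 # r^2 = u~^2 + v~^2
    # Radial correction, Eq. (1)
    du_r = u * (k1 * r2 + k2 * r2**2)
    dv_r = v * (k1 * r2 + k2 * r2**2)
    # Tangential correction, Eq. (2)
    du_t = 2 * p1 * u * v + p2 * (r2 + 2 * u**2)
    dv_t = p1 * (r2 + 2 * v**2) + 2 * p2 * u * v
    return points + np.stack([du_r + du_t, dv_r + dv_t], axis=1)
```

With all four coefficients set to zero the points are returned unchanged, which is a quick sanity check when fitting the coefficients.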

As a rule, a mean reprojection error of less than 0.2 mm is considered acceptable. The Root Mean Squared Errors (RMSE), the metric used for the reprojection error, along the X and Y coordinates of the image are 0.0015 mm and 0.0009 mm respectively, and the mean RMSE of reprojection is 0.0018 mm. The calibration report was exported as an XML file and used to compensate for the effects of lens distortion during the pottery scanning process.

3.2 Scanning the Pottery

With the calibration completed and the line laser and camera synchronised, the system was ready to acquire the data. The laser line was kept parallel to the edge of the sliding table, as this reduces distortion and aids the alignment process. The goal was to acquire the pottery's point cloud as the sliding table moved from one end to the other, capturing as many viewing angles as possible, as shown in Fig. 4.

Fig. 4.

Samples of the scanned pottery

With the pottery placed on the object mount, multiple exposures of its complete surface were acquired using the acquisition setup and a 16 GB RAM Core i7-6500U laptop @ 2.50 GHz. The scanning and the conversion of the data to point clouds were carried out by a script written for the Daheng camera. Because of occlusion, many exposures from different viewpoints were acquired so that the occluded parts could be compensated for after registering the point clouds. Of the many exposures acquired, about 24 were used for the reconstruction process. A light source was directed at the pottery from a distance of about 400 mm to ensure that colour and shape information were well captured.

4 The Registration Process

The goal of the registration process is to find the correspondence, rigid transformation and best alignment between point clouds. Consider two point clouds in 3D space, a model P = {p1, p2, …, pm} and a target Q = {q1, q2, …, qn}, containing m and n points respectively. If p ⊂ P and q ⊂ Q are the overlapping points between the two point clouds, then a rigid transformation T applied to P such that the distance between P and Q is minimised results in the best alignment between the point clouds and is expressed as:

$$ {\mathbf{y}} = {\mathbf{Rx}} + {\mathbf{t}} $$
(3)

where x ∈ p, y ∈ q, R is the 3 × 3 rotation matrix and t is the 3 × 1 translation vector.

A rigid registration can thus be found by minimising the following equation:

$$ \sum_{i = 1}^{M} \left\| T p_{i} - q_{i} \right\|^{2}, \qquad q_{i} = \mathop{\arg \min }\limits_{q \in Q} \left\| T p_{i} - q \right\| $$
(4)

where p is the model point set and q is the target point set.

The registration process can be achieved with different techniques and methods, as discussed by [15, 16], but the most popular method, which has formed the foundation for many improved variants, is the Iterative Closest Point (ICP) algorithm [17]. However, ICP tends to be susceptible to local minima. Rusinkiewicz and Levoy [18] categorise the ICP process into selection of points, matching of points, weighting of pairs, rejection of pairs, choice of error metric and minimisation. Because objects to be registered differ, for example in geometry, pose unique challenges and may involve huge point sets, it is usually difficult to develop a "one-size-fits-all" solution for point cloud registration in general. Hence, existing works are refined or improved to suit the data involved [18, 19]. This forms the basis for this study.
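As a reference for the discussion that follows, a minimal point-to-point ICP in the spirit of [17] can be sketched with NumPy and SciPy. This is an illustrative version, not the implementation used in this study; the function names and the fixed iteration count are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(P, Q):
    """Least-squares R, t aligning P onto Q for known correspondences
    (the Kabsch/SVD closed form that minimises Eq. (4))."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                       # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # guard against reflections
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp

def icp(P, Q, iters=50):
    """Point-to-point ICP: alternate nearest-neighbour matching
    (q_i = argmin ||T p_i - q||) and closed-form rigid alignment."""
    tree = cKDTree(Q)
    P_cur = P.copy()
    for _ in range(iters):
        _, idx = tree.query(P_cur)                  # closest target point per model point
        R, t = best_rigid_transform(P_cur, Q[idx])
        P_cur = P_cur @ R.T + t
    return P_cur
```

Because the matching step is a plain nearest-neighbour search, a poor initial pose can trap this loop in a local minimum, which is why a coarse alignment before ICP matters in practice.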

4.1 Data Pre-processing

The point clouds scanned in Sect. 3.2 were processed with the CloudCompare software to obtain clean point clouds. The point clouds were segmented to separate the unusable parts of each cloud from the useful part, and the number of points was reduced by down-sampling. The down-sampling was based on space sampling, ensuring that the points are uniformly distributed so as to obtain a good estimation that contains only inliers. Points without a finite normal or without enough neighbours within a certain radius (outliers in the scan) were removed; specifically, outliers were removed based on the number of neighbours around a point within a radius of 0.2 mm. The sampling or filtering approach usually affects the convergence rate of the point cloud during registration.
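For illustration, the radius-based outlier removal and space-based down-sampling described above can be sketched as follows. This is a simplified stand-in for the CloudCompare operations used in the study; the `min_neighbours` value and the voxel size are illustrative assumptions, while the 0.2 mm radius follows the text.

```python
import numpy as np
from scipy.spatial import cKDTree

def remove_radius_outliers(points, radius=0.2, min_neighbours=5):
    """Drop points that have too few neighbours within `radius`
    (0.2 mm in this study; min_neighbours is an illustrative choice)."""
    tree = cKDTree(points)
    counts = np.array([len(tree.query_ball_point(p, radius)) - 1  # exclude the point itself
                       for p in points])
    return points[counts >= min_neighbours]

def voxel_downsample(points, voxel=0.5):
    """Space-based down-sampling: keep one representative (the centroid)
    per voxel, so the surviving points are roughly uniformly distributed."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    out = np.zeros((inv.max() + 1, 3))
    np.add.at(out, inv, points)                     # sum points per voxel
    counts = np.bincount(inv).reshape(-1, 1)
    return out / counts                             # centroid per voxel
```

The voxel size directly trades point density against detail, which is one way the filtering choice ends up influencing registration convergence.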

Also, the normals of the point clouds were computed for surface correspondence, ensuring a better alignment process. Principal Component Analysis (PCA) [20] was used to determine the normal vectors because PCA can analyse the variation of the points in the three directions: the normal vector corresponds to the direction of minimum variation. From the eigendecomposition of the covariance matrix of the nearest neighbours, the eigenvector corresponding to the minimum eigenvalue represents the normal vector. The covariance matrix can be calculated from the following equation:

$$ C = \frac{1}{k}\sum_{i = 1}^{k} \left( p_{i} - \tilde{p} \right)\left( p_{i} - \tilde{p} \right)^{T} $$
(5)

where k is the number of nearest neighbours in the vicinity of pi and \( \tilde{p} \) is the mean or centroid of all k neighbours.

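A minimal NumPy/SciPy sketch of this PCA normal estimation, based on Eq. (5), is shown below. This is not the authors' exact pseudocode; the choice of k nearest neighbours is an assumption.

```python
import numpy as np
from scipy.spatial import cKDTree

def pca_normals(points, k=10):
    """Estimate a normal per point: eigendecompose the covariance of the
    k nearest neighbours (Eq. (5)); the eigenvector of the smallest
    eigenvalue is the direction of minimum variation, i.e. the normal."""
    tree = cKDTree(points)
    normals = np.zeros_like(points)
    for i, p in enumerate(points):
        _, idx = tree.query(p, k=k)          # k nearest neighbours (includes p itself)
        nbrs = points[idx]
        C = np.cov(nbrs.T, bias=True)        # 3x3 neighbourhood covariance, Eq. (5)
        w, v = np.linalg.eigh(C)             # eigenvalues in ascending order
        normals[i] = v[:, 0]                 # eigenvector of the minimum eigenvalue
    return normals
```

Note that the sign of each normal is ambiguous; orienting the normals consistently, for example towards the sensor, would be an additional step.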

4.2 Features Extraction

Feature extraction entails key-point extraction and feature description. Key-points were extracted based on distinctive properties inherent in the point cloud surface, such as the colour and the principal curvatures of the surface. The largest value of the RGB components was computed, and the product of the two principal curvatures was used to obtain the local maxima; the key-points were determined by computing the local maxima of the curvature. In essence, by analysing the nearest neighbours around the point of interest within a certain radius and curvature threshold, the principal curvatures of all points were computed and the covariance matrix established as stated in Eq. (5) in Sect. 4.1. For the oil lamp, for example, the curvature was computed using points within a radius of 0.8 mm, and local maxima within a radius of 1 mm, resulting in about 400–500 key-points per view.
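The key-point detection described above can be sketched roughly as follows, using the surface-variation measure derived from the covariance eigenvalues as a stand-in for the exact curvature used in the study. The measure and the thresholds are assumptions; the 0.8 mm curvature radius and 1 mm local-maximum radius follow the text.

```python
import numpy as np
from scipy.spatial import cKDTree

def curvature_keypoints(points, curv_radius=0.8, nms_radius=1.0):
    """Per-point surface variation from the eigenvalues of the local
    covariance, then non-maximum suppression so that only local curvature
    maxima within nms_radius survive as key-points."""
    tree = cKDTree(points)
    curv = np.zeros(len(points))
    for i, p in enumerate(points):
        idx = tree.query_ball_point(p, curv_radius)
        if len(idx) < 3:
            continue                                      # too few neighbours
        w = np.linalg.eigvalsh(np.cov(points[idx].T, bias=True))
        curv[i] = w[0] / max(w.sum(), 1e-12)              # surface variation
    keep = []
    for i, p in enumerate(points):
        nbrs = tree.query_ball_point(p, nms_radius)
        if curv[i] > 1e-9 and curv[i] >= curv[nbrs].max():
            keep.append(i)                                # local curvature maximum
    return np.array(keep, dtype=int)
```

On a perfectly flat patch the surface variation is zero everywhere, so no key-points are produced; detections concentrate where the surface actually bends.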

Also, a Local Reference Frame (LRF) was computed for each key-point. The normal at the key-point was used as the Z axis of the LRF, the X axis is the principal curvature direction, and the Y axis is the cross product of the Z and X axes. This leaves the problem of "inverted" reference frames, since curvature directions do not truly have a sign. The sign of the X axis was therefore chosen as the side on which points lie further away from the tangent plane defined by the normal at the key-point. This approach results in about 20% of the key-points having inverted reference frames and, on average, 1.5° of error on the Z axis and 7–8° on the X and Y axes; these errors increase for views that have a very small overlap area with large occlusion. For the oil lamp, the local reference frame was computed using points around the key-point up to a maximum distance of 1.2 mm. For the sherd, the number of key-points to be extracted was increased.
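A hedged sketch of such an LRF construction, including the sign disambiguation, might look as follows. The exact disambiguation rule used in the study is not spelled out, so the weighting below (distance from the tangent plane, signed by the side of the candidate X axis) is an assumption consistent with the description.

```python
import numpy as np

def local_reference_frame(keypoint, normal, neighbours):
    """Build a right-handed LRF: Z is the key-point normal, X the dominant
    neighbourhood direction projected into the tangent plane, Y = Z x X."""
    z = normal / np.linalg.norm(normal)
    d = neighbours - keypoint                     # offsets from the key-point
    w, v = np.linalg.eigh(d.T @ d)                # scatter matrix of the offsets
    x = v[:, -1] - np.dot(v[:, -1], z) * z        # dominant direction, tangent part
    x /= np.linalg.norm(x)
    # Choose the X sign towards the side whose points are further from the
    # tangent plane (an assumed concrete form of the rule in the text).
    if np.sum(np.sign(d @ x) * np.abs(d @ z)) < 0:
        x = -x
    y = np.cross(z, x)                            # completes the right-handed frame
    return np.stack([x, y, z])
```

The returned 3 × 3 matrix has the frame axes as rows, is orthonormal and right-handed by construction, and degrades exactly as the paper describes when the neighbourhood is nearly symmetric about the tangent plane.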

Furthermore, decomposing the covariance matrix with Singular Value Decomposition (SVD) yields three eigenvalues, λ1, λ2, λ3, and their corresponding eigenvectors. These eigenvalues and eigenvectors describe the neighbourhood features of each key-point, which were computed using the ColorSHOT descriptor [21] and the LRF at a radius of 4 μm for every key-point; ColorSHOT was chosen for its robustness and accuracy. Key-points were matched across the two views when their descriptor (Euclidean) distances were smaller than a threshold. Key-points whose descriptor distances were small with respect to most other key-points were not considered: for example, two key-points with few points around them will both have descriptors full of zeros but are not necessarily the same key-point. Also, empty descriptor spatial bins were not considered when computing descriptor distances, mostly because this makes the step a little more robust to occlusion.

4.3 Correspondence Matching

Having computed the descriptors from the key-points of the point clouds, point correspondences were established using a nearest neighbour search in the feature space of the key-points. Adapting the method of Lei et al. [22], angular vectors between the normals of the descriptors were defined and the correspondences were clustered into groups. About 2,000 correspondences were established between views with large overlaps, of which about 150 were correct; likewise, about 400 correspondences were established between views with small overlaps, of which about 6–7 were correct. To improve this, a basic geometric consistency criterion was imposed by requiring the same distances between correspondences in order to group them: for two point pairs (pi, qi) and (pj, qj), the distance d = ||pi − pj|| = ||qi − qj|| must hold for the pairs to be valid. Creating all possible groups with more than 1,000 correspondences makes this step very slow; hence, the best correspondences were used to create a completely new group, and compatible correspondences were then sought among the others. The idea is that if there are any correct correspondences, at least a couple of them should be among the best ones. ICP was then used to perform a pairwise registration of the point clouds, as shown in Fig. 5.
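The distance-consistency grouping described above can be sketched as a greedy filter. This is an illustrative version; the tolerance value and the best-first greedy strategy are assumptions consistent with the text.

```python
import numpy as np

def consistent_group(P_kp, Q_kp, matches, tol=0.5):
    """Greedily collect matches (i, j) whose pairwise distances agree on
    both clouds, i.e. ||p_i - p_j|| is approximately ||q_i - q_j|| for every
    pair already in the group. `matches` is assumed sorted best-first."""
    group = [matches[0]]                          # seed with the best correspondence
    for (pi, qi) in matches[1:]:
        consistent = all(
            abs(np.linalg.norm(P_kp[pi] - P_kp[pj]) -
                np.linalg.norm(Q_kp[qi] - Q_kp[qj])) < tol
            for (pj, qj) in group
        )
        if consistent:
            group.append((pi, qi))
    return group
```

Seeding from the best correspondence and only checking candidates against the current group avoids enumerating all possible groups, which is exactly what made the exhaustive variant slow with more than 1,000 correspondences.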

Fig. 5.

Pairwise registration of two point clouds

4.4 Global Registration

With the pairwise matching of the point clouds completed, a coarse alignment was carried out for all pairs of point clouds, resulting in a transformation that brings all the clouds into the reference frame of the "reference" view. Following the method of Pulli [23], the global registration was improved, and the registration error that had been propagated and accumulated across all the registration pairs was redistributed. With this method, the pairwise registrations converge much faster than the global registration, resulting in "pairwise constraints" between the point clouds, which were later applied to register all the point clouds. The output is shown in Fig. 6.

Fig. 6.

Global registration of oil lamp pottery and potsherd

5 Discussion

While one solution may not solve all problems, as stated earlier, this work intended to bridge a gap that exists specifically for archaeological pottery. As seen in the extraction of key-points from the lamp and its sherd, flexibility played a role: when the preset parameters did not satisfy the same goal for both the lamp and the sherd, the parameters were adjusted so that as many key-points as possible could be extracted from the sherd. Furthermore, applying the method of [24] (see Fig. 7) shows that, although their method worked for their study, it did not attain accurate results with the data used in this study.

Fig. 7.

Output of the method of [24]

Also, the local minimum problem was managed to a reasonable extent by performing a coarse registration first, which improves efficiency and robustness.

However, the time complexity of this approach needs improvement and will be a focus of subsequent studies.

6 Conclusion

While physical documentation is one way archaeologists preserve artefacts, digital documentation would help preserve them much longer, and the virtualisation of artefacts will play a key role in this preservation and documentation. This paper presented a process for virtualising artefacts using point cloud data. The artefact was acquired as point clouds and pre-processed to obtain clean clouds; the pre-processing step included segmentation, normal computation, down-sampling and boundary point computation. Thereafter, key-points were detected and extracted from the point clouds, and descriptors were computed using point and colour information. The approach presented in this paper focuses on improving accuracy and optimising the cost function to obtain an optimal result for the pottery's profile. However, while the final registration for the pottery and its sherd was successful, it was observed that the robustness of the descriptor dropped due to occlusion.

In further studies, the point clouds will be acquired in ways that reduce noise and occlusion and increase robustness. More data of different shapes will be used to test this approach, along with a more rigorous evaluation in general. An attempt to reassemble the pottery and its sherd will also be investigated.