Coarse-to-fine dot array marker detection with accurate edge localization for stereo visual tracking
Introduction
Stereo visual tracking employs a calibrated stereo camera to localize a well-defined visual pattern (i.e., a visual marker) in real time with six degrees of freedom (DOF). The tracking algorithms follow a general framework: (1) detecting features of the visual marker in the stereo images and matching them over the image pair, (2) retrieving the features' 3-D information by triangulation, and (3) fitting the marker's geometry to obtain the rotation and translation. The salient features encoded in a visual marker are usually corners and edges; therefore, a visual marker commonly consists of square and/or dot patterns. Compared with a square, which provides four corners, a dot carries less information (only its center). However, the dot center is retrieved by fitting (many) edge points on the dot's contour (an ellipse), which reduces the localization uncertainty incurred by image sensor noise. Among visual markers, planar markers are most frequently used due to the ease of manufacture (printout), wide viewing angle, and minimal space occupation. By virtue of current printing technology, a printed marker has sufficient geometric accuracy for most applications.
Other than stereo visual tracking, a visual marker can also be used for camera calibration [1], [2], (single-camera) pose estimation [3], and high-level applications such as augmented reality (AR) [4] and visual servoing [5]. Stereo visual tracking can be considered a special case of pose estimation in which the second camera introduces additional constraints, so it should be more accurate than single-camera pose estimation under the same configuration. The essence of a visual marker in these computer vision tasks is to provide 3-D/2-D correspondences for solving a linear or nonlinear minimization problem. The first step relating to the visual marker is marker detection, i.e., extracting the encoded features and making them understandable. The accuracy and uncertainty of feature extraction largely influence the final output according to error propagation. Take stereo visual tracking as an example: the 3-D triangulation error of a feature in the depth direction is sensitive to the feature disparity, especially when the baseline is small or the marker is far from the camera. The uncertainty in localizing 2-D features is amplified through the triangulation procedure, resulting in large uncertainty in determining the depth. Suppose the stereo tracking results are used to visualize a tracked object in a virtual scene (e.g., in navigation systems): large tracking uncertainty leads to visible random fluctuation of the displayed object even when it actually remains static.
In the context of marker-based surgical instrument tracking under endoscopic view in minimally invasive surgery (MIS), a visual marker is attached to the surgical instrument (e.g., a forceps manipulator [6] or an endoscopic ultrasound probe [7], [8]) whose pose with respect to the endoscope inside the body is estimated by either PnP techniques (for a monocular endoscope) or stereo tracking (for a stereo endoscope). Limited by the small operative field inside the body, the visual marker is restricted to a small size, which deteriorates the accuracy and increases the uncertainty at a target point (e.g., the instrument tip) far from the marker features [9], [10]. Additionally, the baseline of a stereo endoscope is typically small (several millimeters), which makes the triangulation procedure sensitive to image noise. Eventually, a small uncertainty in localizing feature points may cause relatively large uncertainty in depth localization, which further results in large angular uncertainties around the marker plane. One way to improve visual marker detection is to increase the accuracy and reduce the uncertainty of feature localization in the presence of noise. For the tracking task, the detection time is also an important consideration.
There are several open-source visual marker systems available. ARToolKit is an open-source C/C++ library developed many years ago for building augmented reality (AR) applications by tracking a planar AR marker using pose estimation techniques [11]. Its AR marker is a planar pattern enclosed by a black rectangular frame on a white background. The corners of the rectangle are extracted by edge line fitting and used for pose estimation. The estimated pose is then used to overlay a virtual object on the video stream to create an AR scene. Inspired by ARToolKit, the ARTag marker system was proposed in 2005 for AR applications, with an improved marker pattern that reduces the false detection rate and inter-marker confusion rate [12]. ARToolKit and ARTag perform pose estimation in the same way, using the four extracted corners, but differ in how they carry information for marker recognition. Zhang evaluated the performance of several AR marker systems and reported that the standard deviation of localizing a static feature point varied from 0.26 to 0.57 pixel [13]. Assuming the error follows a normal distribution, the spread in locating the same feature point in a static state exceeds 1 pixel due to image sensor noise. An uncertainty of 1 pixel is quite large for both pose estimation and stereo tracking, especially when the marker is small and/or far from the camera.
The open-source computer vision library OpenCV [14] utilizes a black/white chessboard pattern for camera calibration. The corners of the chessboard pattern (X corners) are detected at the pixel level using quadrilateral detection and then iteratively optimized toward the saddle point at the sub-pixel level [15]. Other chessboard corner detection methods can be found elsewhere [16], [17], [18], [19]. We note that more attention has been paid to chessboard pattern detection, probably because of the ease of detection and sub-pixel localization of an X corner. However, the uncertainty of the above sub-pixel corner localization is larger than that of dot center extraction using ellipse fitting, because we can first localize the edge points on the dot's contour at the sub-pixel level, fit an ellipse to these edge points, and take the ellipse center as the dot center; the ellipse fitting further reduces the uncertainty of localizing the dot center. OpenCV also supports dot grid feature detection; however, it detects the dot contour at the pixel level, which results in a relatively large uncertainty in localizing the dot center. Therefore, a compact marker and a corresponding detection algorithm with high accuracy and low uncertainty of feature localization are needed.
This paper presents a coarse-to-fine dot array marker detection algorithm with accurate edge point localization. The marker is detected in a cascaded manner for efficiency. The dot contours are first detected quickly at the pixel level, and then sub-pixel edge point localization is performed by searching for the zero-crossing in the convolution of the image with a Laplacian-of-Gaussian (LoG) kernel [20]. The dot centers are finally extracted by ellipse fitting and re-ordered according to an orientation indicator. The contribution of this paper is twofold: a compact, configurable dot array marker detection framework that enables multi-marker tracking (coarse detection); and a closed-form sub-pixel edge localization method including both the formulation and the implementation (fine localization). The sub-pixel edge localization method could also be used in related vision tasks [21], [22], [23].
Section snippets
Dot array marker
Our dot array marker is inspired by the camera calibration pattern used by the commercial machine vision software MVTec Halcon [24]. The dot array marker is an m × n dot matrix enclosed by a rectangular frame, as shown in Fig. 1. The origin of the marker is located at the center of the rectangular frame. A solid triangle at one corner serves as an orientation indicator distinguishing the x and y directions. A dot array marker is hence characterized by (m, n, d, c), where d is the spacing; c
Edge point localization
In the proposed marker detection algorithm, sub-pixel edge point localization is performed once the marker has been recognized. Because marker recognition is a coarse procedure designed to save time, we need to finely localize every edge point on the contour Ei for all i. In addition, disturbed by image sensor noise and motion artifacts, the contours extracted from a thresholded image are not the true edges; however, they are expected to lie close to them. Therefore, it is
Marker pose tracking
If the marker is successfully detected in a stereo image pair (left and right images), 3-D triangulation is performed to reconstruct the 3-D coordinates of each dot center. Assume the stereo camera system has been calibrated and stereo-rectified so that it has a configuration of two parallel-looking cameras with identical intrinsic parameter matrices:

K = [ f 0 cx; 0 f cy; 0 0 1 ]

where f is the focal length and (cx, cy) is the principal point of the camera. The 3-D coordinates (x, y, z) of a 2-D correspondence (xl,
Experiments and results
We have implemented the dot array marker detection algorithm and the stereo visual tracker in C++ with the help of OpenCV. Except for the sub-pixel localization, all the operations in Algorithm 1 can be implemented with OpenCV functions. In this section, we evaluate the proposed algorithm on synthetic and real data, respectively, and then show real applications based on the proposed method.
Discussion and conclusion
In this paper, a coarse-to-fine marker detection algorithm with sub-pixel edge localization is presented. The dot array pattern is detected and matched to a predefined descriptor in a fast way using a simple threshold and hierarchical contour analysis. The resulting dot contours are coarse edge points which are expected to lie close to the "true" edges. If the marker has been successfully matched with one of the predefined descriptors, the marker region (a bounding box containing only the marker
Conflict of interest
None.
References (37)
- et al., Chess – quick and robust detection of chess-board features, Comput. Vis. Image Understand. (2014)
- et al., Efficient tracking and ego-motion recovery using gait analysis, Signal Process. (2009)
- et al., A sparse representation based fast detection method for surface defect detection of bottle caps, Neurocomputing (2014)
- et al., Topological structural analysis of digitized binary images by border following, Comput. Vis. Graph. Image Process. (1985)
- et al., Sub-pixel edge detection based on an improved moment, Image Vis. Comput. (2010)
- Accuracy of Laplacian edge detectors, Comput. Vis. Graph. Image Process. (1984)
- et al., Performance of three recursive algorithms for fast space-variant Gaussian filtering, Real-Time Imaging (2003)
- A flexible new technique for camera calibration, IEEE Trans. Pattern Anal. Mach. Intell. (2000)
- et al., Videoendoscopic distortion correction and its application to virtual guidance of endoscopy, IEEE Trans. Med. Imaging (2001)
- et al., EPnP: an accurate O(n) solution to the PnP problem, Int. J. Comput. Vis. (2009)
- Trends in augmented reality tracking, interaction and display: a review of ten years of ISMAR
- A new approach to visual servoing in robotics, IEEE Trans. Robot. Autom.
- Visual servoing-based endoscopic path following for robot-assisted laparoscopic surgery
- Robust intraoperative US probe tracking using a monocular endoscopic camera
- Intraoperative ultrasound guidance for transanal endoscopic microsurgery
- Designing optically tracked instruments for image-guided surgery, IEEE Trans. Med. Imaging
- Technical evaluation of a third generation optical pose tracker for motion analysis and image-guided surgery
- Virtual object manipulation on a table-top AR environment
Cited by (17)
- A closed-loop minimally invasive 3D printing strategy with robust trocar identification and adaptive alignment, Additive Manufacturing (2023)
- Robust and fast laparoscopic vision-based ultrasound probe tracking using a binary dot array marker, Computers in Biology and Medicine (2022)
  Citation excerpt: "Therefore, the orientation of the US probe (i.e. 2D US images) with respect to the laparoscopic camera can be obtained using the orientation of the fiducial marker and US probe with respect to the laparoscopic camera and marker, respectively. Several fiducial markers have been proposed for vision-based tracking, for example, chessboard markers [9,10], 3D random X-corner markers [11,12], dot array markers [13,14], and hybrid cylindrical markers [15]. The robustness of the chessboard and dot array markers is limited because they cannot be detected and identified when occlusions exist on the markers."
- Real-time robust individual X point localization for stereoscopic tracking, Pattern Recognition Letters (2018)
  Citation excerpt: "We believe an efficient and robust x point localization framework is helpful to customizing task-specific stereoscopic trackers and is important for the community. For example, with the proposed method, a stereo laparoscopic tracker [23] could be implemented to track forceps in minimally invasive surgery (MIS). In addition, by integrating a miniature projector which projects x point features into a stereo laparoscope, it is possible to reconstruct the organ surface three dimensionally intraoperatively [4]."
- Robust, fast and accurate vision-based localization of a cooperative target used for space robotic arm, Acta Astronautica (2017)
  Citation excerpt: "Ref. [23] utilized a marker that consists of concentric contrasting circles to estimate the 12 Degrees of Freedom relative state for small inspection spacecrafts. Ref. [24] presented a coarse-to-fine dot array marker tracking method and implemented it in a vivo animal experiment. Ref. [25] implemented fiducial markers around a lung tumor for dynamic tumor tracking."
- Generation of micro-scale finite element models from synchrotron X-ray CT images for multidirectional carbon fibre reinforced composites, Composites Part A: Applied Science and Manufacturing (2016)
- Efficient intraoral photogrammetry using self-identifying projective invariant marker, International Journal of Computer Assisted Radiology and Surgery (2024)