Detecting image orientation based on low-level visual content

https://doi.org/10.1016/j.cviu.2003.10.006

Abstract

Accurately and automatically detecting image orientation is of great importance in intelligent image processing. In this paper, we present automatic image orientation detection algorithms based on both luminance (structural) and chrominance (color) low-level content features. Support vector machines (SVMs), a statistical learning method, are used as the classifiers in our approach. The different sources of the extracted image features, as well as the binary classification nature of SVMs, require our system to integrate the outputs of multiple classifiers. Both a static combiner (averaging) and a trainable combiner (also based on SVMs) are proposed and evaluated in this work. Furthermore, two rejection options (regular and re-enforced ambiguity rejection) are employed to improve orientation detection accuracy by sieving out images with low confidence values during classification. Extensive experiments have been conducted on a database of more than 14,000 images to validate our approaches. Discussions and future directions for this work are also addressed at the end of the paper.

Introduction

Advances in digital imaging and storage technologies have made digital photography affordable. Processing and management of digital photos, whether captured by photo scanners or digital cameras, are becoming essential functions of personal computers and intelligent home appliances. To input a photo into a digital album, the digitized or scanned image must first be displayed in its correct orientation.

Currently, the operation of detecting image orientation is usually performed manually, which is very tedious and time consuming. This issue is becoming more problematic since people are taking more and more photos due to the virtually zero cost of photo taking when digital cameras are used. Several commercial products are now also available to deskew a scanned image based on the detected outer edges of the picture. The technique involved is to first identify the boundaries of the photograph placed on a scanner plate. The photo is determined to be tilted at some angle between zero degrees (0°) and forty-five degrees (45°) with respect to a reference coordinate, i.e., the horizontal or vertical axis. The image is then aligned according to this detected tilt angle. Although such a conventional technique can rotate a skewed image so that its outer edges are parallel or perpendicular to the reference axes, it does not take into consideration whether the content (i.e., the subject matter) of the aligned image is properly oriented. The goal of the work presented in this paper is to develop automatic image orientation detection systems based on statistical classification of the low-level image content. This goal is not as easily achievable as it may appear.

Since image orientation detection is a relatively new topic, the literature on it is quite sparse. Most of the related work has focused on document page orientation detection [3], [5], [12]. A simple and rapid algorithm for medical chest image orientation detection has been developed in [8] by comparing against a standard. Determining image orientation by matching against available models has also been used for aerial images [20]. Although the studies mentioned perform well for their specific purposes, they are not suitable for general image orientation detection, which is our aim here. The most closely related work investigating this problem was recently presented in [23], [24], where a Bayesian learning framework is employed to classify image orientations according to the color features extracted from the images. However, color alone will generally not be discriminative enough for general image orientation detection. The adoption of more visual information becomes a necessity in order to accurately detect the image orientation.

Humans tend to use high-level concepts in daily life. On the other hand, what current computer vision techniques can automatically extract from images are mostly low-level features. This is also true for the problem here. Humans identify the correct orientation of an image through contextual information or object recognition, which is very hard for a computer to achieve in the real world. A key idea in this paper is to avoid the need for semantic object recognition: we choose to rely on low-level visual features [14], [18], [19] for image orientation detection and on a learning-based classification scheme to perform the task. Effective classification of image orientation often requires a combination of different visual features. Among the most frequently cited kinds of “visual content”, the color feature represents the chrominance information, while both texture and shape capture the luminance information.

In this work, we have empirically adopted the edge-based structural features as luminance information, and the color moment features as chrominance information. The two sources of information are incorporated into our recognition system to provide complementary information for robust image orientation detection [26]. Support vector machine (SVM) based classifiers are first applied in parallel, and are then followed by a classifier combiner. Both static and trainable combiners are investigated in this paper. Moreover, two rejection options (regular and re-enforced ambiguity rejection) are introduced into our system to improve orientation detection accuracy by screening out images with low confidence values. Experiments performed on a large database further validate the effectiveness of our approaches, in terms of classification methods, selected features, classifier architectures, rejection schemes, and classification criteria. The results show that the SVM approach outperforms learning vector quantization (LVQ) with the same color moments feature (accuracy SVM 73.7% vs. LVQ 69.2%). Also, the use of complementary information sources leads to an improvement of about 5% compared to the use of either chrominance color features or luminance edge-based structural features alone. The remainder of this paper is organized as follows. Section 2 describes the problem. Section 3 provides the details of visual feature extraction. Issues related to the classifier design and the system architecture are investigated in Section 4. In Section 5, the proposed approaches are validated with extensive experiments, using a database of images from the Corel photo gallery. Directions for our future work are discussed in Section 6. Finally, Section 7 concludes.

Section snippets

Problem statement

As in [23], we define the correct orientation of an image as the orientation in which the scene captured by the image originally occurred. Because of camera rotation while taking a picture or misplacement of the photograph on a scanner, a digital image could be incorrectly oriented. However, most pictures are placed on a scanner with their boundaries aligned with those of the scanner plate. When this condition is not satisfied, that is, when a picture is placed on a scanner at a

Visual features

As mentioned in the introduction, we rely on low-level visual content, which can be automatically extracted by computer, for image orientation detection. The performance of our system is inherently constrained by the features used to represent the images in the database. Specifically, what we assume to be an effective feature, how we combine multiple features, and on which scale the features shall be computed, are among the important factors for the detection to be successful.
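As a rough illustration of a block-wise chrominance feature of the kind the introduction calls color moments, the sketch below computes the mean and variance of each channel over a grid of image blocks. The grid size, the choice of first and second moments, the pixel representation, and the function name `color_moments` are all assumptions made for this example; the snippet above does not specify the feature layout or color space actually used.

```python
from statistics import fmean, pvariance


def color_moments(image, grid=2):
    """Return per-block channel means and variances as one flat list.

    `image` is a list of rows; each row is a list of 3-tuples (one value
    per color channel). The grid size is a hypothetical parameter.
    """
    height, width = len(image), len(image[0])
    features = []
    for by in range(grid):
        for bx in range(grid):
            # Pixel bounds of the current block.
            y0, y1 = by * height // grid, (by + 1) * height // grid
            x0, x1 = bx * width // grid, (bx + 1) * width // grid
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            for channel in range(3):
                values = [pixel[channel] for pixel in block]
                features.append(fmean(values))      # first moment
                features.append(pvariance(values))  # second central moment
    return features


# Tiny made-up 2x2 image: each block holds exactly one pixel.
tiny = [[(1, 2, 3), (4, 5, 6)],
        [(7, 8, 9), (10, 11, 12)]]
feats = color_moments(tiny, grid=2)
```

Because the blocks are visited in a fixed spatial order, rotating the image reorders the feature vector, which is the property a classifier can use to tell orientations apart.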

Classification systems

The systems and architectures for our orientation detection are based on the statistical learning support vector classifiers.
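The abstract and introduction describe a static (averaging) combiner over the outputs of several classifiers, followed by rejection of low-confidence images. A minimal sketch of that idea is given below. It assumes four candidate orientations, one score per orientation from each feature source, and a rejection rule based on the gap between the two highest averaged scores; the function name, the threshold `min_margin`, and all score values are hypothetical, and the rule is only one plausible form of ambiguity rejection rather than the paper's exact criterion.

```python
def combine_and_decide(score_sets, orientations=(0, 90, 180, 270), min_margin=0.2):
    """Average per-orientation scores and reject ambiguous decisions.

    `score_sets` holds one list of scores per classifier, in the same
    order as `orientations`. Returns (orientation or None, averaged scores).
    """
    n = len(orientations)
    averaged = [sum(scores[i] for scores in score_sets) / len(score_sets)
                for i in range(n)]
    # Rank orientation indices from highest to lowest averaged score.
    ranked = sorted(range(n), key=lambda i: averaged[i], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    # Assumed rejection rule: the top two scores are too close to call.
    if averaged[best] - averaged[runner_up] < min_margin:
        return None, averaged
    return orientations[best], averaged


# Made-up scores from a color-based and an edge-based classifier.
confident, _ = combine_and_decide([[0.9, -0.5, -0.2, -0.8],
                                   [0.5, -0.1, -0.6, -0.4]])
ambiguous, _ = combine_and_decide([[0.2, 0.1, -0.5, -0.5],
                                   [0.0, 0.1, -0.5, -0.5]])
```

A trainable combiner would replace the averaging step with a second-stage classifier trained on the concatenated scores; that variant is not shown here.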

Construction of training and test images

The image data used in our experiments are from the Corel photo gallery. Since all the images in the gallery are in the correct orientation (0°), we need to rotate these images to the appropriate orientations for both training and testing purposes.
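The rotation step can be sketched as follows, assuming that the target orientations are the four multiples of 90° and that rotation is applied clockwise; neither assumption is stated in the snippet above. The image is represented as a plain list of rows, and the helper names are hypothetical.

```python
def rotate_90_clockwise(image):
    """Rotate a list-of-rows image by 90 degrees clockwise."""
    # Reversing the row order and transposing gives a clockwise turn.
    return [list(row) for row in zip(*image[::-1])]


def make_orientation_samples(image):
    """Return (rotated_image, angle) pairs for 0, 90, 180 and 270 degrees."""
    samples = []
    current = image
    for angle in (0, 90, 180, 270):
        samples.append((current, angle))
        current = rotate_90_clockwise(current)
    return samples


# Made-up 2x2 "image" with distinct pixel labels.
samples = make_orientation_samples([[1, 2],
                                    [3, 4]])
```

Each correctly oriented source image thus yields one labeled example per orientation class.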

Future research directions

General image orientation detection is an interesting, new, yet difficult problem. The experiments conducted above have shown promising results. The accuracies of the proposed approaches are around 90% at a 20% rejection rate. Clearly, further enhancements are needed to improve the accuracies so that image orientation detection can be of practical use.
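To make the phrase "accuracy at a rejection rate" concrete, the sketch below discards the lowest-confidence fraction of decisions and measures accuracy on the remainder. The function name and the sample data are made up for illustration and do not reproduce the paper's results or its exact evaluation protocol.

```python
def accuracy_at_rejection(results, rejection_rate):
    """Accuracy over the decisions kept after rejecting the least confident.

    `results` is a list of (confidence, is_correct) pairs.
    Returns None if every decision is rejected.
    """
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    rejected = int(round(rejection_rate * len(ranked)))
    accepted = ranked[:len(ranked) - rejected]
    if not accepted:
        return None
    return sum(1 for _, ok in accepted if ok) / len(accepted)


# Made-up confidences and outcomes for ten images.
results = [(0.95, True), (0.90, True), (0.85, True), (0.80, True),
           (0.75, False), (0.70, True), (0.65, True), (0.60, True),
           (0.20, False), (0.10, False)]
no_rejection = accuracy_at_rejection(results, 0.0)    # 7 of 10 correct
with_rejection = accuracy_at_rejection(results, 0.2)  # 7 of 8 correct
```

In this toy data the two rejected images are both misclassified, so accuracy on the accepted images rises; rejection helps only to the extent that low confidence coincides with errors.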

Conclusions

We have proposed automatic approaches for content-based image orientation detection with SVMs. Extensive experiments on a database of more than 14,000 images were conducted to evaluate the system. The experimental results show that the SVM approach outperforms LVQ with the same color moments feature (accuracy SVM 73.7% vs. LVQ 69.2%), which is due to the good generalization and learning abilities of SVMs. Also, the combination of complementary information sources, chrominance color

Acknowledgements

The experiments in this work were performed when the first author was a visiting researcher at Microsoft Research Asia, Media Computing Group. The authors would like to thank the anonymous reviewers for comments that considerably improved the quality of this paper. They are also grateful to Aditya Vailaya for his suggestions regarding the use of the LVQ package. Finally, thanks also go to Xiangrong Chen, Chunhui Hu, Lie Lu, Lie Gu, and Jie Yan for helpful discussions related to this project.

References (27)

  • M.G. Evanoff, K.M. McNeill, Computer recognition of chest image orientation, in: Proc. Eleventh IEEE Symp. on...
  • A.K. Jain et al., Statistical pattern recognition: a review, IEEE Trans. Pattern Anal. Mach. Intell. (2000)