Abstract
Mirrors are everywhere in our daily lives. Existing computer vision systems do not consider mirrors and hence may be confused by the reflected content inside a mirror, resulting in severe performance degradation. However, separating the real content outside a mirror from the reflected content inside it is non-trivial. The key challenge is that mirrors typically reflect content similar to their surroundings, making it very difficult to differentiate the two. In this article, we present a novel method to segment mirrors from a single RGB image. To the best of our knowledge, this is the first work to address the mirror segmentation problem with a computational approach. We make the following contributions. First, we propose a novel network, called MirrorNet+, for mirror segmentation, which models both contextual contrasts and semantic associations. Second, we construct the first large-scale mirror segmentation dataset, which consists of 4,018 images containing mirrors, each paired with a manually annotated mirror mask, covering a variety of daily-life scenes. Third, we conduct extensive experiments to evaluate the proposed method and show that it outperforms related state-of-the-art detection and segmentation methods. Fourth, we further validate the effectiveness and generalization capability of the proposed semantic-aware contextual contrasted feature learning by applying MirrorNet+ to other vision tasks, i.e., salient object detection and shadow detection. Finally, we present some applications of mirror segmentation and analyze possible future research directions. Project homepage: https://mhaiyang.github.io/TOMM2022-MirrorNet+/index.html.
Index Terms
- Mirror Segmentation via Semantic-aware Contextual Contrasted Feature Learning