Multi-layers CNNs for 3D Model Retrieval

  • Conference paper
Internet Multimedia Computing and Service (ICIMCS 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 819)

Abstract

With the rapid development of 3D scanners and improved visual processing techniques, the number of 3D models captured and uploaded by users has grown sharply, and 3D model retrieval has become an active topic in computer vision. State-of-the-art methods use CNNs to address this problem, but existing CNN architectures and approaches do not fully exploit the information in 3D representations. To improve the performance of 3D object retrieval, we propose a multi-layer CNN (MLCNN) structure for 3D model representation. First, we combine the 12 rendered views of a 3D object into one representative view, which serves as the network input. Second, to preserve both global and local information for each 3D model, we compress the features of every convolutional layer with a simple PCA and aggregate them into a multi-layer descriptor. Finally, the Euclidean distance between descriptors is used to measure the similarity between two 3D models and perform retrieval. Comparative experiments demonstrate the superiority of our approach.
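The descriptor and retrieval steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the 12 views are merged or how convolutional feature maps are turned into vectors, so the sketch assumes each layer already yields one flat feature vector per model. The random arrays, the function names `pca_compress`, `multilayer_descriptor`, and `retrieve`, and the choice of 4 components per layer are all hypothetical.

```python
import numpy as np


def pca_compress(features, n_components):
    """Project the rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def multilayer_descriptor(layer_features, n_components):
    """Concatenate PCA-compressed per-layer features into one descriptor per model."""
    compressed = [pca_compress(f, n_components) for f in layer_features]
    return np.concatenate(compressed, axis=1)


def retrieve(descriptors, query_index):
    """Return model indices sorted by Euclidean distance to the query, nearest first."""
    distances = np.linalg.norm(descriptors - descriptors[query_index], axis=1)
    return np.argsort(distances, kind="stable")


# Hypothetical stand-in data: 6 models, 3 "layers" with different feature sizes.
# In a real pipeline these would be features taken from CNN convolutional layers.
rng = np.random.default_rng(0)
layer_features = [rng.normal(size=(6, d)) for d in (64, 128, 256)]

descriptors = multilayer_descriptor(layer_features, n_components=4)
ranking = retrieve(descriptors, query_index=0)
```

Here the PCA basis is fitted on the same models that are then ranked, which keeps the sketch short; a real system would more likely fit the projection on a separate training set and reuse it for queries.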

Notes

  1. http://modelnet.cs.princeton.edu/

Author information

Corresponding author

Correspondence to Weizhi Nie.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liu, A., Xiang, S., Nie, W., Su, Y. (2018). Multi-layers CNNs for 3D Model Retrieval. In: Huet, B., Nie, L., Hong, R. (eds) Internet Multimedia Computing and Service. ICIMCS 2017. Communications in Computer and Information Science, vol 819. Springer, Singapore. https://doi.org/10.1007/978-981-10-8530-7_36

  • DOI: https://doi.org/10.1007/978-981-10-8530-7_36

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-8529-1

  • Online ISBN: 978-981-10-8530-7

  • eBook Packages: Computer Science, Computer Science (R0)
