Abstract
Hyperspectral cameras generate large amounts of data because they capture hundreds of spectral bands, as opposed to the three channels (red, green, and blue) of traditional cameras. Transmitting this data frame by frame between the hyperspectral image sensor and the processor that classifies, detects, or tracks the images expends high energy and causes bandwidth and security bottlenecks. To mitigate this problem, we propose a form of processing-in-pixel (PIP) that leverages advanced CMOS technologies to enable the pixel array to perform a wide range of complex operations required by modern convolutional neural networks (CNNs) for hyperspectral image (HSI) recognition. Our PIP-optimized custom CNN layers effectively compress the input data, significantly reducing the bandwidth required to transmit the data downstream to the HSI processing unit. This reduces the average energy consumption associated with the camera pixel array and the CNN processing unit by \(25.06\times \) and \(3.90\times \), respectively, compared to existing hardware implementations. In our experiments, the data rates after the sensor ADCs drop by up to \({\sim }10\times \), significantly reducing the complexity of downstream processing. Our custom models yield average test accuracies within \(0.56\%\) of the baseline models on the standard HSI benchmarks.
Notes
1. The weights can also be made programmable by mapping them to emerging resistive non-volatile memory elements embedded within individual pixels.
2. The pixel array energy is equal to the image read-out energy for the baseline models and to the in-pixel convolution energy for the custom models.
3. The energy model for 2D convolutional layers can be extended to linear layers by setting \(k=H_l^o=W_l^o=1\), with \(C_l^i\) and \(C_l^o\) as the number of input and output neurons, respectively.
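The reduction in the last note, from a 2D convolutional layer to a linear layer, can be illustrated with a minimal sketch. It assumes that layer energy is approximated by the multiply-accumulate (MAC) count times a per-MAC energy. The function names and the `e_mac` parameter are hypothetical and are not taken from the paper, whose full energy model may include further terms such as memory accesses.

```python
def conv2d_macs(k: int, c_in: int, c_out: int, h_out: int, w_out: int) -> int:
    """MAC count of a 2D convolutional layer with a k x k kernel.

    k      : kernel size
    c_in   : number of input channels  (C_l^i)
    c_out  : number of output channels (C_l^o)
    h_out  : output feature-map height (H_l^o)
    w_out  : output feature-map width  (W_l^o)
    """
    return k * k * c_in * c_out * h_out * w_out


def linear_macs(n_in: int, n_out: int) -> int:
    """MAC count of a linear layer, obtained as the special case
    k = H_l^o = W_l^o = 1 of the convolutional count."""
    return conv2d_macs(k=1, c_in=n_in, c_out=n_out, h_out=1, w_out=1)


def layer_energy(macs: int, e_mac: float) -> float:
    """Assumed compute-only energy estimate: MAC count times per-MAC energy.

    e_mac is a hypothetical, technology-dependent constant supplied by the caller.
    """
    return macs * e_mac


if __name__ == "__main__":
    # A linear layer with 128 inputs and 10 outputs needs 128 * 10 MACs.
    print(linear_macs(128, 10))
    # A 3x3 convolution, 16 -> 32 channels, on an 8x8 output map.
    print(conv2d_macs(3, 16, 32, 8, 8))
```

With \(k=H_l^o=W_l^o=1\), the convolutional count collapses to \(C_l^i \cdot C_l^o\), which is the number of weights in a fully connected layer, so one counting function serves both layer types.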
Acknowledgements
We would like to acknowledge the DARPA HR00112190120 award for supporting this work. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Datta, G., Yin, Z., Jacob, A.P., Jaiswal, A.R., Beerel, P.A. (2023). Towards Energy-Efficient Hyperspectral Image Processing Inside Camera Pixels. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_22
DOI: https://doi.org/10.1007/978-3-031-25075-0_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science (R0)