Abstract
Automated analysis of sperm samples using computer vision has attracted growing interest because manual evaluation is tedious and time-consuming. Deep learning models have been applied to sperm detection, tracking, motility analysis, and morphology recognition. However, the lack of labeled data hinders their adoption in laboratories. In this work, we propose a method to generate synthetic spermatozoa video sequences using Generative Adversarial Imitation Learning (GAIL). Our approach uses a parametric model based on Bézier splines to generate frames of a single spermatozoon. We evaluate our method against U-Net and GAN-based approaches and demonstrate its superior performance.
This research work has been supported by project TED2021-129162B-C22, funded by the Recovery and Resilience Facility program from the NextGenerationEU and the Spanish Research Agency (Agencia Estatal de Investigación); and PID2021-128362OB-I00, funded by the Spanish Plan for Scientific and Technical Research and Innovation of the Spanish Research Agency.
Appendix
In this section, we detail the network architectures.
GAIL. We use a PPO agent as the generator and a CNN as the discriminator.
- PPO is an actor-critic RL agent comprising two neural networks, the actor and the critic. Both networks share the same backbone: two convolutional layers alternated with max-pooling layers, followed by a flatten operation and a hidden dense layer with ReLU activation. The actor network estimates the policy \(\pi_\theta\), so its output layer comprises five dense neurons with tanh activation. The critic network evaluates the actor's decisions, so its output layer has a single neuron with linear activation.
- The discriminator is similar to the critic but with a sigmoid activation at the output. To prevent overfitting, we incorporate dropout after each convolutional and dense layer.
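The three output heads described for GAIL differ only in their size and activation, which the following minimal sketch illustrates in plain Python. It is not the authors' implementation: the hidden width `HIDDEN`, the random weights, and the stand-in feature vectors are assumptions made purely to exercise the heads, and the convolutional backbones are omitted.

```python
import math
import random

def dense(features, weights, biases):
    """Fully connected layer: one pre-activation per row of `weights`."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def random_layer(n_out, n_in, rng):
    """Hypothetical untrained weights, used only to exercise the heads."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
HIDDEN = 16  # assumed width of the hidden dense layer (not given in the text)

# Stand-ins for the ReLU features produced by the convolutional backbones.
agent_features = [max(0.0, rng.uniform(-1.0, 1.0)) for _ in range(HIDDEN)]
disc_features = [max(0.0, rng.uniform(-1.0, 1.0)) for _ in range(HIDDEN)]

actor_head = random_layer(5, HIDDEN, rng)
critic_head = random_layer(1, HIDDEN, rng)
disc_head = random_layer(1, HIDDEN, rng)

# Actor: five tanh outputs, each bounded in (-1, 1).
action = [math.tanh(z) for z in dense(agent_features, *actor_head)]
# Critic: one linear output, an unbounded value estimate.
value = dense(agent_features, *critic_head)[0]
# Discriminator: one sigmoid output, a score in (0, 1).
disc_score = sigmoid(dense(disc_features, *disc_head)[0])
```

The tanh head keeps each of the five action components in a fixed range, whereas the critic's linear head leaves the value estimate unbounded.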
WGAN. It has the conventional architecture.
- The generator has an input layer of size \(100\times 1\) followed by three transposed convolution layers. We alternate these layers with batch normalization and ReLU activation.
- The discriminator has an input layer of size \(28\times 28\). It otherwise follows the architecture of GAIL's discriminator, but with three convolutional layers and a single dense neuron with linear activation at the output.
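As a worked check that three transposed convolutions can map the latent input to a \(28\times 28\) image, recall that a transposed convolution with kernel size \(k\), stride \(s\) and padding \(p\) maps a spatial size \(i\) to \(o\) as given below. The kernel, stride and padding values are hypothetical, since the text does not list them. We also assume that the \(100\times 1\) input is treated as 100 channels at spatial size \(1\times 1\), and that the generator output matches the discriminator input size.

\[
o = (i-1)\,s - 2p + k, \qquad
1 \xrightarrow{\;k=7,\ s=1,\ p=0\;} 7 \xrightarrow{\;k=4,\ s=2,\ p=1\;} 14 \xrightarrow{\;k=4,\ s=2,\ p=1\;} 28 .
\]

Other hyperparameter choices reach the same output size; this chain is only one possibility.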
U-Net. It receives an input tensor of size \(5\times 28\times 28\) corresponding to the five preceding frames, \([s_{t-4},\ldots, s_t]\). It compresses the input tensor to a size of \(64\times 4\times 4\) using four convolutions. The image is then reconstructed by three successive transposed convolutions, resulting in an output tensor of size \(1\times 28\times 28\) that corresponds to the next frame \(s_{t+1}\). All hidden activations are ReLU.
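The tensor sizes quoted for the U-Net can be checked with the standard output-size formulas for convolutions and transposed convolutions. The sketch below traces shapes only. The kernel sizes, strides, paddings and intermediate channel counts in `ENCODER` and `DECODER` are hypothetical, chosen so that the stated shapes come out, and are not taken from the paper.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel, stride, padding):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

# Each layer: (out_channels, kernel, stride, padding). Assumed values.
ENCODER = [(16, 3, 1, 1), (32, 3, 2, 1), (64, 3, 2, 1), (64, 3, 2, 1)]
DECODER = [(32, 3, 2, 1), (16, 4, 2, 1), (1, 4, 2, 1)]

def trace(shape, layers, size_fn):
    """Return the (channels, height, width) shape after every layer."""
    shapes = [shape]
    size = shape[1]  # frames are square, so one spatial size is enough
    for channels, kernel, stride, padding in layers:
        size = size_fn(size, kernel, stride, padding)
        shapes.append((channels, size, size))
    return shapes

# Five stacked input frames [s_{t-4}, ..., s_t], each 28 x 28.
encoder_shapes = trace((5, 28, 28), ENCODER, conv_out)
decoder_shapes = trace(encoder_shapes[-1], DECODER, deconv_out)

for shape in encoder_shapes + decoder_shapes[1:]:
    print(shape)
```

With these assumed settings the spatial size goes 28, 28, 14, 7, 4 through the four convolutions and 7, 14, 28 through the three transposed convolutions.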
CWGAN. We use the same U-Net architecture for the generator and a CNN-based discriminator.
- The generator produces an image for \(s_{t+1}\) conditioned on the input \(s_t\).
- The discriminator has an input layer of size \(6\times 28\times 28\), corresponding to the five preceding frames concatenated with the one predicted by the generator. The architecture is the same as that of WGAN's discriminator.
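The \(6\times 28\times 28\) discriminator input can be pictured as a channel-wise stack of the five context frames and one candidate next frame. The sketch below builds such a stack with nested lists. The helper names are hypothetical, and placing the predicted frame last is an assumption, since the text does not state the channel order.

```python
FRAME_SIZE = 28
CONTEXT_FRAMES = 5

def make_frame(fill):
    """Build a FRAME_SIZE x FRAME_SIZE grayscale frame filled with one value."""
    return [[fill] * FRAME_SIZE for _ in range(FRAME_SIZE)]

def tensor_shape(tensor):
    """Return (channels, height, width) of a nested-list tensor."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))

def discriminator_input(context, predicted):
    """Stack the context frames and the candidate next frame channel-wise."""
    if len(context) != CONTEXT_FRAMES:
        raise ValueError(
            f"expected {CONTEXT_FRAMES} context frames, got {len(context)}")
    return list(context) + [predicted]  # assumed order: context first

# Dummy frames standing in for s_{t-4}, ..., s_t and the generated s_{t+1}.
context = [make_frame(0.1 * t) for t in range(CONTEXT_FRAMES)]
predicted = make_frame(1.0)
stacked = discriminator_input(context, predicted)
```

Feeding the context alongside the candidate frame lets the discriminator judge whether the frame is plausible given the preceding motion, rather than in isolation.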
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Hernández-García, S., Cuesta-Infante, A., Montemayor, A.S. (2023). Synthetic Spermatozoa Video Sequences Generation Using Adversarial Imitation Learning. In: Pertusa, A., Gallego, A.J., Sánchez, J.A., Domingues, I. (eds) Pattern Recognition and Image Analysis. IbPRIA 2023. Lecture Notes in Computer Science, vol 14062. Springer, Cham. https://doi.org/10.1007/978-3-031-36616-1_45
DOI: https://doi.org/10.1007/978-3-031-36616-1_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36615-4
Online ISBN: 978-3-031-36616-1
eBook Packages: Computer Science, Computer Science (R0)