Abstract
Generative models based on deep neural networks often have a high-dimensional latent space, sometimes with a few hundred dimensions or more, which typically makes the space hard for a user to explore directly. We propose differential subspace search to allow efficient iterative user exploration in such a space, without relying on domain- or data-specific assumptions. We develop a general framework to extract low-dimensional subspaces based on a local differential analysis of the generative model, such that a small change within the subspace produces a sufficiently large change in the generated data. We do so by applying singular value decomposition to the Jacobian of the generative model and forming a subspace of the desired dimensionality, spanned by a given number of singular vectors that are selected stochastically on the basis of their singular values to maintain ergodicity. We use our framework to present 1D subspaces to the user via a 1D slider interface. Starting from an initial location, the user finds a new candidate in the presented 1D subspace, and the subspace is then updated at the new candidate location. This process is repeated until no further improvement can be made. Numerical simulations show that our method optimizes synthetic black-box objective functions better than the alternatives that we tested. Furthermore, we conducted a user study using complex generative models, and the results show that our method enables more efficient exploration of high-dimensional latent spaces than the alternatives.
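The search loop described in the abstract can be illustrated with a minimal NumPy sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the toy `generator` (a fixed random linear map followed by `tanh`), the finite-difference Jacobian, sampling singular vectors with probability proportional to their singular values (the abstract says only that selection is stochastic and based on the singular values), the simulated user who picks the slider position closest to a hidden target, and the fixed iteration count used in place of "until no further improvement".

```python
import numpy as np

def numerical_jacobian(f, z, eps=1e-5):
    """Central-difference Jacobian of f at z, shape (data_dim, latent_dim)."""
    cols = []
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = eps
        cols.append((f(z + dz) - f(z - dz)) / (2.0 * eps))
    return np.stack(cols, axis=1)

def sample_directions(jacobian, k, rng):
    """Sample k right singular vectors of the Jacobian without replacement.

    Assumption: selection probability is proportional to the singular value.
    """
    _, s, vt = np.linalg.svd(jacobian, full_matrices=False)
    probs = s / s.sum()
    idx = rng.choice(len(s), size=k, replace=False, p=probs)
    return vt[idx].T  # (latent_dim, k), orthonormal columns

def simulated_user_pick(f, z, direction, target, slider_values):
    """Hypothetical stand-in for the human: pick the best point on the slider.

    The current point z is always a candidate, so the error never increases.
    """
    candidates = [z] + [z + t * direction for t in slider_values]
    errors = [np.linalg.norm(f(c) - target) for c in candidates]
    best = int(np.argmin(errors))
    return candidates[best], errors[best]

rng = np.random.default_rng(0)
latent_dim, data_dim = 16, 64
W = rng.normal(size=(data_dim, latent_dim))

def generator(z):
    """Toy generative model (assumption): latent vector -> data vector."""
    return np.tanh(W @ z)

target = generator(rng.normal(size=latent_dim))  # what the "user" wants
z = rng.normal(size=latent_dim)                  # initial location
initial_error = np.linalg.norm(generator(z) - target)
slider_values = np.linspace(-2.0, 2.0, 41)

error = initial_error
for _ in range(30):  # fixed budget instead of a convergence test
    J = numerical_jacobian(generator, z)
    direction = sample_directions(J, k=1, rng=rng)[:, 0]  # 1D subspace
    z, error = simulated_user_pick(generator, z, direction, target, slider_values)
final_error = error

print(f"error before: {initial_error:.3f}, after: {final_error:.3f}")
```

For a real deep generative model, the Jacobian would more plausibly come from automatic differentiation than from finite differences; the finite-difference version is used here only to keep the sketch free of framework dependencies.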
Index Terms
- Human-in-the-loop differential subspace search in high-dimensional latent space