ABSTRACT
This tutorial addresses advances in deep Bayesian mining and learning for natural language, with ubiquitous applications ranging from speech recognition to document summarization, text classification, text segmentation, information extraction, image caption generation, sentence generation, dialogue control, sentiment classification, recommendation systems, question answering, and machine translation, to name a few. Traditionally, "deep learning" is taken to be a learning process where the inference or optimization is based on a real-valued deterministic model. The "semantic structure" in words, sentences, entities, actions, and documents drawn from a large vocabulary may not be well expressed or correctly optimized in mathematical logic or computer programs. The "distribution function" in a discrete or continuous latent variable model for natural language may not be properly decomposed or estimated. This tutorial covers the fundamentals of statistical models and neural networks, and focuses on a series of advanced Bayesian and deep models, including the hierarchical Dirichlet process, Chinese restaurant process, hierarchical Pitman-Yor process, Indian buffet process, recurrent neural network, long short-term memory, sequence-to-sequence model, variational auto-encoder, generative adversarial network, attention mechanism, memory-augmented neural network, skip neural network, stochastic neural network, predictive state neural network, and policy neural network. We present how these models are connected and why they work for a variety of applications on symbolic and complex patterns in natural language. Variational inference and sampling methods are formulated to tackle the optimization of complicated models. Word and sentence embeddings, clustering, and co-clustering are merged with linguistic and semantic constraints. A series of case studies is presented to tackle different issues in deep Bayesian mining, learning, and understanding.
Finally, we point out a number of directions and outlooks for future study.
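Variational inference, which the tutorial formulates to tackle the optimization of complicated models such as the variational auto-encoder, maximizes an evidence lower bound whose regularizer is a KL divergence between the approximate posterior and the prior. As a minimal sketch (function names are ours, not from the tutorial), the closed-form Gaussian KL term and the reparameterization trick can be written in plain Python:

```python
import math
import random

def gaussian_kl(mu, log_var):
    """Closed-form KL(N(mu, diag(sigma^2)) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I), so the sample is a
    deterministic, differentiable function of mu and log_var."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

# The KL term vanishes when the approximate posterior equals the prior N(0, I)
kl_zero = gaussian_kl([0.0, 0.0], [0.0, 0.0])  # → 0.0
# ... and is positive otherwise, penalizing posteriors far from the prior
kl = gaussian_kl([0.0, 0.5], [0.0, -1.0])
z = reparameterize([0.0, 0.5], [0.0, -1.0], random.Random(0))
```

In a full variational auto-encoder this KL term would be combined with an expected reconstruction log-likelihood to form the ELBO; the sketch above only illustrates the two pieces that make stochastic gradient optimization possible.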
Index Terms
- Deep Bayesian Mining, Learning and Understanding