Original contribution
Weight quantization in Boltzmann machines

https://doi.org/10.1016/0893-6080(91)90077-I

Abstract

Hardware implementations of neural networks promise high computing power at moderate system complexity compared to software simulations. Their design, however, must respect constraints imposed by currently available technology, of which quantization of the interconnection weights is the single most important. In this work, we describe simulations, carried out for neural networks based on the Boltzmann Machine paradigm, of the impact of weight discretization and of the choice of network architecture on performance. Our results show that this type of network is well suited to operation with only a small number of weight levels.
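
The kind of weight discretization discussed in the abstract can be illustrated with a short sketch. This is not the authors' procedure, just a generic uniform quantizer, assuming a NumPy weight matrix `W` and a chosen number of levels:

```python
import numpy as np

def quantize_weights(W, n_levels=7, w_max=None):
    """Map continuous weights to a small, symmetric set of discrete levels.

    A generic uniform quantizer (illustrative only, not the paper's method):
    weights are clipped to [-w_max, w_max] and rounded to the nearest of
    n_levels equally spaced values; an odd n_levels keeps zero as a level.
    """
    if w_max is None:
        w_max = np.max(np.abs(W))          # full-scale range taken from the data
    step = 2.0 * w_max / (n_levels - 1)    # spacing between adjacent levels
    W_clipped = np.clip(W, -w_max, w_max)
    return np.round(W_clipped / step) * step

# Example: quantize a random symmetric weight matrix (as in a Boltzmann
# machine, where w_ij = w_ji and w_ii = 0) to 7 levels, roughly 3-bit weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 16))
W = (W + W.T) / 2.0                        # enforce symmetric connections
np.fill_diagonal(W, 0.0)                   # no self-connections
Wq = quantize_weights(W, n_levels=7)
print(np.unique(Wq).size)                  # at most 7 distinct weight values
```

The number of levels stands in for what a given hardware technology can realize; the paper's point is that Boltzmann-Machine-style networks tolerate such coarse weight resolution well.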


Cited by (34)

  • Pruning and quantization for deep neural network acceleration: A survey

    2021, Neurocomputing
    Citation Excerpt:

    Compressing CNNs by reducing precision values has been previously proposed. Converting floating-point parameters into low numerical precision datatypes for quantizing neural networks was proposed as far back as the 1990s [67,14]. Renewed interest in quantization began in the 2010s when 8-bit weight values were shown to accelerate inference without a significant drop in accuracy [233].

  • Graph Structure Learning-Based Compression Method for Convolutional Neural Networks

    2024, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)