
A 73.8k-Inference/mJ SVM Learning Accelerator for Brain Pattern Recognition



Abstract:

Machine learning (ML) has been widely adopted in neural signal processing, and the support vector machine (SVM) stands out for its efficacy given limited training data. The constrained battery capacity of implanted devices necessitates a dedicated accelerator with high energy efficiency. This work presents an energy-efficient SVM learning accelerator for brain pattern recognition. By employing the cluster-partitioning SVM (CP-SVM) algorithm, this work achieves up to 99% and 91% latency reductions for training and inference, respectively, compared to conventional SVMs. Efficient hardware mapping is achieved through algorithm and architecture co-optimizations. Kernel transformation reduces the processing element (PE) array's hardware complexity by 42%. Sparsity-aware skipping eliminates redundant computations, further reducing latency. The design space of the PE array is explored to minimize hardware cost, and data scheduling improves PE utilization. Overall, the processing latency of the PE array is reduced by 96%. In the PE array implementation, a chained interconnect reduces the area of the data exchanger by 93%, and integrating multiple sorters into one cross-cluster sorter reduces the sorter area by 52%. Fabricated in a 40-nm CMOS technology, the proposed SVM learning processor dissipates 9.68 mW at 40 MHz from a 0.85-V supply. The chip achieves an energy efficiency of 73.8k inference/mJ and 811 training/mJ, exceeding prior art by over 3.4× and 6.9×, respectively. It also delivers an area efficiency of 510k inference/s/mm² and 5.6k training/s/mm², outperforming the state of the art by over 19.3× and 40.9×, respectively.
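The sparsity-aware skipping described above is a hardware technique inside the PE array, but its core idea can be sketched in software. The following Python snippet is a minimal, illustrative sketch, not the authors' implementation: it evaluates an RBF-kernel SVM decision function while skipping zero-weight support vectors and zero-valued input features, so only nonzero terms are ever computed. All function and variable names are hypothetical.

```python
import numpy as np

def svm_inference_sparse(x, support_vectors, alphas, labels, bias, gamma=1.0):
    """Illustrative kernel-SVM inference with sparsity-aware skipping."""
    # Only iterate over the nonzero input features when accumulating
    # the squared distance; zero features are skipped entirely.
    nz = np.flatnonzero(x)
    score = bias
    for sv, alpha, y in zip(support_vectors, alphas, labels):
        if alpha == 0.0:
            # Non-support vectors contribute nothing; skip them outright.
            continue
        # ||x - sv||^2 = ||sv||^2 + sum over nonzero j of
        #                ((x[j] - sv[j])^2 - sv[j]^2)
        dist2 = float(np.dot(sv, sv))
        for j in nz:
            dist2 += (x[j] - sv[j]) ** 2 - sv[j] ** 2
        score += alpha * y * np.exp(-gamma * dist2)
    return 1 if score >= 0 else -1

# Toy example: two support vectors in a 4-D feature space.
svs = np.array([[1.0, 0.0, 2.0, 0.0],
                [0.0, 1.0, 0.0, 3.0]])
print(svm_inference_sparse(np.array([0.5, 0.0, 2.0, 0.0]), svs,
                           alphas=np.array([0.7, 0.3]),
                           labels=np.array([1, -1]), bias=0.1))
```

In hardware, the same skip decisions are made per PE cycle rather than per loop iteration, which is how the accelerator turns input and model sparsity into the latency reduction reported in the abstract.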
Published in: IEEE Journal of Solid-State Circuits (Volume 59, Issue 10, October 2024)
Pages: 3357-3365
Date of Publication: 18 June 2024
