Ultra-Low Energy Approximate Computing for Embedded Machine Learning

by Xuecan Yang

Thesis project in Information, Communications and Electronics

Supervised by Lirida Naviner, Sumanta Chaudhuri and Laurence Likforman-Sulem.

Thesis in preparation at Institut Polytechnique de Paris, within the Ecole Doctorale de l'Institut Polytechnique de Paris, in partnership with LTCI - Laboratoire de Traitement et Communication de l'Information (laboratory), SSH - Secure and Safe Hardware (research team) and Télécom ParisTech (institution where the thesis is prepared), since 01-10-2017.


  • Abstract

    Over 90% of the processing in CNNs consists of convolutions. CNNs comprise a large number of layers with millions of filter weights: for example, AlexNet [5] requires 666 million MACs (multiply-accumulate operations) for a 227x227 input image and 4.6 MB of storage for the filter coefficients. Energy is therefore consumed both in computation and in communication (data movement). Existing approaches [1] try to reduce data movement by using an on-chip systolic array of MACs together with data compression. The stochastic computing paradigm is well suited to CNNs, since multiplication in stochastic computing requires only a single AND gate, so far more MAC operations can be performed on chip at once than with 32-bit binary multipliers. The challenges of stochastic computing are the long bitstream lengths required to reach the desired precision, and storage. We address these issues in this research project. Possible solutions include a hybrid binary/stochastic scheme, efficient stochastic number generators, and converters for efficient storage.
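
    As a minimal illustration of the single-AND-gate multiplication mentioned above, the Python sketch below (our own illustration; the operand values, function names and bitstream lengths are chosen only for this example) encodes two values in [0, 1] as unipolar Bernoulli bitstreams, multiplies them with a bitwise AND, and shows the estimate tightening as the bitstream length grows:

        import numpy as np

        rng = np.random.default_rng(0)

        def to_bitstream(p, n_bits):
            """Unipolar stochastic encoding: each bit is 1 with probability p."""
            return rng.random(n_bits) < p

        def sc_multiply(p, q, n_bits):
            """Stochastic multiplication: bitwise AND of two independent bitstreams.
            The fraction of 1s in the result estimates the product p*q."""
            a = to_bitstream(p, n_bits)
            b = to_bitstream(q, n_bits)
            return np.mean(a & b)

        p, q = 0.8, 0.6  # example operands in [0, 1]
        for n_bits in (16, 256, 4096):
            est = sc_multiply(p, q, n_bits)
            print(f"N={n_bits:5d}  estimate={est:.4f}  exact={p*q:.4f}  error={abs(est - p*q):.4f}")

    The standard deviation of this estimate scales as sqrt(p*q*(1-p*q)/N), which is the long-bitstream precision issue that the hybrid binary/stochastic scheme and efficient converters are intended to mitigate.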

  • Translated title

    Ultra-Low Energy Approximate Computing for Embedded Machine Learning


  • Abstract

    Method:
    a) Precision analysis of stochastic/binary computing primitives: stochastic multipliers and number generators are error prone, so an error model is essential for estimating system-level errors (image recognition error) and for optimization. The goal is to predict CNN classification accuracy from the error models of the primitives.
    b) Design of basic blocks and optimization of the energy-accuracy trade-off: we follow a bottom-up approach in this PhD thesis. The basic blocks, e.g. stochastic stream generators (either pseudo-random or spintronic), stochastic/binary MACs, data compression, and programmable elements, are designed in this step (a software sketch of one such generator follows this outline).
    c) Fabrication and test of a prototype: the final step of the thesis is to fabricate a prototype capable of implementing standard academic benchmarks such as AlexNet [5]. The final goal is to compare its energy efficiency with published research.
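
    As an illustration of one of the basic blocks listed in step b), the sketch below models a conventional LFSR-based stochastic number generator in Python: an 8-bit maximal-length LFSR supplies pseudo-random values, and a comparator turns an 8-bit binary weight into a bitstream whose density of 1s approximates the encoded value. The 8-bit width, tap polynomial and seed are assumptions made for this example only; the actual design may rely on other pseudo-random or spintronic generators.

        import numpy as np

        class LFSR8:
            """8-bit Fibonacci LFSR with taps (8, 6, 5, 4), a maximal-length polynomial."""
            def __init__(self, seed=0x5A):
                assert 0 < seed < 256, "seed must be a non-zero 8-bit value"
                self.state = seed

            def step(self):
                s = self.state
                fb = (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 4)) & 1  # XOR of the tapped bits
                self.state = (s >> 1) | (fb << 7)              # shift right, feed back into the MSB
                return self.state

        def sng(value_8bit, n_bits, seed=0x5A):
            """Stochastic number generator: compare an 8-bit binary value against
            successive LFSR outputs; each comparison produces one bit of the stream."""
            lfsr = LFSR8(seed)
            return np.array([lfsr.step() < value_8bit for _ in range(n_bits)])

        # Hypothetical usage: encode 192/256 = 0.75 and check the stream density.
        value = 192
        stream = sng(value, n_bits=255)  # one full LFSR period
        print(f"target p = {value/256:.4f}, stream density = {stream.mean():.4f}")

    Over one full period the stream density is exactly (value - 1)/255; the pseudo-random sequence is deterministic, which keeps the behaviour reproducible in simulation.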