Low-precision quantization of attention-based neural networks for embedded devices

  • Artificial Intelligence & data intelligence
  • PhD
  • CEA-List
  • Paris – Saclay
  • Level 7
  • 2024-03-01

Deploying artificial intelligence (AI) is a major challenge. In recent years, AI has advanced through increasingly large neural networks and massive data processing. The challenge today is to adapt these methods to run on small embedded components, as close as possible to industrial solutions. The research question addressed here is how to make neural networks as frugal as possible so that they can run on embedded systems. This involves rethinking models to make them much more compact and efficient, using adapted topologies and compression methods, and encoding information in a way that suits inference on embedded targets. More specifically, the candidate will work on neural networks based on the attention mechanism, such as Transformer networks. They will propose new compression methods adapted to these models, based for example on quantization or distillation. The candidate will pay particular attention to hardware compatibility, so that the compressed networks can be embedded on a hardware target. With this in mind, they will propose encodings adapted to hardware targets.

Master's degree or engineering degree in computer science/electronics.

