The goal of this project is to identify, analyze, and evaluate mechanisms for modulating the spatio-temporal sparsity of activation functions in order to minimize the computational load of transformer neural network models, during both training and inference. A combined approach with extreme quantization will also be considered. The aim is to jointly refine an innovative strategy for assessing the impact and potential gains of these mechanisms on model execution under hardware constraints. In particular, this co-design should make it possible to characterize and exploit a bidirectional feedback loop between a target neural network and a hardware instantiation, so as to achieve the best compactness/latency trade-off.
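As a purely illustrative sketch of the kind of mechanisms in scope (not the project's actual method), the hypothetical PyTorch module below combines a tunable top-k activation-sparsity knob with an extreme (ternary) weight quantizer inside a transformer feed-forward block; all names (`SparseQuantFFN`, `sparsity`, `_ternary`) are invented for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseQuantFFN(nn.Module):
    """Transformer FFN block with tunable activation sparsity and
    optional ternary weight quantization (illustrative sketch only)."""

    def __init__(self, d_model: int, d_hidden: int,
                 sparsity: float = 0.9, quantize: bool = True):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.down = nn.Linear(d_hidden, d_model)
        self.sparsity = sparsity   # fraction of hidden activations forced to zero
        self.quantize = quantize   # toggle extreme (ternary) weight quantization

    def _ternary(self, w: torch.Tensor) -> torch.Tensor:
        # Extreme quantization example: weights in {-1, 0, +1} scaled by the
        # mean magnitude; the straight-through estimator keeps gradients flowing.
        scale = w.abs().mean()
        q = torch.sign(w) * (w.abs() > 0.5 * scale).float() * scale
        return w + (q - w).detach()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_up = self._ternary(self.up.weight) if self.quantize else self.up.weight
        h = F.relu(F.linear(x, w_up, self.up.bias))
        # Sparsity modulation: keep only the top-k activations per token, so
        # downstream matmuls can skip the zeroed lanes on suitable hardware.
        k = max(1, int(h.size(-1) * (1.0 - self.sparsity)))
        idx = h.topk(k, dim=-1).indices
        mask = torch.zeros_like(h).scatter_(-1, idx, 1.0)
        h = h * mask
        w_down = self._ternary(self.down.weight) if self.quantize else self.down.weight
        return F.linear(h, w_down, self.down.bias)

x = torch.randn(2, 16, 64)                        # (batch, tokens, d_model)
ffn = SparseQuantFFN(d_model=64, d_hidden=256, sparsity=0.9)
print(ffn(x).shape)                               # torch.Size([2, 16, 64])
```

The `sparsity` parameter stands in for the modulation knob the project proposes to study: sweeping it (and toggling `quantize`) exposes exactly the compactness/latency trade-off space that the hardware feedback loop would explore.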
Candidate profile: PhD in AI for embedded systems, with experience in optimized neural network design targeting digital components.