Study of KAN networks for embedded systems - CEA

Job description:

Context: Kolmogorov-Arnold Networks (KANs) have recently received increased attention [1] as a promising alternative to the traditional Multi-Layer Perceptrons (MLPs) that currently power many deep learning applications. In MLPs, nodes (neurons) apply pre-defined, non-linear activation functions to the weighted sum of their input edges (connections). The linear weights on the connections are adjusted during training. KAN models instead place learnable activation functions on the edges, expressed as combinations of basis functions, for example B-splines. By learning the coefficients of these functions, KANs can achieve highly flexible and expressive activation functions on each connection. KAN neurons then simply sum the values on their input edges.

KAN models have the potential to offer better performance at a smaller model size. The parametrized basis functions are much more expressive than a series of learned linear projections followed by fixed, non-linear activations. Therefore, fewer such expressive functions may be sufficient to construct a model that solves a given problem. Moreover, as the number of basis functions is smaller, KAN solutions can be more explainable than large MLPs.

Nevertheless, KANs also have downsides. As the basis functions are more complex, the literature reports longer training times; furthermore, good hyperparameters for the models are more difficult to identify. Another important issue is the risk of overfitting: because the learned functions are so flexible, they can easily overfit the training dataset.

Internship objectives: In this context, the main question addressed in this internship is: what is the potential of Kolmogorov-Arnold Networks in embedded Artificial Intelligence (AI) systems? Do they really perform better, and are they more efficient in terms of the compute resources necessary for inference? The investigation will be carried out on time-series tasks, such as prediction or detection. A typical candidate application is keyword spotting. Two versions of this application could be studied, evaluated and compared: a KAN-based implementation [2] and a transformer-based implementation [3].

References:
[1] Z. Liu et al., "KAN: Kolmogorov-Arnold Networks," arXiv preprint arXiv:2404.19756 (2024)
[2] A. Xu et al., "Effective Integration of KAN for Keyword Spotting," arXiv preprint arXiv:2409.08605 (2024)
[3] A. Berg, M. O'Connor, and M. T. Cruz, "Keyword Transformer: A Self-Attention Model for Keyword Spotting," arXiv preprint arXiv:2104.00769 (2021)
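
For illustration only, the difference between an MLP layer and a KAN layer described in the context can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the implementation used in [1] or [2]: the names `bspline_basis`, `KANLayer` and `mlp_layer`, the uniform knot grid, the default grid size and the random initialisation are all hypothetical choices, and the fixed base activation term that [1] adds to each edge is omitted for brevity.

```python
import math
import random


def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function of the given degree at x
    using the Cox-de Boor recursion."""
    # Degree 0: indicator function of each half-open knot interval.
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        nxt = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = ((knots[i + d + 1] - x) / right_den * basis[i + 1]
                     if right_den > 0 else 0.0)
            nxt.append(left + right)
        basis = nxt
    return basis


class KANLayer:
    """Each edge (input i -> output j) has its own learnable 1-D function,
    stored as B-spline coefficients. Nodes only sum their incoming edges."""

    def __init__(self, n_in, n_out, grid_size=5, degree=3,
                 x_min=-1.0, x_max=1.0, seed=0):
        rng = random.Random(seed)
        step = (x_max - x_min) / grid_size
        self.degree = degree
        # Uniform grid, extended by `degree` knots on each side (assumption).
        self.knots = [x_min + (i - degree) * step
                      for i in range(grid_size + 2 * degree + 1)]
        self.n_basis = grid_size + degree
        # coef[j][i][k]: coefficient k of the edge function from input i to output j.
        self.coef = [[[rng.gauss(0.0, 0.1) for _ in range(self.n_basis)]
                      for _ in range(n_in)]
                     for _ in range(n_out)]

    def forward(self, x):
        # Basis values depend only on the input, so they are shared by all outputs.
        bases = [bspline_basis(xi, self.knots, self.degree) for xi in x]
        outputs = []
        for edge_coefs in self.coef:
            total = 0.0
            for coefs, basis in zip(edge_coefs, bases):
                total += sum(c * b for c, b in zip(coefs, basis))
            outputs.append(total)
        return outputs


def mlp_layer(x, weights, biases):
    """For contrast: learnable linear weights on edges, fixed tanh on nodes."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]
```

For example, `KANLayer(3, 2).forward([0.1, -0.5, 0.9])` returns a list of two values, one per output node. Inputs are assumed to lie within `[x_min, x_max]`; outside the extended grid every basis function evaluates to zero.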

Institute description: This internship will be hosted by the CEA LIST institute in Grenoble, within the Laboratory of Multi-sensor Integrated Intelligence (LIIM). The lab focuses on developing algorithms for embedded AI (Artificial Intelligence), data fusion and environmental perception for use in cyber-physical systems. It develops software and hardware demonstration platforms that combine these algorithms with innovative technologies, frequently integrated in custom integrated circuits.

Your profile:

Skills sought:
- Final year of engineering school or Master 2 (Bac+5)
- Skills in embedded systems, programming and numerical computing
- Knowledge of artificial intelligence and neural networks
- Programming in Python/C/C++
- Experience with PyTorch

Work plan:
- Familiarization with KANs: state-of-the-art publications and tools, programming environments (based on PyTorch)
- Definition of metrics to estimate the hardware implementation cost
- Proposal of a method to determine the computational cost of existing networks
- Evaluation of KANs and conventional DNNs on a specific problem, based on existing profiling tools
- Formulation of a trade-off between hardware cost and performance for KANs (accuracy, detection error rate, …)
- Writing of the final report and preparation of the internship defense
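
As a possible starting point for the cost metrics in the work plan, the sketch below gives a rough first-order count of parameters and multiply-accumulate operations (MACs) per layer. Everything in it is an assumption made for illustration: the names `LayerCost`, `mlp_layer_cost`, `kan_layer_cost` and `network_cost`, the layer widths, and the counting rules. In particular, the KAN estimate assumes B-spline edges with `grid_size + degree` coefficients each, counts only the `degree + 1` basis functions that are non-zero for a given input, and ignores the cost of evaluating the basis functions themselves. It does not replace measurements with profiling tools.

```python
from dataclasses import dataclass


@dataclass
class LayerCost:
    params: int  # number of learnable parameters
    macs: int    # multiply-accumulate operations for one inference


def mlp_layer_cost(n_in, n_out):
    """Dense layer: one weight per edge plus one bias per output node."""
    return LayerCost(params=n_in * n_out + n_out, macs=n_in * n_out)


def kan_layer_cost(n_in, n_out, grid_size=5, degree=3):
    """KAN layer with one B-spline per edge (assumed parametrisation)."""
    n_basis = grid_size + degree
    params = n_in * n_out * n_basis
    # Only degree + 1 basis functions are non-zero for any given input value.
    macs = n_in * n_out * (degree + 1)
    return LayerCost(params=params, macs=macs)


def network_cost(widths, layer_cost_fn, **kwargs):
    """Sum the per-layer costs of a fully connected stack of layers."""
    total = LayerCost(params=0, macs=0)
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        cost = layer_cost_fn(n_in, n_out, **kwargs)
        total.params += cost.params
        total.macs += cost.macs
    return total


if __name__ == "__main__":
    widths = [40, 64, 12]  # hypothetical layer sizes, chosen only for illustration
    print("MLP:", network_cost(widths, mlp_layer_cost))
    print("KAN:", network_cost(widths, kan_layer_cost, grid_size=5, degree=3))
```

At equal widths this count makes a KAN layer more expensive than a dense layer, so any saving has to come from using a narrower or shallower KAN; quantifying that trade-off against accuracy is what the evaluation is meant to establish.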

Level of qualification studied: Bac+5 - Master 2

Languages: English Intermediate