Preprint / Working Paper, Year: 2024

Parallelization of Recurrent Neural Network training algorithm with implicit aggregation on multi-core architectures

Abstract

Recent work has shown that deep learning algorithms are efficient for various tasks, whether in Natural Language Processing (NLP) or in Computer Vision (CV). One particularity of these algorithms is that their efficiency grows with the amount of data used for training. However, sequential execution of these algorithms on large amounts of data can take a very long time. In this paper, we consider the problem of training a Recurrent Neural Network (RNN) for the task of detecting hateful (aggressive) messages. We first compared the sequential execution of three RNN variants and showed that Long Short-Term Memory (LSTM) provides better metric performance but requires longer execution time than the Gated Recurrent Unit (GRU) and the standard RNN. To obtain both good metric performance and reduced execution time, we implemented the training algorithms in parallel. We propose a parallel algorithm based on an implicit aggregation strategy, in contrast to the existing approach, which relies on an explicit aggregation function. We show that the convergence of the proposed parallel algorithm is close to that of the sequential algorithm. Experimental results on a 32-core machine at 1.5 GHz with 62 GB of RAM show that our parallelization strategy yields better results. For example, with an LSTM on a dataset of more than 100k comments, we obtained an f-measure of 0.922 and a speedup of 7 with our approach, compared to an f-measure of 0.874 and a speedup of 5 with explicit aggregation between workers.
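The abstract does not spell out the training algorithm, but the contrast between the two strategies can be illustrated with a minimal sketch. The example below is not the authors' implementation: it assumes that "explicit aggregation" means each worker trains a private copy of the parameters that are later combined by an aggregation function (here, averaging), and that "implicit aggregation" means all workers update a single shared parameter buffer directly (Hogwild!-style), so no separate aggregation step is needed. The toy model is a linear regressor trained with SGD, and names such as sgd_shard, train_implicit, and train_explicit are illustrative only.

    import numpy as np
    from multiprocessing import Process, Array

    def sgd_shard(weights, X, y, lr=0.01, epochs=5):
        # Plain SGD on one data shard. `weights` is either a private NumPy array
        # or a shared multiprocessing buffer viewed through NumPy.
        w = weights if isinstance(weights, np.ndarray) else np.frombuffer(weights, dtype=np.float64)
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                grad = 2.0 * (w @ xi - yi) * xi   # gradient of the squared error
                w -= lr * grad                    # in-place: visible to all workers if shared

    def train_implicit(shards, dim):
        # "Implicit aggregation": every worker writes into the same lock-free
        # shared buffer, so the model is combined as a side effect of training.
        shared = Array('d', dim, lock=False)      # races tolerated, as in Hogwild!
        procs = [Process(target=sgd_shard, args=(shared, X, y)) for X, y in shards]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return np.frombuffer(shared, dtype=np.float64).copy()

    def train_explicit(shards, dim):
        # "Explicit aggregation": each worker trains a private copy; an explicit
        # aggregation function (here, averaging) combines them afterwards.
        # (In a real run each copy would be trained in its own process.)
        copies = []
        for X, y in shards:
            w = np.zeros(dim)
            sgd_shard(w, X, y)
            copies.append(w)
        return np.mean(copies, axis=0)            # the aggregation function

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        true_w = np.array([2.0, -1.0, 0.5])
        X = rng.normal(size=(2000, 3))
        y = X @ true_w
        shards = [(X[i::4], y[i::4]) for i in range(4)]   # 4 workers, 4 data shards
        print("implicit aggregation:", train_implicit(shards, dim=3))
        print("explicit aggregation:", train_explicit(shards, dim=3))

Both routines recover parameters close to true_w on this toy problem; the difference is only in how worker updates are combined, which is the design choice the paper's parallel RNN training explores at scale.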
Main file: ARIMA_messi_nzekon_onana_11_04_2024.pdf (506.21 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04542984, version 1 (11-04-2024)

License

Attribution

Identifiers

  • HAL Id: hal-04542984, version 1

Cite

Thomas Messi Nguelé, Armel Jacques Nzekon Nzeko'o, Damase Donald Onana. Parallelization of Recurrent Neural Network training algorithm with implicit aggregation on multi-core architectures. 2024. ⟨hal-04542984⟩