Journal Articles Bioinformatics Year : 2021

Random forest of perfect trees: concept, performance, applications and perspectives

Abstract

Motivation: The principle of Breiman's random forest (RF) is to build and assemble complementary classification trees in a way that maximizes their variability. We propose a new type of random forest that disobeys Breiman's principles and involves building very large numbers of trees with no classification errors. We used a new type of decision tree that uses a neuron at each node as well as an innovative half Christmas tree structure. With these new RFs, we developed a score, based on a family of ten new statistical information criteria called Nguyen information criteria (NICs), to evaluate the predictive qualities of features in three dimensions.

Results: When features were introduced into a logistic regression model, the first NIC allowed the Akaike information criterion to be minimized more quickly than the Gini index did. Features selected on the basis of the NICScore showed a slight advantage over those selected by the support vector machine recursive feature elimination (SVM-RFE) method. We demonstrate that the inclusion of artificial neurons in tree nodes allows a large number of classifiers in the same node to be taken into account simultaneously and results in perfect trees without classification errors.

Availability and implementation: The methods used to build the perfect trees in this article were implemented in the ‘ROP’ R package, archived at https://cran.r-project.org/web/packages/ROP/index.html.

Supplementary information: Supplementary data are available at Bioinformatics online.
Main file: btab074.pdf (345.84 KB)
Origin: Publication funded by an institution

Dates and versions

inserm-03546813, version 1 (28-01-2022)

Cite

Jean-Michel Nguyen, Pascal Jézéquel, Pierre Gillois, Luisa Silva, Faouda Ben Azzouz, et al. Random forest of perfect trees: concept, performance, applications and perspectives. Bioinformatics, 2021, 37 (15), pp.2165-2174. ⟨10.1093/bioinformatics/btab074⟩. ⟨inserm-03546813⟩