Variable importance with ranger

Question

I trained a random forest using caret + ranger.

fit <- train(
    y ~ x1 + x2
    ,data = total_set
    ,method = "ranger"
    ,trControl = trainControl(method="cv", number = 5, allowParallel = TRUE, verbose = TRUE)
    ,tuneGrid = expand.grid(mtry = c(4,5,6))
    ,importance = 'impurity'
)

Now I'd like to see the variable importance. However, none of these work:

> importance(fit)
Error in UseMethod("importance") :
  no applicable method for 'importance' applied to an object of class "c('train', 'train.formula')"
> fit$variable.importance
NULL
> fit$importance
NULL
> fit
Random Forest
217380 samples
    32 predictors
No pre-processing
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 173904, 173904, 173904, 173904, 173904
Resampling results across tuning parameters:
  mtry  RMSE        Rsquared
  4     0.03640464  0.5378731
  5     0.03645528  0.5366478
  6     0.03651451  0.5352838
RMSE was used to select the optimal model using the smallest value.
The final value used for the model was mtry = 4.

Any idea whether, and how, I can get it?

Thanks.

Tchotchke · Accepted Answer

varImp(fit) will get it for you.

To figure that out, I looked at names(fit), which led me to names(fit$modelInfo); there you'll see varImp as one of the options.
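As a minimal sketch of this approach, the snippet below fits a small model and calls varImp() on it. It uses the built-in iris data rather than the asker's total_set, which is not available here; the 5-fold setup and default tuning grid are assumptions made only for illustration. It requires the caret and ranger packages to be installed.

```r
library(caret)  # provides train(), trainControl() and varImp()

set.seed(123)

# Small stand-in model on iris (an assumption; the question uses total_set).
fit <- train(
  Species ~ .,
  data = iris,
  method = "ranger",
  trControl = trainControl(method = "cv", number = 5),
  importance = "impurity"  # passed through to ranger(); needed for importance to be stored
)

# The model definition that caret used lists varImp among its components.
has_varimp <- "varImp" %in% names(fit$modelInfo)
print(has_varimp)

# varImp() on the train object returns the importance (rescaled by default).
imp <- varImp(fit)
print(imp)
```

Note that varImp() rescales the values by default; varImp(fit, scale = FALSE) should return them on their original scale.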

Polina Mamoshina · Answer

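With the "ranger" package, you can get the importance with the line below. Note that this works when fit is a ranger object; a caret train object keeps the ranger model in fit$finalModel, so there it is fit$finalModel$variable.importance.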

fit$variable.importance

As a side note, you can see all the outputs available for the model using str():

str(fit)
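A short sketch of inspecting a caret fit this way, again on iris as a stand-in dataset (an assumption, as are the 5-fold settings). It shows that the train object itself has no variable.importance element, which matches the NULL in the question, while the underlying ranger model stored in finalModel does.

```r
library(caret)  # the ranger package must also be installed

set.seed(123)

# Stand-in model on iris (an assumption; not the asker's data).
fit <- train(
  Species ~ .,
  data = iris,
  method = "ranger",
  trControl = trainControl(method = "cv", number = 5),
  importance = "impurity"
)

# Top-level components only; the full str(fit) output is very long.
str(fit, max.level = 1)

# The train object has no such element, hence the NULL seen in the question.
top_level <- fit$variable.importance
print(is.null(top_level))

# The fitted ranger model lives in finalModel and carries the raw importances.
raw_imp <- fit$finalModel$variable.importance
print(sort(raw_imp, decreasing = TRUE))
```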

NaNxT · Answer

Per @fmalaussena:

set.seed(123)
ctrl <- trainControl(method = 'cv',
                     number = 10,
                     classProbs = TRUE,
                     savePredictions = TRUE,
                     verboseIter = TRUE)
rfFit <- train(Species ~ .,
               data = iris,
               method = "ranger",
               importance = "permutation", #*** the key argument
               trControl = ctrl,
               verbose = TRUE)

You can pass either "permutation" or "impurity" to the importance argument. A description of both values can be found here: https://alexisperrier.com/datascience/2015/08/27/feature-importance-random-forests-gini-accuracy.html
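To see how the two settings differ in practice, here is a small sketch that calls ranger directly (without caret) on iris and puts both measures side by side. The object names rf_impurity, rf_permutation and comparison are made up for this example. The two measures are on different scales (impurity is based on the Gini index for classification, permutation on the drop in out-of-bag accuracy), so it is the ranking of the variables that is comparable, not the raw numbers.

```r
library(ranger)

set.seed(123)

# Same data and formula, differing only in the importance mode.
rf_impurity    <- ranger(Species ~ ., data = iris, importance = "impurity")
rf_permutation <- ranger(Species ~ ., data = iris, importance = "permutation")

# importance() from ranger returns a named numeric vector, one entry per predictor.
comparison <- data.frame(
  impurity    = importance(rf_impurity),
  permutation = importance(rf_permutation)
)
print(comparison)
```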