% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/xgb.ggplot.R, R/xgb.plot.importance.R
\name{xgb.ggplot.importance}
\alias{xgb.ggplot.importance}
\alias{xgb.plot.importance}
\title{Plot feature importance as a bar graph}
\usage{
xgb.ggplot.importance(
  importance_matrix = NULL,
  top_n = NULL,
  measure = NULL,
  rel_to_first = FALSE,
  n_clusters = c(1:10),
  ...
)

xgb.plot.importance(
  importance_matrix = NULL,
  top_n = NULL,
  measure = NULL,
  rel_to_first = FALSE,
  left_margin = 10,
  cex = NULL,
  plot = TRUE,
  ...
)
}
\arguments{
\item{importance_matrix}{a \code{data.table} returned by \code{\link{xgb.importance}}.}

\item{top_n}{maximum number of top features to include in the plot.}

\item{measure}{the name of the importance measure to plot.
When \code{NULL}, 'Gain' is used for trees and 'Weight' is used for gblinear.}

\item{rel_to_first}{whether importance values should be represented relative to the highest-ranked feature.
See Details.}

\item{n_clusters}{(ggplot only) a \code{numeric} vector containing the min and the max range
of the possible number of clusters of bars.}

\item{...}{other parameters passed to \code{barplot} (except horiz, border, cex.names, names.arg, and las).}

\item{left_margin}{(base R barplot) adjusts the left margin size to fit feature names.
When it is \code{NULL}, the existing \code{par('mar')} is used.}

\item{cex}{(base R barplot) passed as the \code{cex.names} parameter to \code{barplot}.}

\item{plot}{(base R barplot) whether a barplot should be produced.
If \code{FALSE}, only a \code{data.table} is returned.}
}
\value{
The \code{xgb.plot.importance} function creates a \code{barplot} (when \code{plot=TRUE})
and silently returns a processed \code{data.table} with the \code{top_n} features sorted by importance.

The \code{xgb.ggplot.importance} function returns a ggplot graph which can be customized afterwards.
E.g., to change the title of the graph, add \code{+ ggtitle("A GRAPH NAME")} to the result.
}
\description{
Represents previously calculated feature importance as a bar graph.
\code{xgb.plot.importance} uses base R graphics, while \code{xgb.ggplot.importance} uses the ggplot backend.
}
\details{
The graph represents each feature as a horizontal bar whose length is proportional to the feature's importance.
Features are shown ranked in decreasing order of importance.
It works for importances from both \code{gblinear} and \code{gbtree} models.

When \code{rel_to_first = FALSE}, the values are plotted as they are in \code{importance_matrix}.
For a gbtree model, that means they are normalized to a total of 1
("what is the feature's importance contribution relative to the whole model?").
For linear models, \code{rel_to_first = FALSE} shows the actual values of the coefficients.
Setting \code{rel_to_first = TRUE} shows the picture from the perspective of
"what is the feature's importance contribution relative to the most important feature?"

The ggplot-backend method also performs 1-D clustering of the importance values,
with bar colors corresponding to different clusters that have somewhat similar importance values.
}
\examples{
data(agaricus.train)

# Train a small model so that there is something to compute importance for
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 3,
               eta = 1, nthread = 2, nrounds = 2, objective = "binary:logistic")

importance_matrix <- xgb.importance(colnames(agaricus.train$data), model = bst)

# Base R graphics; extra arguments such as xlab are passed on to barplot
xgb.plot.importance(importance_matrix, rel_to_first = TRUE, xlab = "Relative importance")

# ggplot backend; the returned object can be customized afterwards
(gg <- xgb.ggplot.importance(importance_matrix, measure = "Frequency", rel_to_first = TRUE))
gg + ggplot2::ylab("Frequency")
}
\seealso{
\code{\link[graphics]{barplot}}.
}
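\section{Sketch of the rel_to_first scaling}{
As a rough illustration of the two views described in Details, the base R sketch below rescales a small,
made-up importance table. The data frame \code{imp}, its feature names, and its values are hypothetical,
and dividing by the largest value is an assumption about how "relative to the most important feature"
is computed, not a copy of the package internals.

\preformatted{
# Hypothetical importance table; names and values are made up for illustration.
imp <- data.frame(
  Feature = c("f1", "f2", "f3"),
  Gain = c(0.6, 0.3, 0.1)
)

# rel_to_first = FALSE view: values are used as given.
# For a tree model they would be shares of the whole model.
imp$Gain

# rel_to_first = TRUE view (assumed): divide by the largest value,
# so the top-ranked feature is plotted at 1.
imp$Relative <- imp$Gain / max(imp$Gain)
imp
}

Under that assumption, the top-ranked feature is drawn at 1 and every other bar shows its importance
as a fraction of the top feature's importance.
}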