Creates an object of class arena_live or arena_static, depending on the live argument.
This function is always the first step in an arenar workflow, and all plot parameters should be specified here.
create_arena(
  live = FALSE,
  N = 500,
  fi_N = NULL,
  fi_B = 10,
  grid_points = 101,
  shap_B = 10,
  funnel_nbins = 5,
  funnel_cutoff = 0.01,
  funnel_factor_threshold = 7,
  fairness_cutoffs = seq(0.05, 0.95, 0.05),
  max_points_number = 150,
  distribution_bins = seq(5, 40, 5),
  enable_attributes = TRUE,
  enable_custom_params = TRUE,
  cl = NULL
)
Argument | Description |
---|---|
live | Defines whether arena should start a live server or generate static JSON |
N | Number of observations used to calculate dependence profiles |
fi_N | Number of observations used in feature importance |
fi_B | Number of permutation rounds to perform for each variable in feature importance |
grid_points | Number of points for profile |
shap_B | Number of random paths in SHAP |
funnel_nbins | Number of partitions for numeric columns for funnel plot |
funnel_cutoff | Threshold for categorical data. Entries less frequent than specified value will be merged into one category in funnel plot. |
funnel_factor_threshold | Numeric columns with fewer unique values than this parameter will be treated as factors in funnel plot. |
fairness_cutoffs | Vector of available cutoff levels for fairness panel |
max_points_number | Maximum sample size used for scatter plots in the variable against another panel |
distribution_bins | Vector of available bin counts for histogram |
enable_attributes | Switch for generating attributes of observations and variables. It is required for custom params. Attributes can increase the size of a static Arena. |
enable_custom_params | Switch that allows the user to modify observations and generate plots for them. |
cl | Cluster used to run parallel computations (does not work in live Arena) |
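As a rough illustration of the arguments above, the sketch below first evaluates the two default grids with base R, then creates a static arena with a lighter configuration. The specific values (N = 200, grid_points = 51, fi_B = 5, shap_B = 5) are arbitrary choices for illustration, not recommendations, and the arenar call is skipped when the package is not installed.

```r
# Default grids from the signature above, evaluated with base R only
fairness_cutoffs <- seq(0.05, 0.95, 0.05)   # cutoff levels offered in the fairness panel
distribution_bins <- seq(5, 40, 5)          # bin counts offered for histograms
print(length(fairness_cutoffs))
print(distribution_bins)

# Hypothetical lighter configuration; the values are illustrative assumptions
if (requireNamespace("arenar", quietly = TRUE)) {
  arena <- arenar::create_arena(
    live = FALSE,      # generate a static arena instead of starting a server
    N = 200,           # fewer observations for dependence profiles
    grid_points = 51,  # coarser profile grid
    fi_B = 5,          # fewer permutation rounds in feature importance
    shap_B = 5         # fewer random paths in SHAP
  )
  print(class(arena))
}
```

Lower values of N, grid_points, fi_B and shap_B reduce the amount of computation, presumably at the cost of coarser or noisier plots.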
An empty object of class arena_static or arena_live.
arena_static:
  explainer: List of used explainers
  observations_batches: List of data frames added as observations
  params: Plots' parameters
  plots_data: List of generated data for plots

arena_live:
  explainer: List of used explainers
  observations_batches: List of data frames added as observations
  params: Plots' parameters
  timestamp: Timestamp of last modification
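One way to check the fields listed above is to inspect a freshly created arena. This is a minimal sketch that assumes the returned object is an ordinary R list carrying a class attribute. It prints whatever structure is actually there rather than assuming exact element names, and it does nothing if arenar is not installed.

```r
# Inspect an empty static arena; runs only when arenar is available
has_arenar <- requireNamespace("arenar", quietly = TRUE)

if (has_arenar) {
  arena <- arenar::create_arena(live = FALSE)

  # Top-level structure: compare with the field list above
  str(arena, max.level = 1)

  # Names of the stored plot parameters
  print(names(arena$params))
}
```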
library("arenar")
library("dplyr", quietly=TRUE, warn.conflicts = FALSE)

# create a model
model <- glm(m2.price ~ ., data=apartments)

# create a DALEX explainer
explainer <- DALEX::explain(model, data=apartments, y=apartments$m2.price)
#> Preparation of a new explainer is initiated
#>   -> model label       :  lm  (  default  )
#>   -> data              :  1000  rows  6  cols
#>   -> target variable   :  1000  values
#>   -> predict function  :  yhat.glm  will be used (  default  )
#>   -> predicted values  :  numerical, min =  1781.848 , mean =  3487.019 , max =  6176.032
#>   -> model_info        :  package stats , ver. 4.0.2 , task regression (  default  )
#>   -> residual function :  difference between y and yhat (  default  )
#>   -> residuals         :  numerical, min =  -247.4728 , mean =  2.093656e-14 , max =  469.0023
#>   A new explainer has been created!

# prepare observations to be explained
observations <- apartments[1:3, ]

# rownames are used as labels for each observation
rownames(observations) <- paste0(observations$construction.year, "-", observations$surface, "m2")

# generate static arena for one model and 3 observations
arena <- create_arena(live=FALSE) %>%
  push_model(explainer) %>%
  push_observations(observations)

print(arena)
#> ===== Static Arena Summary =====
#> Observations: 1953-25m2, 1992-143m2, 1937-56m2
#> Variables: construction.year, surface, floor, no.rooms, district
#> Models: lm
#> Datasets:
#> Plots count: 36