Changelog
Source:NEWS.md
bolasso 0.3.0
CRAN release: 2024-12-08
New Features
-
Fast Estimation Mode:
-
bolasso()
gains afast
argument which optimizes computation by using a single cross-validated regression on the entire dataset to determine the optimal regularization parameter (lambda
). This approach bypasses the need for cross-validation within each bootstrap replicate, drastically reducing computation time, especially beneficial for large datasets or when using a high number of bootstrap replicates.# Fast mode reduces computation time by using a single cross-validated lambda model_fast <- bolasso( diabetes ~ ., data = train, n.boot = 1000, progress = FALSE, family = "binomial", fast = TRUE )
-
-
Enhanced Variable Selection Methods:
selected_vars()
is now a shorthand forselected_variables()
.-
selected_variables()
/selected_vars()
supports two variable selection algorithms via themethod
argument: the Variable Inclusion Probability (VIP) method and the Quantile (QNT) method. The VIP method selects variables that appear in a high percentage of bootstrap models, while the QNT method selects variables based on bootstrap confidence intervals. Setmethod = "vip"
ormethod = "qnt"
, respectively.# Select variables using the VIP method with a 95% threshold selected_vars_vip <- selected_variables(model, threshold = 0.95, method = "vip") # Select variables using the QNT method selected_vars_qnt <- selected_variables(model, threshold = 0.95, method = "qnt")
- Tidy Method for Bolasso Objects:
-
Variable Selection Visualization:
-
plot_selection_thresholds()
provides a visual representation of the selection thresholds for each variable. This visualization helps users understand the stability and robustness of variable selection across different thresholds and methods.# Visualize selection thresholds for variables plot_selection_thresholds(model, select = "lambda.min")
-
Improvements
-
Plotting Coefficient Distributions:
-
plot_selected_variables()
visualizes the coefficient distributions for only the selected variables. This function provides a focused view of the most relevant variables in the model.# Plot coefficient distributions for selected variables plot_selected_variables( model, threshold = 0.95, method = "vip", select = "lambda.min" )
-
plot
visualizes the coefficient distributions for all model covariates.# Plot coefficient distributions for selected variables plot(model, select = "lambda.min")
-
bolasso 0.2.0
CRAN release: 2022-05-09
- Added a
NEWS.md
file to track changes to the package. -
bolasso()
argumentform
has been renamed toformula
to reflect common naming conventions in R statistical modeling packages. -
predict()
andcoef()
methods are now implemented usingfuture.apply::future_lapply
allowing for computing predictions and extracting coefficients in parallel. This may result in slightly worse performance (due to memory overhead) when the model/prediction data is small but will be significantly faster when e.g. generating predictions on a very large data-set. - Solved an issue with
bolasso()
argumentformula
. The user-supplied value offormula
is handled viadeparse()
which has a defaultwidth.cutoff
value of 60. This was causing issues with formulas by splitting them into multi-element character vectors. It has now been set to the maximum value of500L
which will correctly parse all lengths of formulas. -
predict()
now forces evaluation of theformula
argument in thebolasso()
call. This resolves an issue where, if a user passes a formula via a variable,predict()
would pass the variable name to the underlying prediction function as opposed to the actual formula.