feat(Prophet-coeffs): DS-2469 store prophet coeffs #11
if (!is.null(InputCollect$prophet_custom_output) &&
  !is.null(InputCollect$prophet_custom_output$prophet_coefficients)) {
  prophet_coefs <- InputCollect$prophet_custom_output$prophet_coefficients
  if (!is.null(prophet_coefs) && nrow(prophet_coefs) > 0) {
    write.csv(prophet_coefs, paste0(plot_folder, "prophet_regressor_coefficients.csv"), row.names = TRUE)
  }
}
I believe this can be refactored to:
prophet_coefs <- InputCollect$prophet_custom_output$prophet_coefficients
if (!is.null(prophet_coefs) && nrow(prophet_coefs) > 0) {
  write.csv(
    prophet_coefs,
    file = paste0(plot_folder, "prophet_regressor_coefficients.csv"),
    row.names = TRUE
  )
}
Thanks, changes applied.
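The refactor above relies on `$` returning NULL when applied to NULL, which makes the outer null checks redundant. A minimal sketch of that behaviour, using a hypothetical stand-in list rather than a real InputCollect object:

```r
# Hypothetical stand-in for InputCollect with no prophet output attached.
InputCollect <- list(prophet_custom_output = NULL)

# `$` on NULL yields NULL rather than an error, so the chained access is safe.
prophet_coefs <- InputCollect$prophet_custom_output$prophet_coefficients
is.null(prophet_coefs)

# `&&` short-circuits, so nrow() is never evaluated on NULL.
should_write <- !is.null(prophet_coefs) && nrow(prophet_coefs) > 0
should_write
```

This holds for lists and NULL. If prophet_custom_output were ever an atomic vector, `$` would raise an error instead of returning NULL.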
#! EA START
# Extract prophet regressor coefficients
prophet_coefficients <- NULL
if (!is.null(prophet_model) && !is.null(prophet_model$params) && !is.null(prophet_model$params$beta)) {
  # Get regressor names from extra_regressors
  regressor_names <- names(prophet_model$extra_regressors)

  if (length(regressor_names) > 0 && ncol(prophet_model$params$beta) > 0) {
    # Extract beta coefficients (mean across samples)
    # beta is a matrix: rows are samples, columns are regressors
    beta_matrix <- prophet_model$params$beta

    # Get the column indices for regressors (skip trend, seasonality components)
    # Regressors start after the base components
    n_base_components <- ncol(beta_matrix) - length(regressor_names)
    regressor_indices <- (n_base_components + 1):ncol(beta_matrix)

    if (length(regressor_indices) == length(regressor_names)) {
      # Calculate mean coefficient for each regressor across samples
      regressor_coefs <- colMeans(beta_matrix[, regressor_indices, drop = FALSE])

      # Create data frame with regressor names and coefficients
      prophet_coefficients <- data.frame(
        regressor = regressor_names,
        coefficient = as.numeric(regressor_coefs),
        stringsAsFactors = FALSE
      )
    } else {
      # Fallback: try to match by position or extract all extra regressor columns
      # Prophet stores regressors in extra_regressors, and their coefficients in beta
      # The order should match
      if (ncol(beta_matrix) >= length(regressor_names)) {
        # Take the last columns matching the number of regressors
        start_idx <- ncol(beta_matrix) - length(regressor_names) + 1
        regressor_coefs <- colMeans(beta_matrix[, start_idx:ncol(beta_matrix), drop = FALSE])

        prophet_coefficients <- data.frame(
          regressor = regressor_names,
          coefficient = as.numeric(regressor_coefs),
          stringsAsFactors = FALSE
        )
      }
    }
  }
}
Quite a bit of nested ifs. Try this?
# Returns a data.frame with columns: regressor, coefficient
# or NULL if coefficients/regressors can't be extracted.
extract_prophet_regressor_coefs <- function(prophet_model) {
  # --- Guard clauses (bail early) ---
  if (is.null(prophet_model)) return(NULL)
  if (is.null(prophet_model$params) || is.null(prophet_model$params$beta)) return(NULL)
  if (is.null(prophet_model$extra_regressors)) return(NULL)
  regressor_names <- names(prophet_model$extra_regressors)
  if (length(regressor_names) == 0) return(NULL)
  beta_matrix <- prophet_model$params$beta
  if (is.null(dim(beta_matrix)) || ncol(beta_matrix) == 0) return(NULL)
  # --- Helper to build the output ---
  build_df <- function(coefs) {
    data.frame(
      regressor = regressor_names,
      coefficient = as.numeric(coefs),
      stringsAsFactors = FALSE
    )
  }
  # --- Preferred path: regressors are the last K columns ---
  k <- length(regressor_names)
  if (ncol(beta_matrix) < k) return(NULL)
  # In most Prophet implementations, extra regressors live in the last K beta columns.
  start_idx <- ncol(beta_matrix) - k + 1
  regressor_cols <- start_idx:ncol(beta_matrix)
  coefs <- colMeans(beta_matrix[, regressor_cols, drop = FALSE])
  build_df(coefs)
}
I think prompting a message like the one on line 761 would be helpful.
Yeah that's a good structure, thanks for the suggestion.
I followed through and added some extra stuff. I tried to follow the same format you suggested because I think it is cleaner.
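For reference, the guard-clause pattern with a message on early exit can be tried without fitting anything. The sketch below is an illustration only: mock_model is a hand-built list that mimics the two fields the function reads, not a fitted Prophet model, extract_coefs_sketch is a hypothetical compact variant, and the message text is made up. It also inherits the reviewer's assumption that extra regressors occupy the last K columns of beta.

```r
# Mock object shaped like the fields the function reads; NOT a fitted Prophet model.
mock_model <- list(
  params = list(beta = matrix(
    c(0.1, 0.2, 1.0, 2.0,
      0.3, 0.4, 3.0, 4.0),
    nrow = 2, byrow = TRUE  # rows are samples, columns are components
  )),
  extra_regressors = list(Marathons_1 = list(), Marathons_2 = list())
)

# Compact variant of the guard-clause pattern that reports why it bailed out.
extract_coefs_sketch <- function(model) {
  beta <- model$params$beta
  regressor_names <- names(model$extra_regressors)
  k <- length(regressor_names)
  if (is.null(dim(beta)) || k == 0 || ncol(beta) < k) {
    message("Prophet regressor coefficients could not be extracted.")
    return(NULL)
  }
  # Assumption: extra regressors are the last k columns of beta.
  cols <- (ncol(beta) - k + 1):ncol(beta)
  data.frame(
    regressor = regressor_names,
    coefficient = as.numeric(colMeans(beta[, cols, drop = FALSE])),
    stringsAsFactors = FALSE
  )
}

coefs <- extract_coefs_sketch(mock_model)
print(coefs)
```

With this mock input the two coefficients are the column means of the last two columns, 2 and 3. Passing NULL prints the message and returns NULL.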
Thanks for making the changes! I think we need one more sanity check that all regressors are accounted for. Also, have we checked how Robyn names factors if the values contain underscores or spaces? For example, if the factor values were "Test Name 1", "test-name-2", "TEST_NAME_3"? I just want to make sure we have the name conversion logic down so it doesn't bite us downstream.
In the case the factor variables have an underscore ( |
sallyhong
left a comment
Thanks for addressing all my points for Robyn2.
I think everything should be fine but if we catch something weird in ds-scenario-modeling we can always revisit :)
Sally


SUMMARY
When a Robyn run is executed and factor_vars are specified, a new CSV file (prophet_regressor_coefficients.csv) is created in the Robyn init folder. The motivation for saving the Prophet regressors is the need to handle categorical variables in the Scenario Modeling predict flow, which requires the Prophet regressors of the categorical variables to transform the values.

The regressors are named <variable>_<value>, where <value> is each value of the categorical variable present in the dataset.

Example of prophet_regressor_coefficients.csv with factor_vars = c('Marathons', 'RandomVariable'):

Note that in the original dataset that generated the example, the variable Marathons has the values 0, 1, and 2, and the variable RandomVariable has the values X, Y, and Z. Prophet takes one of the values as the reference and omits it when generating the coefficients.

R/R/inputs.R (outlined by comments: #! EA START, !# EA END).
R/R/outputs.R.

NOTE: The output of these changes was designed for the changes mentioned in this other PR
STORY NUMBER and LINK
DS-2469