As suggested by @avehtari, it would be good to have $R^2$ as a performance statistic in projpred. This could be called `stats = "R2"` (and `stat = "R2"` for `suggest_size()`), for example. According to @avehtari, we should go for LOO-$R^2$.
There is also related code in projpred:

```r
if (stat == "r2") {
  if (!is.null(mu.bs)) {
    y <- mu.bs
  } else {
    y <- d_test$y
  }
  eloo <- mu - y
  n <- length(y)
  rd <- bayesboot::rudirichlet(4000, n)
  vary <- (rowSums(sweep(rd, 2, y^2, FUN = "*")) -
    rowSums(sweep(rd, 2, y, FUN = "*"))^2) * (n / (n - 1))
  vareloo <- (rowSums(sweep(rd, 2, eloo^2, FUN = "*")) -
    rowSums(sweep(rd, 2, eloo, FUN = "*")^2)) * (n / (n - 1))
  looR2 <- 1 - vareloo / vary
  looR2[looR2 < -1] <- -1
  looR2[looR2 > 1] <- 1
  value <- median(looR2)
  value.se <- sd(looR2)
```

(Note that `* (n / (n - 1))` can be omitted because it appears in both `vareloo` and `vary` and therefore cancels out in the ratio `vareloo / vary`.) In those lines, `bayesboot::rudirichlet()` is used. According to @avehtari, the SE could also be calculated without a Dirichlet approach, using the formula from stan-dev/loo#205 (comment).
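
To make the Dirichlet-based computation concrete, here is a minimal base-R sketch of a Bayesian-bootstrap LOO-$R^2$, written under some assumptions. The function name `loo_r2_bb()` and its arguments are hypothetical and not part of projpred. Instead of calling `bayesboot::rudirichlet()`, it draws the Dirichlet(1, ..., 1) weights by normalising i.i.d. Exp(1) variates. It squares the weighted mean in both variance terms, whereas the quoted `vareloo` line applies `^2` inside `rowSums()`; the sketch does not settle whether that placement is intended. It does not implement the non-Dirichlet SE formula from stan-dev/loo#205.

```r
# Hypothetical helper (not projpred API): Bayesian-bootstrap LOO-R^2.
# y      : observed responses
# mu_loo : LOO predictive means for the same observations (assumed given)
loo_r2_bb <- function(y, mu_loo, n_draws = 4000, correct = TRUE) {
  stopifnot(length(y) == length(mu_loo), length(y) > 1)
  n <- length(y)
  eloo <- mu_loo - y
  # Dirichlet(1, ..., 1) weights: normalise i.i.d. Exp(1) draws row-wise.
  g <- matrix(rexp(n_draws * n), nrow = n_draws, ncol = n)
  w <- g / rowSums(g)
  # Weighted variance E_w[x^2] - (E_w[x])^2, one value per weight vector.
  wvar <- function(x) drop(w %*% x^2) - drop(w %*% x)^2
  # The n / (n - 1) factor multiplies numerator and denominator alike.
  fac <- if (correct) n / (n - 1) else 1
  r2 <- 1 - (fac * wvar(eloo)) / (fac * wvar(y))
  r2 <- pmin(pmax(r2, -1), 1)  # clamp to [-1, 1] as in the quoted code
  list(value = median(r2), se = sd(r2), draws = r2)
}

# Usage with simulated stand-ins for the data and the LOO predictions.
set.seed(1)
y <- rnorm(50)
mu_loo <- y + rnorm(50, sd = 0.5)

# Same seed for both calls, so both use identical Dirichlet weights.
set.seed(2)
with_fac <- loo_r2_bb(y, mu_loo, correct = TRUE)
set.seed(2)
without_fac <- loo_r2_bb(y, mu_loo, correct = FALSE)

# Compare the draws with and without the n / (n - 1) factor.
all.equal(with_fac$draws, without_fac$draws)
```

Comparing the two sets of draws with `all.equal()` is one way to check the cancellation claim numerically, up to floating-point tolerance.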
Source of the code quoted above: `projpred/R/summary_funs.R`, lines 170 to 187 in bec6258.