Commit 15be805

Better handling of memoise cache across knitted formats
1 parent ee3ef8a commit 15be805

5 files changed, +67 -92 lines changed

11-funcs-adv/.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+# Ignore .rcache directory
+.rcache

11-funcs-adv/.rcache/.gitignore

Lines changed: 0 additions & 6 deletions
This file was deleted.

11-funcs-adv/11-funcs-adv.Rmd

Lines changed: 16 additions & 26 deletions
@@ -7,7 +7,7 @@ author:
 # date: Lecture 11 #"`r format(Sys.time(), '%d %B %Y')`"
 output:
   html_document:
-    theme: flatly
+    theme: journal
     highlight: haddock
     # code_folding: show
     toc: yes
@@ -344,31 +344,30 @@ As expected, this only took (5 $\times$ 2 = ) 10 seconds to generate the new res

 ### Aside 1: Caching across R sessions

-```{r pdf_cache1, echo = FALSE}
-if (knitr::is_latex_output()){
-  message(
-    "Note: The benchmarked timing in this section might not line up with the text if you are viewing
-the PDF version of the lecture notes."
-  )
-}
-```
-
-
 The previous paragraph elides an important caveat: The default `memoise()` cache is only valid for the current R session. You can see this more clearly by exploring the help documentation of the function, where you will note the internal `cache = cache_memory()` argument. To enable caching that persists across sessions --- including when your computer crashes --- you need to specify a dedicated cache directory with `cache = cache_filesystem(PATH)`. This directory can be located anywhere on your system (or, indeed, on a linked cloud storage service) and you can even have multiple cache directories for different projects. My only modest recommendation is that you use a `.rcache/` naming pattern to keep things orderly.

 For example, we can specify a new, persistent memoise cache location for our `slow_square()` function within this lecture sub-directory as follows.

-```{r cache_dir, dependson=slow_square}
-## Cache directory path (which I've already created)
+```{r clear_rcache, include=FALSE, cache=FALSE}
+## Internal code chunk to ensure consistent rcache output when re-knitting
 cache_dir = here("11-funcs-adv/.rcache")
+if (dir.exists(cache_dir)) unlink(cache_dir, recursive = TRUE)
+```
+
+```{r cache_dir, cache=FALSE}
+## Cache directory path
+cache_dir = here("11-funcs-adv/.rcache")
+
+## Create this directory if it doesn't yet exist
+if (!dir.exists(cache_dir)) dir.create(cache_dir)

 ## (Re-)memoise our function with the persistent cache location
 mem_square_persistent = memoise(slow_square, cache = cache_filesystem(cache_dir))
 ```

 Run our new memoised function and check that it saved the cached output to the specified directory.

-```{r m4}
+```{r m4, cache=FALSE}
 m4 = map_df(1:7, mem_square_persistent)

 cached_files = list.files(cache_dir)
@@ -379,21 +378,12 @@ cached_files

 ### Aside 2: Verbose output

-```{r pdf_cache2, echo = FALSE}
-if (knitr::is_latex_output()){
-  message(
-    "Note: The benchmarked timing in this section might not line up with the text if you are viewing
-the PDF version of the lecture notes."
-  )
-}
-```
-
 It's possible (and often very helpful) to add verbose prompts to our memoised functions. Consider the code below, which folds our `mem_square_persistent()` function into two sections:

 1. Check for and load previously cached results. Print the results to screen.
 2. Run our memoised function on any inputs that have not already been evaluated.( These results will be cached in turn for future use.) Again, print the results to screen.

-```{r mem_square_verbose, dependson=mem_square_persistent}
+```{r mem_square_verbose, cache=FALSE}
 mem_square_verbose =
   function(x) {
     ## 1. Load cached data if already generated
@@ -414,15 +404,15 @@ mem_square_verbose =

 And here's an example of the verbose function in action. The output is probably less impressive in a knitted R Markdown document, but I find the real-time feedback to be very informative in a live session. (Try it yourself.)

-```{r m5, dependson=mem_square_verbose, dependson=m4}
+```{r m5, cache=FALSE}
 system.time({
   m5 = map_df(1:10, mem_square_verbose)
 })
 ```

 Finally, albeit probably unnecessary, we can also prove to ourselves that we've added the three new cases (i.e. for `8:10`) to our cache directory by comparing to what we had previously.

-```{r cache_comp}
+```{r cache_comp, cache=FALSE}
 setdiff(list.files(cache_dir), cached_files)
 ```
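
The cache-reset pattern that the `clear_rcache` and `cache_dir` chunks introduce can be sketched in base R as below. This is a minimal illustration, not the lecture's code: a throwaway path under `tempdir()` stands in for `here("11-funcs-adv/.rcache")`, and `reset_cache_dir()` is a hypothetical helper name. No memoise calls are made, so only the directory handling is shown.

```r
## Hypothetical helper (not part of the commit): wipe a cache directory
## and recreate it empty, so every knit starts from the same state.
reset_cache_dir = function(cache_dir) {
  if (dir.exists(cache_dir)) unlink(cache_dir, recursive = TRUE)
  dir.create(cache_dir, recursive = TRUE)
  invisible(cache_dir)
}

## Assumption: a temporary path stands in for the real .rcache directory
cache_dir = file.path(tempdir(), ".rcache")

## Simulate a stale entry left over from an earlier knit
dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
writeLines("stale", file.path(cache_dir, "old_entry"))

## Reset, then list what remains in the cache directory
reset_cache_dir(cache_dir)
cached_files = list.files(cache_dir)
```

Starting each knit from an empty directory is presumably what keeps the file listings consistent across the HTML and PDF outputs, which would explain why the commit can drop the PDF-only timing notes.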

11-funcs-adv/11-funcs-adv.html

Lines changed: 49 additions & 60 deletions
Large diffs are not rendered by default.

11-funcs-adv/11-funcs-adv.pdf

1.1 KB
Binary file not shown.