feature/DAT-1729 by michalhuryn-montrose · Pull Request #17 · NYU-Databrary/databraryr

michalhuryn-montrose · 2026-05-12T16:24:55Z

Merge after 1725 and 1728.
A decision on scaffold_session() is needed.

michalhuryn-montrose

@parmatys please respond to the following findings 😄

michalhuryn-montrose · 2026-05-18T11:41:50Z

+  remaining <- partial$input[to_redo]
+  args <- list(...)
+  args[[input_arg]] <- remaining
+
+  new_rows <- do.call(fn, args)


When a resumed run fast-fails, $partial contains only the rows we just retried, not the originals. So anyone who wants to fix-and-resume in a loop can't simply keep passing $partial back in: they'd silently lose the rows that already succeeded the first time. Could we mention that in the README resume section (and in ?resume_bulk), along with the splice one-liner?

Behavior fix: when the nested bulk call fast-fails, we now merge that inner partial into the outer tibble and re-throw databraryr_bulk_error with partial set to the full tibble, so repeatedly passing e$partial back in a loop no longer drops rows that already succeeded.
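For readers following the thread, here is a minimal sketch of the merge-and-re-throw behavior described above. It is not the package's implementation: `splice_partial()`, `bulk_error()`, `resume_sketch()`, and `failing_fn()` are hypothetical names, plain data frames stand in for the tibble, and the sketch assumes each row is uniquely identified by `input` and has `status` and `reason` columns. Only the `databraryr_bulk_error` class name and the `partial` field come from the discussion.

```r
# Sketch only: hypothetical helpers, not databraryr's actual implementation.

# Overwrite rows of `partial` with their retried counterparts, matched on `input`.
splice_partial <- function(partial, new_rows) {
  idx <- match(new_rows$input, partial$input)
  if (anyNA(idx)) stop("new_rows contains inputs that are not in partial")
  for (col in intersect(names(partial), names(new_rows))) {
    partial[[col]][idx] <- new_rows[[col]]
  }
  partial
}

# Build an error condition that carries a `partial` field.
bulk_error <- function(message, partial) {
  structure(
    list(message = message, call = NULL, partial = partial),
    class = c("databraryr_bulk_error", "error", "condition")
  )
}

# Retry every row that is not "ok"; on a fast-fail, re-throw with the full table.
resume_sketch <- function(partial, fn) {
  to_redo <- partial$status != "ok"
  new_rows <- tryCatch(
    fn(partial$input[to_redo]),
    databraryr_bulk_error = function(e) {
      stop(bulk_error(conditionMessage(e), splice_partial(partial, e$partial)))
    }
  )
  splice_partial(partial, new_rows)
}

partial <- data.frame(
  input  = c("a.mp4", "b.mp4", "c.mp4"),
  status = c("ok", "failed", "pending"),
  reason = c(NA, "timeout", NA),
  stringsAsFactors = FALSE
)

# Stand-in bulk function that fast-fails after handling only its first input.
failing_fn <- function(inputs) {
  rows <- data.frame(input = inputs[1], status = "ok", reason = NA_character_,
                     stringsAsFactors = FALSE)
  stop(bulk_error("fast-fail", rows))
}

err <- tryCatch(resume_sketch(partial, failing_fn),
                databraryr_bulk_error = function(e) e)
err$partial  # all three rows: a.mp4 is kept, b.mp4 is updated, c.mp4 is untouched
```

Because the re-thrown condition carries the full table, a caller in this sketch can pass `err$partial` straight back into the next attempt without splicing by hand.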

michalhuryn-montrose · 2026-05-18T11:45:31Z

+preflight_session_duplicates <- function(state, vol_id, session_id, vb, rq) {
+  filenames <- basename(state$input)
+  dupes <- check_duplicate_files_in_session(
+    vol_id = vol_id,
+    session_id = session_id,
+    filenames = filenames,
+    vb = vb,
+    rq = rq
+  )
+  if (is.null(dupes)) {
+    return(state)
+  }
+
+  exists_lookup <- stats::setNames(dupes$exists, dupes$filename)
+  is_dupe <- !is.na(exists_lookup[filenames]) &
+    unname(exists_lookup[filenames])
+  is_dupe[is.na(is_dupe)] <- FALSE
+
+  state$status[is_dupe] <- "skipped"
+  state$reason[is_dupe] <- "duplicate"
+  state
+}
+
+# Internal: duplicate filenames in folder preflight (mirrors session).
+#' @noRd
+preflight_folder_duplicates <- function(state, vol_id, folder_id, vb, rq) {
+  filenames <- basename(state$input)
+  dupes <- check_duplicate_files_in_folder(
+    vol_id = vol_id,
+    folder_id = folder_id,
+    filenames = filenames,
+    vb = vb,
+    rq = rq
+  )
+  if (is.null(dupes)) {
+    return(state)
+  }
+
+  exists_lookup <- stats::setNames(dupes$exists, dupes$filename)
+  is_dupe <- !is.na(exists_lookup[filenames]) &
+    unname(exists_lookup[filenames])
+  is_dupe[is.na(is_dupe)] <- FALSE
+
+  state$status[is_dupe] <- "skipped"
+  state$reason[is_dupe] <- "duplicate"
+  state
+}


Aside from the call to check_duplicate_files_in_session vs check_duplicate_files_in_folder and the name of the id argument, these two are line-for-line identical. If we tweak the "duplicate detection" semantics later, say to also stash the existing asset id in reason, we'd have to remember to change both. Could we collapse them into a single preflight_duplicates(state, checker, ...) that takes the check function as an argument, and have bulk_files.R pick the right checker? What do you think?

Collapsed the two helpers into preflight_duplicates(state, checker, ...) and wired bulk_upload_files to pass the appropriate checker.
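The merged code is not quoted in this thread, so the following is only a sketch of what a collapsed helper could look like, based on the two functions in the diff above. The signature `preflight_duplicates(state, checker, ...)` is taken from the discussion; the body, the way `...` is forwarded, and `fake_checker()` are assumptions for illustration.

```r
# Sketch only: one possible shape for the collapsed helper, not the merged code.
preflight_duplicates <- function(state, checker, ...) {
  filenames <- basename(state$input)
  # `checker` is assumed to return NULL or a data frame with
  # `filename` and `exists` columns, like the two original checkers.
  dupes <- checker(filenames = filenames, ...)
  if (is.null(dupes)) {
    return(state)
  }

  exists_lookup <- stats::setNames(dupes$exists, dupes$filename)
  is_dupe <- unname(exists_lookup[filenames])
  is_dupe[is.na(is_dupe)] <- FALSE  # filenames the checker did not report

  state$status[is_dupe] <- "skipped"
  state$reason[is_dupe] <- "duplicate"
  state
}

# Hypothetical stand-in for check_duplicate_files_in_session(); no network calls.
fake_checker <- function(filenames, vol_id, session_id, ...) {
  data.frame(
    filename = filenames,
    exists = filenames %in% "a.mp4",
    stringsAsFactors = FALSE
  )
}

state <- data.frame(
  input  = c("/tmp/a.mp4", "/tmp/b.mp4"),
  status = c("pending", "pending"),
  reason = NA_character_,
  stringsAsFactors = FALSE
)

out <- preflight_duplicates(state, fake_checker, vol_id = 1, session_id = 2)
out$status  # "skipped" "pending"
```

Passing the id arguments through `...` keeps the helper unaware of whether it is checking a session or a folder, which is the point of the refactor suggested above.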

michalhuryn-montrose · 2026-05-19T07:06:38Z

LGTM

feat: add bulk operations for managing records, folders, and sessions in Databrary

31bbc5f

parmatys force-pushed the feature/DAT-1729 branch from 2f9ed25 to 31bbc5f · May 18, 2026 10:33

parmatys marked this pull request as ready for review May 18, 2026 10:42

docs: expand README and Rmd to include detailed documentation on bulk operations and error handling

3246a33

docs: enhance README with looping fix-and-resume example for bulk operations; refactor duplicate checking functions

6bc7fcd

michalhuryn-montrose merged commit 316cd0e into development May 19, 2026
7 checks passed

michalhuryn-montrose merged 3 commits into development from feature/DAT-1729