Try to gracefully handle FLASH operations queue becoming full, instead of silently failing and allow OTA DFU process to be recoverable, in case of a communication error in the DUAL_BANK configuration #377
Merged
hathach merged 11 commits into adafruit:master on Feb 2, 2026
…f silently dropping operations
…t to bootloader to allow the DFU app to recover previously interrupted transfers, instead of reverting to the last working firmware and launching it
hathach
reviewed
Jan 31, 2026
lib/sdk11/components/libraries/bootloader_dfu/dfu_single_bank.c
Contributor
Pull request overview
This pull request addresses four critical issues in the bootloader's OTA DFU functionality for nRF52 devices, focusing on improving reliability and recoverability during firmware updates.
Changes:
- Fixed a spelling error (RECIEVED → RECEIVED) in variable names
- Implemented graceful handling of full flash operation queues by adding retry logic with SOC event processing
- Corrected the dual-to-single bank bootloader validation logic that was preventing proper bootloader updates
- Added recovery mechanism for interrupted OTA DFU sessions in dual-bank configurations by tracking connection state and forcing re-entry into bootloader mode on reset
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/usb/uf2/uf2cfg.h | Corrected spelling of BOOTLOADER_ADDR_NEW_RECEIVED macro |
| src/usb/msc_uf2.c | Fixed spelling and initialized restart_into_bootloader flag |
| src/pstorage_platform.h | Increased flash command queue size from 10 to 18 to reduce queue full conditions |
| src/main.c | Added OTA connection tracking and bootloader re-entry logic for failed OTA DFU recovery |
| lib/sdk11/components/libraries/bootloader_dfu/dfu_types.h | Added restart_into_bootloader field to dfu_update_status_t structure |
| lib/sdk11/components/libraries/bootloader_dfu/dfu_single_bank.c | Added queue-full retry logic and fixed bootloader validation to check both banks |
| lib/sdk11/components/libraries/bootloader_dfu/dfu_dual_bank.c | Added queue-full retry logic, fixed bootloader validation, and whitespace cleanup |
| lib/sdk11/components/libraries/bootloader_dfu/bootloader.h | Added bootloader_must_reset_to_self() function declaration and proc_soc extern |
| lib/sdk11/components/libraries/bootloader_dfu/bootloader.c | Implemented bootloader re-entry tracking and queue-full retry logic for settings save |
| Makefile | Added DEFAULT_TO_OTA_DFU configuration option (with incorrect flag syntax) |
| CMakeLists.txt | Added DEFAULT_TO_OTA_DFU CMake option with correct syntax |
…o the user application. Was rarely causing locks
Yes, the '-D' was missing Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
hathach
approved these changes
Feb 2, 2026
Member
hathach
left a comment
Perfect, thank you. I added braces to single-statement ifs (I changed style recently) and renamed ota_was_active to ota_was_connected, which is closer in meaning. Everything else is great. Thank you again for a great follow-up.
This pull request addresses 5 issues that were found while testing:
When queuing a new FLASH erase or write operation, the operation queue was sometimes full, depending on timing. On 'release' builds the operation was then silently discarded, and if the discarded operation was an update of the bootloader configuration, the bootloader could even crash or misbehave.
With this patch, if that condition is detected (the write/erase command queue is completely full), the bootloader waits until space becomes available. This won't fix the case where BLE comms send firmware packets faster than the bootloader is able to flash them (so PRN=8 is still required), but it makes the bootloader behave properly when uploading at high speed.
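The wait-for-space behaviour described above can be sketched in isolation. This is a minimal host-side model, not the bootloader's code: `flash_queue_t`, `flash_enqueue`, `process_soc_events`, `flash_enqueue_blocking`, `QUEUE_CAPACITY` and the error constants are all hypothetical names chosen for illustration, and the "event processing" step simply pretends that one queued flash operation has finished.

```c
#include <stdint.h>

#define QUEUE_CAPACITY 4u   /* hypothetical; small on purpose so the queue fills quickly */
#define ERR_SUCCESS    0u
#define ERR_QUEUE_FULL 1u   /* stand-in for the flash layer's "queue full" error */

typedef struct {
    uint32_t pending;    /* operations queued but not yet executed */
    uint32_t completed;  /* operations that have finished */
} flash_queue_t;

/* Try to queue one erase/write operation; fails when the queue is full. */
uint32_t flash_enqueue(flash_queue_t *q)
{
    if (q->pending >= QUEUE_CAPACITY) {
        return ERR_QUEUE_FULL;
    }
    q->pending++;
    return ERR_SUCCESS;
}

/* Stand-in for SoC event processing: one pending operation completes. */
void process_soc_events(flash_queue_t *q)
{
    if (q->pending > 0u) {
        q->pending--;
        q->completed++;
    }
}

/* Queue an operation, draining events until there is room.
 * The operation is never silently dropped. */
uint32_t flash_enqueue_blocking(flash_queue_t *q)
{
    uint32_t err;
    while ((err = flash_enqueue(q)) == ERR_QUEUE_FULL) {
        process_soc_events(q);
    }
    return err;
}
```

In this model the caller only ever sees `ERR_SUCCESS` from `flash_enqueue_blocking`; the cost is that the caller stalls while the queue drains, which is why the sender's packet rate still matters.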
The other fix concerns an incorrect refactoring suggested by GitHub Copilot. The function dfu_bl_image_validate contained a fix for switching from a dual_bank to a single_bank bootloader. As described in https://devzone.nordicsemi.com/f/nordic-q-a/16590/dfu-change-bootloader-to-single-bank, the problem with updating from a dual-bank to a single-bank bootloader is that the new (single_bank) bootloader expects the firmware image to be at the start of BANK0, while the old (dual_bank) bootloader actually loaded it into BANK1. After the update, the new single-bank bootloader verifies that the image was correctly written to flash by comparing itself with the firmware image it expects to find in BANK0. Since the actual firmware image is in BANK1, this check fails and the new bootloader updates itself, using the MBR, with whatever data is located in BANK0. This repeats in a loop, preventing the user application from being launched, and the only way to recover from that situation is with a JTAG programmer.
The fix is to have dfu_bl_image_validate() run the verification on both addresses (BANK0 and BANK1) and, if either bank contains the bootloader, skip the copy.
Originally, the code read:
return sd_mbr_command(&sd_mbr_cmd_1) && sd_mbr_command(&sd_mbr_cmd_2);
sd_mbr_command() returns 0 (= NRF_SUCCESS) if the comparison matches and a nonzero value if it does not. Because the two results are combined with a logical AND, the expression evaluates to 0 (SUCCESS) as soon as either comparison returns SUCCESS. Copilot wrongly refactored it as if both comparisons had to succeed for the function to return success.
This patch fixes that as well.
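The zero-means-success convention is what makes the `&&` read backwards at first glance. The sketch below models only that logic with plain integers: `validate_either_bank` and `validate_both_banks` are hypothetical helpers, the second is an assumed shape of the incorrect refactoring rather than the actual Copilot suggestion, and no MBR call is made.

```c
#include <stdint.h>

#define SUCCESS  0u  /* mirrors the "0 means match" convention described above */
#define MISMATCH 1u  /* any nonzero value means the comparison failed */

/* Original form: with 0 as success, && yields 0 (success)
 * as soon as either comparison returned 0. */
uint32_t validate_either_bank(uint32_t cmp_bank0, uint32_t cmp_bank1)
{
    return cmp_bank0 && cmp_bank1;
}

/* Assumed shape of the incorrect refactoring: success only
 * when both comparisons succeed. */
uint32_t validate_both_banks(uint32_t cmp_bank0, uint32_t cmp_bank1)
{
    return (cmp_bank0 == SUCCESS && cmp_bank1 == SUCCESS) ? SUCCESS : MISMATCH;
}
```

With the bootloader present only in BANK1, `validate_either_bank(MISMATCH, SUCCESS)` yields `SUCCESS`, while `validate_both_banks(MISMATCH, SUCCESS)` reports a failure, which is the case that triggers the self-update loop.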
When a DUAL_BANK bootloader is in use and an OTA DFU firmware transfer is interrupted or fails (for any reason), the bootloader is left in a "wait for more firmware packets" state. The only way to recover from that state and attempt a new firmware transfer is to send the reset command (which is exactly what the Nordic DFU app does in this situation), and the bootloader is expected to re-enter bootloader mode after the system reset. This is exactly what happens with a SINGLE_BANK bootloader, because the main application firmware has been erased and there is no user application to jump to.
In the DUAL_BANK case, however, there is still a user application (the last working one). Without this patch, the bootloader reset the system and the user application was launched, preventing the Nordic DFU app from completing the firmware transfer. This case is now handled: the bootloader resets and restarts in OTA DFU mode, allowing the OTA DFU to complete successfully.
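The boot decision described above can be reduced to a small sketch. The names `boot_target_t`, `choose_boot_target`, `app_valid` and `ota_interrupted` are hypothetical and do not reproduce the patch's actual flags or functions; the sketch only captures the rule that an interrupted OTA session takes priority over a valid application.

```c
#include <stdbool.h>

typedef enum {
    BOOT_APPLICATION,  /* jump to the user application */
    BOOT_DFU_MODE      /* stay in the bootloader and wait for a transfer */
} boot_target_t;

/* Decide where to go after a reset.
 * app_valid:       a usable application image is present (hypothetical input)
 * ota_interrupted: an OTA DFU session was connected but did not complete
 *                  (hypothetical input) */
boot_target_t choose_boot_target(bool app_valid, bool ota_interrupted)
{
    if (!app_valid) {
        /* Single-bank behaviour: nothing to jump to. */
        return BOOT_DFU_MODE;
    }
    if (ota_interrupted) {
        /* Dual-bank case: an old application exists, but the DFU app
         * needs the bootloader back to retry the transfer. */
        return BOOT_DFU_MODE;
    }
    return BOOT_APPLICATION;
}
```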
Allow the bootloader to default to OTA DFU instead of Serial/UF2 DFU (for field-deployed devices with no user-accessible USB port), so that such devices remain recoverable.
Clear pending interrupts and exceptions before launching the application. Occasionally (it is rare, but it happens) with SoftDevice 6.1.1, the bootloader jumps to the application and the device resets. This seems to be caused by a pending interrupt/exception ... The SDK12+ bootloader clears them before jumping; this bootloader now does the same.
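On a Cortex-M core, the interrupt part of this cleanup amounts to writing all ones to the NVIC interrupt clear-enable (ICER) and clear-pending (ICPR) registers. The sketch below is an assumption about the general technique, not the patch's code: `nvic_regs_t` is a hypothetical stand-in struct so the function can run on a host, whereas firmware would write to the CMSIS `NVIC->ICER[i]` and `NVIC->ICPR[i]` registers directly. Pending system exceptions are not covered here.

```c
#include <stddef.h>
#include <stdint.h>

#define NVIC_BANKS 8u  /* Cortex-M4 defines 8 words each for ICER and ICPR */

/* Hypothetical stand-in for the two NVIC register banks used here.
 * The real register block is not laid out contiguously like this. */
typedef struct {
    volatile uint32_t ICER[NVIC_BANKS];  /* write 1 to disable an interrupt */
    volatile uint32_t ICPR[NVIC_BANKS];  /* write 1 to clear a pending interrupt */
} nvic_regs_t;

/* Disable every interrupt line, then drop anything still pending,
 * so the application starts with a clean NVIC state. */
void irq_disable_and_clear_all(nvic_regs_t *nvic)
{
    for (size_t i = 0; i < NVIC_BANKS; i++) {
        nvic->ICER[i] = 0xFFFFFFFFu;
        nvic->ICPR[i] = 0xFFFFFFFFu;
    }
}
```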
Add the ability to use an external PA/LNA to increase BLE range (disabled by default, configurable from the board.h of the targeted board).