Hotfix optout of bake builder in CI #16464
@janbrasna incredible! All of those green tests are great to see :) Thank you for your work on this. We are very grateful.
Thank you so much @janbrasna - above and beyond! You are a star. I'm happy for us to merge this now, with the caveat that support for
stevejalim
left a comment
Thank you again @janbrasna!
This is a good stopgap to keep us moving, and there's hope that Docker resolves the hanging GHA issue before making bake the default behaviour.
If we end up back in the same situation, then moving to a build pattern where we don't use docker compose for building unit test images (nor release images) looks like the way forward. Indeed, we might do well to look at that approach anyway, if there are efficiency gains there.
make clean test-image
CONTAINER_ID=$(docker ps -alq)
docker cp $CONTAINER_ID:/app/python_coverage .
timeout-minutes: 30
No harm in that - our builds are done well before 30 mins
The idea is — if it blows up again, let everyone know by failing the CI while they're still at work, not six hours later when the runner times out.
(Once this gets the container deployed in prod build, I'm porting this over to fxc.)
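The same fail-early idea can be tried outside the workflow file. The following is a minimal sketch, assuming GNU coreutils `timeout` is available (as on Linux runners); `run_with_deadline` is a hypothetical helper name, not something from this repository.

```shell
# Hypothetical helper: run a command under a deadline and fail early,
# instead of waiting for the runner's own much longer timeout.
# Assumes GNU coreutils `timeout`, which exits with status 124 on expiry.
run_with_deadline() {
  deadline=$1   # a duration understood by timeout, e.g. 30m or 10s
  shift
  timeout "$deadline" "$@"
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "command exceeded ${deadline}, failing early" >&2
  fi
  return "$status"
}

# Usage (not executed here):
#   run_with_deadline 30m make clean test-image
```

In the workflow itself, `timeout-minutes: 30` plays this role; the wrapper is only useful for reproducing a hang locally.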
… aaand it's built, incl. gotoprod pipelines.
This is the failing CI:
"bin/docker-compose.sh" build --pull release
#1 [internal] load local bake definitions
#1 reading from stdin 599B done
This is the legacy kicking in now in the green CIs:
"bin/docker-compose.sh" build --pull release
#0 building with "builder-218dd43c-893e-45d1-b3f4-15b44da5b582" instance using docker-container driver
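To tell the two cases apart in a saved log, the markers from the excerpts above can be matched directly. This is a rough sketch: `classify_build_log` is a hypothetical helper, and the marker strings are taken only from the two excerpts quoted here, so other compose versions may print something different.

```shell
# Hypothetical helper: read a build log on stdin and report which
# builder path it appears to show, based on the lines quoted above.
classify_build_log() {
  log=$(cat)
  case "$log" in
    # Check the bake marker first, in case a bake run also prints
    # the generic "building with ... driver" line.
    *'load local bake definitions'*) echo bake ;;
    *'using docker-container driver'*) echo legacy ;;
    *) echo unknown ;;
  esac
}

# Usage (not executed here):
#   classify_build_log < build.log
```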
When I get more time to wrestle with dependency management on the runner itself, I want to confirm with a PoC that just updating to their v2.38 branch resolves this, as confirmation that shipping a GH image update with that version bump would resolve it for everyone.
(I think the current notion is that this issue is already fixed in some of the released v2.38–39 versions, so they are making changes to remove the flag in v2.40 and beyond… But the issue is still open and actively being investigated. If any of the already landed patches mentioned in the thread are confirmed to resolve this, I think it's safe to just ride the trains, as any version that no longer takes this flag is understood to have the fix as well. Confirming that on a very reproducible public STR while the ticket is still open would help the confidence in that.)
Basically just this: bedrock/.github/workflows/pull_request_tests.yml, lines 17 to 19 in aeb6324, gets us green again, if need be.
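For a local reproduction, the opt-out amounts to setting `COMPOSE_BAKE=false` in the environment of the build command. A minimal sketch follows, with `run_without_bake` as a hypothetical wrapper name; it does not reproduce the referenced workflow lines, and the flag is reported as deprecated, so newer compose releases may ignore it.

```shell
# Hypothetical wrapper: run a build command with the bake builder
# opted out via the COMPOSE_BAKE environment variable.
run_without_bake() {
  env COMPOSE_BAKE=false "$@"
}

# Usage (not executed here), with the build command from the CI logs:
#   run_without_bake bin/docker-compose.sh build --pull release
```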
One-line summary
Disable the new behavior until GHA updates compose to a patched version in the runner image and finishes rolling it out to 100% of runners.
Significant changes and points to review
actions/runner-images#12669 made a breaking update that enabled COMPOSE_BAKE by default. Follow-up fixes that are necessary for successful runs have since been released, however the GHA runner images currently deployed ship a version that leaves the build process hanging indefinitely (docker/compose#12998).
We need to disable it until a new runner image ships with compose bumped further. It is believed that an already released version covers this issue, so it should go away here in CI too once GH updates the runner to a more recent tool version. Also note that COMPOSE_BAKE=false is already reported as deprecated, and is to be ignored/removed completely in an upcoming release.
Issue / Bugzilla link
actions/runner-images#12685
(Supersedes #16463)
Testing
/actions/runs/16663214340/job/47164637902 💚