Skip to content

Update semver logic: one patch version per checklist#7458

Merged
timszot merged 11 commits into
mainfrom
Rory-OnePatchVersionPerChecklist
Feb 1, 2022
Merged

Update semver logic: one patch version per checklist#7458
timszot merged 11 commits into
mainfrom
Rory-OnePatchVersionPerChecklist

Conversation

@roryabraham

@roryabraham roryabraham commented Jan 29, 2022

Copy link
Copy Markdown
Contributor

Details

This changes the deploy paradigm so that we no longer bump the PATCH version or run a staging deploy when a checklist is locked. Instead, we bump the PATCH version when the checklist is first created.

Fixed Issues

$ #7166

Tests

I tested commands locally and added automated unit tests where possible.

  • Verify that no errors appear in the JS console

QA Steps

  1. Close the deploy checklist.

  2. The app should have it's PATCH version bumped, and a staging deploy should occur./

  3. The new checklist should be created with the new PATCH version.

  4. Lock the checklist when there is no staging deploy running.

  5. The awaitStagingDeploys action should complete quickly, and OSBotify should comment on the deploy checklist:

    🚀 All staging deploys are complete, @Expensify/applauseleads please begin QA on version https://github.com/Expensify/App/releases/tag/new-build-version 🚀

  6. Unlock the checklist and merge another PR.

  7. While the staging deploy for the other PR is running, lock the checklist again.

  8. The awaitStagingDeploys action should poll the GH API every 90 seconds until the staging deploy completes.

  9. Once the staging deploy completes, the action should complete and OSBotify should comment on the deploy checklist with the name comment as above, using the new build version.

  • Verify that no errors appear in the JS console

@roryabraham roryabraham added the InternalQA This pull request required internal QA label Jan 29, 2022
@roryabraham roryabraham self-assigned this Jan 29, 2022
@roryabraham

Copy link
Copy Markdown
Contributor Author

Planned changes to this SO:

New Expensify uses Github Actions for CI/CD. At a high level, the process works like this:

1) When a PR is merged to main, a new `BUILD` version is created (`1.0.2-72` becomes `1.0.2-73`) and it is typically deployed to staging immediately.
2) As PRs are deployed, they are accumulated in a special tracking issue in Expensify/App labeled `StagingDeployCash`. That issue contains:
    1) The latest version on staging
    2) A link to a comparison between the latest staging version and the previous production version.
    3) A checklist of all PRs that were merged and deployed to staging since the last production deploy
    4) A checklist of any deploy blockers found by Applause during QA.
+3) When Applause is ready to begin QA, they add the `🔐LockCashDeploys🔐` label to the `StagingDeployCash`. That will prevent any newly merged PRs from being deployed to staging.
-3) When Applause is ready to begin QA, they add the `🔐LockCashDeploys🔐` label to the `StagingDeployCash`. That will:
-    1) Create a new `PATCH` version (`1.0.2-73` becomes `1.0.3-0`)
-    2) Prevent any newly merged PRs from being deployed to staging.
4) If issues are found on staging that are not present on production, they become deploy blockers.
    1) Any issue or PR labelled `DeployBlockerCash` will be auto-assigned to an Expensify engineer as an `Hourly`.
    2) To fix a `DeployBlocker`, there are two options:
        1) Create a PR to fix the deploy blocker, and give it the `CherryPickStaging` label. When merged, that PR will be CP'd to staging.
        2) Remove the lock label from the `StagingDeployCash` checklist, and "open the floodgates" by deploying the main branch to staging. This is the nuclear option that deploys any and all new code to staging and invalidates the current QA cycle.
4) When Applause finishes QA, one of two things can happen:
    1) If there are issues found during QA, they must wait for Expensify engineers to fix the deploy blockers and proceed with the production deploy.
    2) If there are no issues found during QA, the assigned mobile-deployer for the week will close the `StagingDeployCash` issue with a comment that includes the `:shipit:` emoji. That will:
        1) Deploy the latest staging version to production
-       2) Create a new `StagingDeployCash` checklist with all the PRs that were merged while the `StagingDeployCash` was locked, thus restarting the process.
+       2) Create a new `StagingDeployCash` checklist with all the PRs that were merged while the `StagingDeployCash` was locked, thus restarting the process. At this time, we will bump the `PATCH` version (`1.0.2-73` becomes `1.0.3-0`), so each checklist is associated with a unique `PATCH` version.

Applause does daily QA Monday through Friday, so we'll do a production deploy Monday trough Thursday when no issues are found during QA. As per company policy, we typically don't do production deploys on Fridays, but the deployer can decide to do a deploy regardless [if nedeed](https://stackoverflow.com/c/expensify/questions/10635).


**Note:** There is currently no process to cherry-pick hotfixes to production.


function run() {
let currentStagingDeploys = [];
return promiseDoWhile(

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW our async/await ban really kills me here, in my opinion this would be much cleaner written like so:

const _ = require('underscore');
const GitHubUtils = require('../../libs/GithubUtils');

async function run() {
    let currentStagingDeploys = [];
    do {
        const response = await GitHubUtils.octokit.actions.listWorkflowRuns({
            owner: GitHubUtils.GITHUB_OWNER,
            repo: GitHubUtils.APP_REPO,
            workflow_id: 'platformDeploy.yml',
            event: 'push',
        });
        currentStagingDeploys = _.filter(response.data.workflow_runs, workflowRun => workflowRun.status !== 'completed');
        
        console.log(
            _.isEmpty(currentStagingDeploys)
                ? 'No current staging deploys found'
                : `Found ${currentStagingDeploys.length} staging deploy${currentStagingDeploys.length > 1 ? 's' : ''} still running...`,
        );
        
        await new Promise(resolve => setTimeout(resolve, GitHubUtils.POLL_RATE * 9))
    } while(!_.isEmpty(currentStagingDeploys));
}

if (require.main === module) {
    run();
}

module.exports = run;

@roryabraham roryabraham marked this pull request as ready for review January 31, 2022 20:08
@roryabraham roryabraham requested a review from a team as a code owner January 31, 2022 20:08
@MelvinBot MelvinBot requested review from timszot and removed request for a team January 31, 2022 20:09
@roryabraham roryabraham requested a review from a team January 31, 2022 20:10
@roryabraham

Copy link
Copy Markdown
Contributor Author

This is a pretty large change so I'm going to pull in a second reviewer here.

@MelvinBot MelvinBot requested review from pecanoro and removed request for a team January 31, 2022 20:10
@roryabraham

Copy link
Copy Markdown
Contributor Author

Also, a note for @timszot and @pecanoro – all the index.js files in this PR are compiled files and so should be ignored in reviews.

pecanoro
pecanoro previously approved these changes Jan 31, 2022

@pecanoro pecanoro left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good but I left a comment of something I don't fully understand

Comment thread .github/workflows/lockDeploys.yml Outdated
@roryabraham

Copy link
Copy Markdown
Contributor Author

@pecanoro's question led me to realize there was another potential race condition hiding here that's now solved. Before my last two commits, this was possible:

  1. Merge a pull request.
  2. Checklist is not yet locked, so we start bumping the app version and getting ready to start a staging deploy in preDeploy.yml. It takes a couple minutes to bump the app version and update the staging branch due to delays starting workflows with the triggerWorkflowAndWait action.
  3. Checklist is locked.
  4. Before the staging deploy from step 2 actually begins, the lockDeploys workflow runs the awaitStagingDeploys action, and finds no active platformDeploy.yml staging deploys, so it proceeds and tells Applause to begin QA.
  5. After that, step 2 completes and a new staging deploy begins. This is bad because Applause may have already started QA 💥

To resolve this, awaitStagingDeploys now waits for all platformDeploy.yml and preDeploy.yml workflow runs to complete before proceeding.

@roryabraham roryabraham requested a review from pecanoro January 31, 2022 22:08

@timszot timszot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@timszot timszot merged commit d74f7a4 into main Feb 1, 2022
@timszot timszot deleted the Rory-OnePatchVersionPerChecklist branch February 1, 2022 20:35
@MelvinBot

Copy link
Copy Markdown
Contributor

Triggered auto assignment to @pecanoro (InternalQA), see https://stackoverflow.com/c/expensify/questions/5042 for more details.

@botify

botify commented Feb 1, 2022

Copy link
Copy Markdown

@timszot looks like this was merged without passing tests. Please add a note explaining why this was done and remove the Emergency label if this is not an emergency.

@OSBotify

OSBotify commented Feb 1, 2022

Copy link
Copy Markdown
Contributor

✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release.

@timszot timszot removed the Emergency label Feb 1, 2022
@timszot

timszot commented Feb 1, 2022

Copy link
Copy Markdown
Contributor

Not sure why this got flagged as Emergency as the tests show as passed for the latest commit.

@roryabraham

roryabraham commented Feb 2, 2022

Copy link
Copy Markdown
Contributor Author

Okay, doing the internal QA steps slightly out of order here:

  • Close the deploy checklist
  • The app should have its PATCH version bumped, and a staging deploy should occur. ✅ https://github.com/Expensify/App/actions/runs/1785543922
  • The new checklist should be created with the new PATCH version.
  • Lock the checklist with a staging deploy still running.
  • The awaitStagingDeploys action should poll the GH API every 90 60 seconds until the staging deploy completes.
  • Once the staging deploy completes, the action should complete and OSBotify should comment on the deploy checklist with the name comment as above, using the new build version.

@OSBotify

OSBotify commented Feb 2, 2022

Copy link
Copy Markdown
Contributor

🚀 Deployed to staging by @timszot in version: 1.1.35-0 🚀

platform result
🤖 android 🤖 success ✅
🖥 desktop 🖥 success ✅
🍎 iOS 🍎 success ✅
🕸 web 🕸 success ✅

@OSBotify

OSBotify commented Feb 4, 2022

Copy link
Copy Markdown
Contributor

🚀 Deployed to production by @sketchydroide in version: 1.1.35-1 🚀

platform result
🤖 android 🤖 success ✅
🖥 desktop 🖥 success ✅
🍎 iOS 🍎 success ✅
🕸 web 🕸 success ✅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

InternalQA This pull request required internal QA

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants