Add standalone errata-workflow agent with MCP tools#594
Conversation
Convert the supervisor's ErratumHandler into a standalone BeeAI workflow agent that communicates exclusively through MCP tools. This includes: - Errata models in ymir/common/models.py (ErratumBuild, TransitionRuleSet, RHELVersion, ErratumPushDetails, etc.) - 9 new errata MCP tools in ymir/tools/privileged/errata.py (transition rules, build maps, stage push, state changes, ownership, comments) - CreateJiraIssueTool in ymir/tools/privileged/jira.py - All tools registered in gateway.py - errata_workflow_agent.py with full workflow: fetch erratum, check attention flags, verify ownership, route by status, advance through states, and verify product listings - compose.yaml service and Makefile target for standalone execution - SKILL.md documenting the complete workflow Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Code Review
This pull request introduces a standalone Errata Workflow Agent using the BeeAI framework to manage the lifecycle of errata, along with several new MCP tools for interacting with the Errata Tool and Jira. The review feedback identifies several critical runtime issues and robustness improvements. These include a potential TypeError in _get_rel_prep_lookup from accessing list data incorrectly, potential IndexErrors when handling empty API responses or HTML tables, and a logical flaw in the agent where failed JIRA fetches could bypass ownership verification. Additionally, improvements are suggested for BeautifulSoup class extraction and Jira Cloud/Server compatibility when resolving user identifiers.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| async def _get_user_account_id(session: Any, headers: dict, email: str) -> str: | ||
| """Resolve a user email to a Jira account ID / name.""" | ||
| jira_base = os.getenv("JIRA_URL") | ||
| url = urljoin(jira_base, "rest/api/3/user/search") | ||
| try: | ||
| async with session.get(url, params={"query": email}, headers=headers) as response: | ||
| response.raise_for_status() | ||
| users = await response.json() | ||
| except aiohttp.ClientError as e: | ||
| raise ToolError(f"Failed to search for user {email}: {e}") from e | ||
|
|
||
| matches = [u for u in users if u.get("emailAddress") == email] | ||
| if len(matches) == 0: | ||
| raise ToolError(f"No JIRA user with email {email}") | ||
| if len(matches) > 1: | ||
| raise ToolError(f"Multiple JIRA users with email {email}") | ||
|
|
||
| user = matches[0] | ||
| return user.get("name") or user["accountId"] or user.get("displayName") | ||
|
|
There was a problem hiding this comment.
Refactor _get_user_account_id to _get_user_identifier to return a tuple of (field_name, value) (e.g., ("accountId", account_id) or ("name", username)). This ensures compatibility with both Jira Cloud (which uses accountId) and Jira Server (which uses name/username), and handles cases where emailAddress is hidden in Jira Cloud search responses.
async def _get_user_identifier(session: Any, headers: dict, email: str) -> tuple[str, str]:
"""Resolve a user email to a Jira identifier (accountId or name)."""
jira_base = os.getenv("JIRA_URL")
url = urljoin(jira_base, "rest/api/3/user/search")
try:
async with session.get(url, params={"query": email}, headers=headers) as response:
response.raise_for_status()
users = await response.json()
except aiohttp.ClientError as e:
raise ToolError(f"Failed to search for user {email}: {e}") from e
matches = [u for u in users if u.get("emailAddress") == email]
if not matches and len(users) == 1:
matches = users
if len(matches) == 0:
raise ToolError(f"No JIRA user with email {email}")
if len(matches) > 1:
raise ToolError(f"Multiple JIRA users with email {email}")
user = matches[0]
if "accountId" in user and user["accountId"]:
return "accountId", user["accountId"]
val = user.get("name") or user.get("key") or user.get("displayName")
if not val:
raise ToolError(f"Could not find a valid identifier for user {email}")
return "name", valThere was a problem hiding this comment.
We aren't using Jira Server anymore, but the " cases where emailAddress is hidden in Jira Cloud search responses " par is worth considering
| state.result = WorkflowResult( | ||
| status=f"Changed ownership of erratum {erratum['id']} to Ymir bot, re-processing", | ||
| reschedule_in=0, | ||
| ) | ||
| return Workflow.END |
There was a problem hiding this comment.
If some related JIRA issues fail to fetch, state.related_issues will be incomplete. If jira_issues is not empty but state.related_issues is empty (or missing some issues), all_owned will evaluate to True (since all([]) is True). This can lead to Ymir incorrectly changing ownership of the erratum without verifying all issues. We should ensure that we successfully fetched all issues before verifying ownership.
# Check if Ymir owns all related issues
if len(state.related_issues or []) != len(jira_issues):
state.result = await _flag_attention(
state,
f"Failed to fetch details for all related JIRA issues: {jira_issues}. "
"Cannot verify ownership safely."
)
return Workflow.END
all_owned = all(
_get_assigned_team(issue) == JIRA_JOTNAR_TEAM
for issue in (state.related_issues or [])
)There was a problem hiding this comment.
Not sure about the ownership altogether. IMO, the agent shouldn't meddle with ownership at all..
There was a problem hiding this comment.
I agree with you here. The agent IMO should not.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
for more information, see https://pre-commit.ci
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Mirrors the JIRA_ALLOW_STATUS_CHANGES pattern from issue verification agent. Erratum state changes are skipped unless the env var is explicitly set to true, preventing accidental state transitions in production. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The errata.py updates left old code alongside new robustness improvements, causing ruff and Python syntax failures. Also fix line-too-long and unused variable in errata_workflow_agent.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Replace _get_user_account_id with _get_user_identifier that returns a (field_key, value) tuple to properly support both Jira Server and Cloud. Remove leftover duplicate assignee block. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
beautifulsoup4, lxml, and requests-gssapi are imported by errata.py but were only listed in the root requirements.txt, not in the ymir-tools sub-package requirements used by CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
SearchJiraIssuesTool returns JSONToolOutput(result=list), so
output.result is already the issues list — no .get("issues") needed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| For each JIRA issue key in the erratum's `jira_issues` list, fetch full issue details using `get_jira_details`. Store all issue data for later checks. | ||
|
|
||
| ### Step 4: Check Ownership | ||
| Verify the erratum is owned by the Ymir bot (`jotnar-bot@IPA.REDHAT.COM`). If not: |
There was a problem hiding this comment.
If people are to use this as a skill, then this check should not be here. Since it most likely originates from the workflow in errata_workflow_agent.py then it has to be removed from there.
There was a problem hiding this comment.
Or it should be configurable.
There was a problem hiding this comment.
Indeed. Just as above, the ownership is not a question anymore in the agent.
| **Unknown blocking rules:** | ||
| - Flag for human attention with details of blocking rules | ||
|
|
||
| ### Step 7: Verify Product Listings (REL_PREP only) |
There was a problem hiding this comment.
Is this really something that QEs do? I would expect this to prevent errata from transitioning further, and that is the fact that should be flagged for need of attention.
| Advances errata through states (NEW_FILES → QE → REL_PREP), handles stage pushes, | ||
| CAT test timeouts, product listing verification, and flagging for human attention. | ||
| Communicates exclusively through MCP tools. | ||
| """ |
There was a problem hiding this comment.
We did not put module docstrings anywhere else in the project. If documentation is required, then it should be elsewhere.
| WAIT_DELAY = 20 * 60 # 20 minutes | ||
| POST_PUSH_TESTING_TIMEOUT = timedelta(hours=3) | ||
| POST_PUSH_TESTING_TIMEOUT_STR = "3 hours" | ||
| ERRATA_JOTNAR_BOT_EMAIL = "jotnar-bot@IPA.REDHAT.COM" |
There was a problem hiding this comment.
This should be renamed to Ymir. Actually please verify all the mentions of Jotnar are replaced with Ymir.
| other_build_map = ErratumBuildMap.model_validate(other_build_map_data) | ||
| prev_build = other_build_map.root[package] | ||
|
|
||
| is_matched, comment = compare_file_lists(cur_build, prev_build, prev_erratum_id) |
There was a problem hiding this comment.
I had comment about this in the skill. So just pointing it out here as well.
| state.result = WorkflowResult( | ||
| status=f"Changed ownership of erratum {erratum['id']} to Ymir bot, re-processing", | ||
| reschedule_in=0, | ||
| ) | ||
| return Workflow.END |
There was a problem hiding this comment.
I agree with you here. The agent IMO should not.
| @staticmethod | ||
| def from_str(version_string: str) -> "RHELVersion | None": | ||
| version_string = version_string.strip().upper() | ||
| pattern = r"RHEL-(\d+)\.(\d+)(?:\.(\d+))?\.([^\d].*)$" |
There was a problem hiding this comment.
This pattern is also defined in ymir/common/version_utils.py. There is overlap between what this model is doing and what is implemented there. Perhaps it would be better to drop this model and use what is there.
| from beeai_framework.emitter import Emitter | ||
| from beeai_framework.tools import JSONToolOutput, Tool, ToolError, ToolRunOptions | ||
| from beeai_framework.tools import JSONToolOutput, StringToolOutput, Tool, ToolError, ToolRunOptions | ||
| from bs4 import BeautifulSoup, Tag # type: ignore |
There was a problem hiding this comment.
Could we use the errata API and drop the bs4?
There was a problem hiding this comment.
There is also library implementing the API, so that can be used as well.
There was a problem hiding this comment.
Also why is the #type: ignore here? What are we suppressing?
| for issue in issues | ||
| ] | ||
| return JSONToolOutput(result=out) | ||
| return JSONToolOutput(result={"issues": out}) |
There was a problem hiding this comment.
This failed for you in a container or why was this change done? Anton added this tool like 3 months ago, i migrated it 2 months ago and now i am thinking whether no one noticed that it is failing or you are experiencing fail that none of us has.
Convert the supervisor's ErratumHandler into a standalone BeeAI workflow agent that communicates exclusively through MCP tools. This includes:
TODO:
packit/packit.dev.Fixes
Related to
Merge before/after
RELEASE NOTES BEGIN
Packit now supports automatic ordering of ☕ after all checks pass.
RELEASE NOTES END