Fix no-orphaned-packages false positive for local-path sub-package deps#1855
Conversation
DependencyTree.get_node() failed to find nodes with a pinned reference because nodes are stored by get_id() (repo_url#ref) but looked up by get_unique_key() (repo_url only). This caused resolved_by to be null for transitive deps declared by a local-path sub-package, triggering false no-orphaned-packages audit failures. Add a _nodes_by_unique_key secondary index to DependencyTree so get_node() correctly resolves deps regardless of whether they carry a pinned reference suffix. Fixes #1846 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes a dependency-resolution bookkeeping bug where DependencyTree.get_node() failed to find nodes whose IDs include a pinned #ref suffix, which in turn prevented resolved_by from being set for certain transitive dependencies (notably: remote deps declared by a local-path sub-package) and caused false no-orphaned-packages failures in apm audit --ci.
Changes:
- Add a secondary
_nodes_by_unique_keyindex toDependencyTreeand use it inget_node()to resolve nodes byDependencyReference.get_unique_key()even when the primarynodesdict is keyed byDependencyNode.get_id()(repo_url#ref). - Add unit tests covering
DependencyTree.get_node()behavior with pinned refs. - Add a regression test ensuring remote deps declared by a local-path sub-package are not flagged as orphans when
resolved_byis present.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
src/apm_cli/deps/dependency_graph.py |
Adds unique-key indexing to make get_node() work for pinned refs and prevent downstream resolved_by gaps. |
tests/unit/deps/test_dependency_tree_lookup.py |
New tests verifying get_node() finds nodes stored under repo_url#ref when queried by unique key. |
tests/unit/policy/test_ci_checks.py |
Regression test for the no-orphaned-packages false positive topology from #1846. |
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 1 | 1 | Secondary-index pattern is sound; last-write-wins invariant is implicit and the or-fallback in get_node() is unreachable by all current callers. |
| CLI Logging Expert | 0 | 1 | 2 | Fix is correctly placed in the data layer; three sources.py call sites have no --verbose log when get_node() misses, which is exactly how this bug hid itself. |
| DevX UX Expert | 0 | 0 | 2 | Correctness fix restoring the expected CI user contract with no CLI surface changes; pre-existing error-message gap noted as nit. |
| Supply Chain Security Expert | 0 | 1 | 1 | Fix is sound for the false-positive case; last-write-wins collision in the secondary index can corrupt resolved_by lineage in edge-case topologies. |
| OSS Growth Hacker | 0 | 1 | 0 | CI-blocking false positive for monorepo users; fix has real adoption significance but is missing a CHANGELOG entry to reclaim affected users. |
| Doc Writer | 0 | 1 | 2 | CHANGELOG.md has no entry for this bug fix; all existing docs about no-orphaned-packages and apm audit --ci are accurate and need no correction. |
| Test Coverage Expert | 0 | 2 | 1 | Unit tests correctly trap the get_node() fix; regression test bypasses sources.py; no integration test covers the full install-to-lockfile-to-audit pipeline. |
| Performance Expert | 0 | 2 | 2 | No blocking perf regressions; double-dict pattern is O(1) with negligible memory overhead at APM's dep counts. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 5 follow-ups
- [Test Coverage Expert] Add integration test for root -> local-path-dep -> pinned-remote-dep covering the full install-to-lockfile-to-audit pipeline --
outcome: missingon a governed-by-policy surface; the existing regression test bypassessources.pyentirely, leaving the actual broken code path with no automated regression guard. - [OSS Growth Hacker + Doc Writer] Add CHANGELOG
[Unreleased]### Fixedentry before merge -- project Keep-a-Changelog convention requires it; omitting it for a CI-blocking false positive affecting the highest-commitment adopter cohort is an adoption risk, not a style issue. - [Python Architect + Supply Chain + Performance] Add inline comment on
add_node()documenting the last-write-wins invariant and the external resolver dedup guarantee; add unit test for same-repo-url two-refs case -- the invariant is maintained by the resolver'squeued_keysdedup, not byadd_node()itself; undocumented and untested implicit behavior is a regression trap for any future caller that bypasses the resolver. - [CLI Logging Expert] Add
--verbosediagnostic log at eachget_node()call site insources.pywhen the result is None -- the silent fallback todepth=1, resolved_by=Noneis exactly how this bug hid itself from users runningapm install --verbose. - [Python Architect + CLI Logging Expert + DevX UX Expert] Remove or document the unreachable
or self.nodes.get(unique_key)fallback inget_node()atdependency_graph.py:105-- all current callers passget_unique_key()-style keys; the fallback impliesget_node()accepts both key formats, actively misleading future readers.
Architecture
classDiagram
direction LR
class DependencyTree {
<<Aggregate>>
+root_package APMPackage
+nodes dict
-_nodes_by_depth dict
-_nodes_by_unique_key dict
+max_depth int
+add_node(node) void
+get_node(unique_key) DependencyNode
+has_dependency(repo_url) bool
}
class DependencyNode {
<<ValueObject>>
+package APMPackage
+dependency_ref DependencyReference
+depth int
+parent DependencyNode
+get_id() str
}
class DependencyReference {
<<ValueObject>>
+repo_url str
+reference str
+get_unique_key() str
+to_canonical() str
}
class APMPackage {
+name str
+version str
}
DependencyTree *-- DependencyNode : contains
DependencyNode *-- DependencyReference : owns
DependencyNode o-- DependencyNode : parent
DependencyNode *-- APMPackage : wraps
note for DependencyTree "PR 1855 adds _nodes_by_unique_key secondary index. Keyed by get_unique_key() (no ref suffix). Fixes get_node() miss on pinned transitive deps."
class DependencyTree:::touched
classDef touched fill:#fff3b0,stroke:#d47600
flowchart TD
A["apm install / apm audit --ci"] --> B["apm_resolver.py\nbuild DependencyTree BFS"]
B --> C["DependencyTree.add_node(node)\ndependency_graph.py:87-95"]
C --> C1["nodes keyed by get_id()\ne.g. prisma/skills#0b8e83c"]
C --> C2["_nodes_by_unique_key keyed by get_unique_key()\ne.g. prisma/skills"]
C1 --> D
C2 --> D
D["integrate.py:57\ndep_key = dep_ref.get_unique_key()"] --> E["DependencyTree.get_node(dep_key)\ndependency_graph.py:97-105"]
E --> F{"_nodes_by_unique_key.get(dep_key)"}
F -->|"hit - fixed path"| G["DependencyNode found\nnode.parent.repo_url = resolved_by"]
F -->|"miss"| H["self.nodes.get(dep_key)\ndead fallback - no caller passes get_id() key"]
H -->|"miss - was only path before fix on pinned deps"| J["return None\nresolved_by absent"]
G --> I["apm.lock.yaml: resolved_by set\n_check_no_orphans passes"]
J --> K["apm.lock.yaml: no resolved_by\n_check_no_orphans false positive #1846"]
Recommendation
The fix is correct and the panel found no blocking defects. Ship after the author adds a single CHANGELOG [Unreleased] ### Fixed bullet -- low effort, required by project convention, and strategically important for reclaiming monorepo CI users who quietly disabled the flag. File two issues immediately after merge: (1) integration test for the full install-to-lockfile-to-audit pipeline covering the sources.py path, which is an outcome: missing gap on a governed-by-policy surface; (2) document and unit-test the last-write-wins invariant in add_node(). The remaining nits (unreachable fallback, verbose diagnostics) are appropriate for a fast-follow PR.
Full per-persona findings
Python Architect
-
[recommended] add_node() silently clobbers
_nodes_by_unique_keywhen two nodes share the same unique_key but differ by reference; invariant is implicit and untested atsrc/apm_cli/deps/dependency_graph.py:92
The invariant 'at most one node per unique_key reaches add_node()' is maintained externally by the resolver (apm_resolver.py:617-626: get_node() pre-check + queued_keys dedup), not by add_node() itself. If add_node() is ever called directly with two nodes sharing the same get_unique_key() but different references, nodes retains both while _nodes_by_unique_key silently holds only the last. The new test suite does not cover this 'same repo_url, two different refs' path.
Suggested: Add a comment documenting the resolver dedup invariant above the _nodes_by_unique_key insert, and add a unit test for the last-write-wins case. -
[nit]
or self.nodes.get(unique_key)fallback inget_node()is dead code with all current callers atsrc/apm_cli/deps/dependency_graph.py:105
All five get_node() call sites pass a key from dep_ref.get_unique_key() (no #ref suffix). The self.nodes fallback only fires when a caller passes a full get_id()-style string like 'owner/repo#sha', and none do. Keeping the fallback implies get_node() accepts both key formats, which is misleading.
Suggested: Simplify to:return self._nodes_by_unique_key.get(unique_key)
CLI Logging Expert
-
[recommended] Silent
get_node()miss insources.pyhas no verbose diagnostic -- exactly how this bug hid itself atsrc/apm_cli/install/sources.py
All three get_node(dep_key) call sites fall back silently to depth=1, resolved_by=None when the node is not found. Had there been a --verbose log, users runningapm install --verbosewould have surfaced the miss immediately. Suggest adding a debug log at each site for the node is None branch. -
[nit]
_nodes_by_unique_keylast-write-wins invariant should have a docstring note atsrc/apm_cli/deps/dependency_graph.py
When add_node() is called twice for two nodes with the same unique_key but different refs, the secondary index silently overwrites. A one-line comment prevents future maintainers from wondering about the semantics. -
[nit]
get_node()or-fallback toself.nodes.get()is undocumented and misleading atsrc/apm_cli/deps/dependency_graph.py:105
The fallback only fires when a caller passes a full get_id() string; no current caller does. An inline comment clarifying when the fallback fires would help readers.
DevX UX Expert
-
[nit] Orphan failure details print raw dep keys with no recovery context at
src/apm_cli/policy/ci_checks.py
When _check_no_orphans fails, it emits a bare list of dep keys with no hint about why or what action to take (edit apm.yml, re-run apm install, investigate a branch merge artifact?). Pre-existing; not introduced by this PR. -
[nit]
get_node()fallback toself.nodes.get(unique_key)is undocumented and unused by all current callers atsrc/apm_cli/deps/dependency_graph.py:105
All current call sites pass get_unique_key() (no #ref suffix), so the fallback is never exercised. Misleads future readers about the method contract.
Supply Chain Security Expert
-
[recommended] Last-write-wins collision in
_nodes_by_unique_keycan corruptresolved_bylineage in edge-case topologies atsrc/apm_cli/deps/dependency_graph.py:92
add_node() unconditionally overwrites the secondary index. If two nodes share the same get_unique_key() but differ in reference (e.g., root declares org/lib with no ref, sub-package declares org/lib@sha_xyz), whichever is added last wins. get_node() then returns the wrong node, incorrectly setting resolved_by in the lockfile. Does not bypass no-orphaned-packages directly (root deps in manifest_keys are exempt), but corrupts the lineage graph, breaking 'apm why' and potentially creating false positives/negatives on future lockfile states. [CEO arbitration: document and test the invariant; do not change to setdefault() -- see dissent notes.] -
[nit]
get_unique_key()does not lowercase repo_url path segment; case-variant declarations produce split keys
build_dependency_unique_key() lowercases the host segment but not the path. GitHub is case-insensitive on paths, so two declarations with different casing produce separate tree/lockfile entries. Pre-existing; not introduced by this PR.
OSS Growth Hacker
- [recommended] No CHANGELOG entry in
[Unreleased]for this CI-blocking fix
CHANGELOG.md [Unreleased] section is empty. This bug silently broke apm audit --ci for any project using a local-path sub-package whose deps carry a pinned #ref -- a topology common in monorepos. The fix deserves a### Fixedbullet so it appears in the next release narrative and can re-engage users who silently disabled the audit flag.
Auth Expert -- inactive
No auth files touched; the fix is a DependencyTree secondary-index bookkeeping change with no effect on token management, credential resolution, or host classification.
Doc Writer
-
[recommended]
CHANGELOG.md [Unreleased]is missing a### Fixedentry for this user-visible bug fix atCHANGELOG.md
The [Unreleased] section is empty. The project's Keep a Changelog convention requires a Fixed entry for every user-visible bug fix. No entry for [BUG] no-orphaned-packages falsely flags remote deps declared by a local-path sub-package (resolved_by never set) #1846 or Fix no-orphaned-packages false positive for local-path sub-package deps #1855 exists anywhere in CHANGELOG.md. Suggested text:apm audit --ci no longer emits false no-orphaned-packages failures for transitive dependencies declared by a local-path sub-package. DependencyTree.get_node() now resolves nodes by repo URL regardless of whether a pinned reference suffix is present. (closes #1846) (#1855) -
[nit]
docs/src/content/docs/reference/baseline-checks.mdno-orphaned-packages section is accurate; no update needed
The documented behavior ('Fails when the lockfile holds a package that the manifest no longer lists') is the correct intent. The bug was a false positive where the implementation diverged from this documented intent; the fix restores alignment. -
[nit]
docs/src/content/docs/reference/lockfile-spec.mdresolved_by description is accurate; no update needed
resolved_byfield description ('repo_url of the parent that pulled this transitive dep. Absent for direct deps.') matches the corrected behavior.
Test Coverage Expert
-
[recommended] Regression test bypasses the actual broken code path -- no integration test for install-to-lockfile-to-audit pipeline at
tests/integration/test_deps_resolver_phase3b.py
test_remote_deps_of_local_path_subpackage_not_orphanedcalls_check_no_orphans()with a hand-crafted lockfile. It does NOT exercisesources.py:298which was actually broken (callsget_node(dep_key)and writesresolved_byfrom the result). If that callsite regresses, the regression test still passes. Greppedtests/integration/-- no test covers the full install + lockfile write + audit path for this topology.
Proof (missing at integration-with-fixtures):tests/integration/test_deps_resolver_phase3b.py::test_local_path_subpackage_pinned_remote_dep_resolved_by_propagated_to_lockfile-- proves: apm install with root -> local-path-dep -> pinned-remote-dep topology writes resolved_by correctly in the lockfile [governed-by-policy]
assert lock.get_dependency('prisma/skills').resolved_by == '_local/agent-config' -
[recommended]
test_remote_deps_of_local_path_subpackage_not_orphanedoutcome could not be run-verified (environment lacked pytest) attests/unit/policy/test_ci_checks.py
Per S7 PROBE RULE, certifyingoutcome: passedat integration-with-fixtures tier on a critical-promise surface requires actually running the test. Test body is correctly formed but runtime pass is unverified. Outcome: unknown.
Proof (unknown at integration-with-fixtures):tests/unit/policy/test_ci_checks.py::test_remote_deps_of_local_path_subpackage_not_orphaned
assert result.passed, f'False positive orphan: {result.details}' -
[nit] Last-write-wins behavior of
_nodes_by_unique_keyis undocumented and has no unit test attests/unit/deps/test_dependency_tree_lookup.py
Ifadd_node()is called twice with the same repo_url but different reference values,_nodes_by_unique_keyretains only the last-added. Greppedtest_dependency_tree_lookup.py-- no test covers this case.
Proof (missing at unit):tests/unit/deps/test_dependency_tree_lookup.py::test_get_node_last_write_wins_when_same_repo_two_refs
tree.add_node(node_sha1); tree.add_node(node_sha2); assert tree.get_node('prisma/skills') is node_sha2
Performance Expert
-
[nit] Second dict memory overhead is negligible at realistic dep counts (~21 KB at N=200)
Both dicts store str->DependencyNode references (not copies). At N=200 deps, ~21 KB extra dict overhead -- less than 0.1% of typical process RSS. -
[nit]
add_node()callsget_unique_key()twice; trivial micro-cleanup possible atsrc/apm_cli/deps/dependency_graph.py
Line 89 callsnode.get_id()which internally callsget_unique_key(), then line 92 calls it again explicitly. ~0.02-0.1 ms total at 200 deps -- noise against per-dep network I/O floor.
Suggested:ukey = node.dependency_ref.get_unique_key(); self.nodes[node.get_id()] = node; self._nodes_by_unique_key[ukey] = node -
[recommended]
get_node()or-chain is correct and O(1) on the hot path; no perf concern
DependencyNode instances are always truthy (plain dataclass, no__bool__), so the or chain cannot false-short-circuit. All current callers passget_unique_key()so the primary lookup always hits post-fix. Net effect: converts a broken path to a correct O(1) lookup. -
[recommended] Last-writer-wins in
_nodes_by_unique_keyis consistent with APM resolution semantics but needs a comment atsrc/apm_cli/deps/dependency_graph.py:92
Resolver dedup (queued_keys) prevents two different-ref deps from reachingadd_node()for the same unique_key in normal operation. But this invariant lives outsideadd_node()and should be documented inline.
This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
pypi.org
To allow these domains, add them to the
network.allowedlist in your workflow frontmatter:
network:
allowed:
- defaults
- "pypi.org"See Network Configuration for more information.
Generated by PR Review Panel for issue #1855 · sonnet46 17M · ◷
Preserve first-wins semantics when multiple pinned refs share a dependency unique key, add the regression trap, and document the audit-ci false-positive fix in the changelog. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 0 | 0 | Secondary index pattern is sound after first-wins preservation; no remaining architecture concerns. |
| CLI Logging Expert | 0 | 0 | 0 | No CLI logging or terminal output surface changed. |
| DevX UX Expert | 0 | 0 | 0 | Fix improves apm audit --ci trust without changing command surface or wording. |
| Supply Chain Security Expert | 0 | 0 | 0 | No supply-chain security regression; transitive/direct classification is more accurate and first-wins avoids silent index overwrite. |
| OSS Growth Hacker | 0 | 0 | 0 | Community bug fix is now visible in CHANGELOG for release notes and contributor credit. |
| Test Coverage Expert | 0 | 0 | 0 | Regression-trap tests cover pinned-ref lookup, first-wins duplicate refs, and the no-orphan local-path transitive topology. |
| Performance Expert | 0 | 0 | 0 | O(1) secondary index has negligible pointer-only memory overhead and no hot-path regression. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Architecture
classDiagram
direction LR
class DependencyTree {
<<Aggregate>>
+nodes dict~str DependencyNode~
+_nodes_by_unique_key dict~str DependencyNode~
+add_node(node) None
+get_node(unique_key) DependencyNode
}
class DependencyNode {
<<Entity>>
+dependency_ref DependencyReference
+get_id() str
}
class DependencyReference {
<<ValueObject>>
+repo_url str
+reference str
+get_unique_key() str
}
DependencyTree *-- DependencyNode : indexes
DependencyNode *-- DependencyReference : dependency_ref
class DependencyTree:::touched
classDef touched fill:#fff3b0,stroke:#d47600
flowchart TD
A[Resolver builds DependencyNode] --> B[DependencyTree.add_node]
B --> C[nodes keyed by repo#ref]
B --> D[_nodes_by_unique_key keyed by repo]
E[Lockfile/source policy asks get_node(repo)] --> F{unique-key index hit?}
F -->|yes| G[resolved_by parent found]
G --> H[no-orphan check treats dep as transitive]
F -->|no| I[legacy nodes fallback]
Recommendation
Ready for maintainer review. All in-scope follow-ups are folded, local validation passed, and GitHub CI is green on 9b04ccc.
Folded in this run
- (panel) Preserve first-wins semantics when multiple pinned refs share a dependency unique key -- resolved in 9b04ccc.
- (panel) Add a CHANGELOG entry for the audit-ci false-positive fix -- resolved in 9b04ccc.
Regression-trap evidence (mutation-break gate)
tests/unit/deps/test_dependency_tree_lookup.py::TestDependencyTreeGetNode::test_get_node_preserves_first_ref_for_same_unique_key-- changedself._nodes_by_unique_key.setdefault(unique_key, node)back to assignment; test FAILED as expected; guard restored.
Lint contract
uv run --extra dev ruff check src/ tests/ and
uv run --extra dev ruff format --check src/ tests/ both silent.
CI
gh pr checks 1855 --repo microsoft/apm --watch completed green on head 9b04ccc (CI run https://github.com/microsoft/apm/actions/runs/27868750847; CodeQL, Merge Gate, NOTICE Drift Check, Spec conformance, and CLA also passed) after 0 CI fix iteration(s).
Mergeability status
Captured from gh pr view 1855 --json mergeable,mergeStateStatus,statusCheckRollup immediately after the last push of this run. The orchestrator aggregates the same fields across every shepherded PR into the saga-end mergeability table.
| PR | head SHA | CEO stance | iters | folds | defers | Copilot rounds | CI | mergeable | mergeStateStatus | notes |
|---|---|---|---|---|---|---|---|---|---|---|
| #1855 | 9b04ccc |
ship_now | 1 | 2 | 0 | 2 | green | MERGEABLE | BLOCKED | pending required review |
Convergence
1 outer iteration(s); 2 Copilot round(s). Final panel verdict: ship_now.
Ready for maintainer review.
Full per-persona findings
Python Architect
No findings.
CLI Logging Expert
No findings.
DevX UX Expert
No findings.
Supply Chain Security Expert
No findings.
OSS Growth Hacker
No findings.
Auth Expert -- inactive
Touched files are dependency graph lookup plus tests; no auth, token, credential, host classification, or authorization surface changed.
Doc Writer -- inactive
Touched runtime/test files and CHANGELOG only; no docs page drift from this internal audit fix.
Test Coverage Expert
No findings.
Performance Expert
No findings.
This panel is advisory. It does not block merge. Re-apply the panel-review label after addressing feedback to re-run.
…es-local-path-reso
…#1871) Add the user-facing PRs merged after the v0.21.0 tag that were missing from the changelog (#1830 Added; #1854, #1856 Fixed), add the external- contributor credit on #1855, and move #1853 out of the released [0.21.0] section into [Unreleased] since it merged after the v0.21.0 tag. No version bump -- everything stays under [Unreleased]. Co-authored-by: danielmeppiel <danielmeppiel@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
DependencyTree.get_node()failed to find nodes with a pinned reference because nodes are stored byget_id()(repo_url#ref) but looked up byget_unique_key()(repo_urlonly). This causedresolved_byto be null for transitive deps declared by a local-path sub-package, triggering falseno-orphaned-packagesaudit failures inapm audit --ci.The fix adds a
_nodes_by_unique_keysecondary index toDependencyTreesoget_node()correctly resolves deps regardless of whether they carry a pinned reference suffix.Fixes #1846
Type of change
Testing
New tests:
test_dependency_tree_lookup.py-- verifiesget_node()finds nodes with pinned refstest_remote_deps_of_local_path_subpackage_not_orphaned-- regression test for the exact topology from the issue (root -> local-path -> remote pinned deps)Spec conformance (OpenAPM v0.1)