flightcheck: AGENTS.md guidance updates from PR #87 post-mortem#95
flightcheck: AGENTS.md guidance updates from PR #87 post-mortem#95GrahamMcMynn wants to merge 1 commit into
Conversation
While validating PR #87 (WD-ENV-101 ISU username vs Entra UPN format alignment) end-to-end against a real tenant, three latent bugs were discovered, all from the original May 7 Python scripts commit: 1. BAP env id derivation used Dataverse OrgId (different GUID on most tenants) - gated every BAP-scoped check to a 404. 2. Flow listing endpoint hit api.powerapps.com when the admin flow API actually lives on api.flow.microsoft.com with its own audience token. 3. License SKU matchers were case-sensitive against UPPER_CASE patterns while Graph returns mixed-case values - PRE-001/002/003 falsely FAILED on tenants with valid licenses. Bugs and tests were fixed in PR #87 itself. This PR is the corresponding process update so future agents don't reintroduce the same classes of mistake. Five guidance changes to flightcheck/AGENTS.md: 1. **New top-level section "Power Platform identity gotchas"** - Promotes the BAP-id-vs-Dataverse-OrgId gotcha from the buried PVA subsection (it had been in the Island Gateway section since the file's creation) to a cross-cutting rule that applies to every client that needs a BAP env id. - Adds "BAP returns 404 (not 403) when handed an unknown env id" - prevents misdiagnosing a wrong env id as a permission failure. - Adds the host-audience pairing table for the six API surfaces the kit uses, including the api.flow.microsoft.com / service.flow.microsoft.com//.default pair that bug #2 missed. - Removes the now-duplicate gotcha note from the Island Gateway section, replacing with a back-reference. 2. **New subsection "Sanity-checking a capture before pinning it"** - Lives under "What counts as the same endpoint" in the cardinal rule. - Documents the failure mode that produced bug #2: a 404 from the wrong host was captured and pinned as "expected behavior for Dataverse-only environments." The validated tier was supposed to prevent guessing, but a misdiagnosed capture locks in the wrong contract with full ceremony. - Adds four sanity-check rules: cite vendor docs; re-capture against a second tenant; treat 4xx-on-happy-path as a red flag; don't pin a 404 regression test before verifying the endpoint is correct. 3. **New paragraph "validatable validates shape, not values"** - Lives under "If the tier is validatable." - Documents the failure mode that produced bug #3: the Graph CSDL told us subscribedSkus.skuPartNumber is Edm.String but said nothing about case-of-the-day or bundle expansion. The validatable tier guarantees shape, not vocabulary. - Requires vendor-published reference list citation (licensing-service-plan-reference), case-insensitive matching by default, bundle-expansion awareness, and a unit test that loads the reference table. 4. **Annotated runner.env_id table row** - Existing row said "Power Platform (BAP) environment ID" - true but easy to read past. Now explicitly says NOT the Dataverse OrgId, names the derivation pattern, and cross-references the new identity-gotchas section. 5. **New "step 5a. End-to-end smoke against a real tenant"** - Required PR-description content for any new check that calls a host/audience pair not covered by an existing runner.<client> method: the exact cli command, the output line showing the new checkpoint firing with a terminal status (NOT SKIPPED), and confirmation the run was against a real tenant. - Mocks validate logic; live runs validate that the endpoint exists and the gate chain feeds the check. PR #87's three bugs all had passing unit tests. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
nehaoss
left a comment
There was a problem hiding this comment.
Excellent post-mortem documentation — this is the kind of institutional knowledge that prevents repeat mistakes. Comments:
-
Power Platform identity gotchas section: The host ↔ audience table is extremely valuable. The \�pi.flow.microsoft.com\ vs \service.powerapps.com\ token mismatch being called out as 'the classic symptom' is exactly the kind of pattern documentation that saves debugging time.
-
Cassette sanity-checking section: Strong guidance. The requirement to 'cite the vendor doc URL' and 're-capture against a second tenant' for anomalous statuses directly addresses the root cause of PR #87's issues. The principle '4xx from a happy-path call is a red flag, not a feature' is well stated.
-
\�alidatable\ validates shape, not values: The SKU matching example (Teams delivered via O365_BUSINESS_PREMIUM, ENTERPRISEPACK, etc.) is concrete and actionable. Good that it explicitly calls out case-insensitive matching.
-
End-to-end smoke requirement (5a): The requirement that SKIPPED ≠ exercised is an important distinction. One suggestion: consider specifying a minimum bar for what 'real tenant' means — e.g., should it be a non-production tenant to avoid accidental impact, or any tenant with the relevant resources?
-
runner.env_id clarification: The expanded description in the Runner attributes table is a high-value one-line change that prevents future confusion.
Summary
Process update prompted by the post-mortem of PR #87 (WD-ENV-101 Workday ISU username vs Entra UPN format alignment). While validating that PR end-to-end against a real tenant, three latent bugs surfaced — all from the original May 7 Python scripts commit, all with passing unit tests:
derive_environment_id()returned the Dataverse OrgId, not the BAP env id (different GUIDs on most tenants)get_flows()hitapi.powerapps.comwhen the admin flow API lives onapi.flow.microsoft.comwith a separate audience tokeninchecks against UPPER_CASE patterns; Graph returns mixed caseThe bugs themselves are fixed in PR #87. This PR is the corresponding guidance update so future agents don't reintroduce the same classes of mistake.
What this PR changes
Five additions to
solutions/ess-maker-skills/scripts/flightcheck/AGENTS.md:New top-level section "Power Platform identity gotchas" — promotes the BAP-id-vs-Dataverse-OrgId gotcha from the buried PVA subsection to a cross-cutting rule. Adds the host↔audience pairing table for all six API surfaces the kit uses (including the
api.flow.microsoft.com+service.flow.microsoft.com//.defaultpair that bug 2 missed). Also adds "BAP returns 404 (not 403) when handed an unknown env id" to prevent misdiagnosing wrong-id as permission failure.New subsection "Sanity-checking a capture before pinning it" (under "What counts as the same endpoint") — documents the failure mode that produced bug 2. The cardinal rule was supposed to prevent guessing, but a misdiagnosed capture locks in the wrong contract with full ceremony. Adds four sanity rules: cite vendor docs; re-capture against a second tenant; treat 4xx-on-happy-path as a red flag; don't pin a 404 regression test before verifying the endpoint is correct.
New paragraph "
validatablevalidates shape, not values" (under "If the tier isvalidatable") — documents the failure mode that produced bug 3. CSDL guarantees field shape, not value vocabulary. Requires vendor-published reference list citation, case-insensitive matching by default, bundle-expansion awareness, and a unit test that loads the reference table.Annotated
runner.env_idtable row — was "Power Platform (BAP) environment ID" (true but easy to read past). Now explicitly says NOT the Dataverse OrgId, names the derivation pattern, and cross-references the new identity-gotchas section.New "step 5a. End-to-end smoke against a real tenant" — required PR-description content for any new check that calls a host/audience pair not covered by an existing
runner.<client>method: the exact CLI command, the output line showing the new checkpoint firing with a terminal status (NOT SKIPPED), and confirmation the run was against a real tenant. Mocks validate logic; live runs validate that the endpoint exists. PR flightcheck: add WD-ENV-101 Workday ISU username vs Entra UPN format alignment #87's three bugs all had passing unit tests precisely because the mocks themselves were wrong.Why this is its own PR
Separation of concerns: PR #87 contains the code fixes plus the test infrastructure that aligns with them (including INDEX.md row corrections — those are tightly coupled to the code change). This PR contains the aspirational rules for future work — distinct review concern, distinct audience (process-minded reviewers vs. code-minded reviewers).
Out of scope
flightcheck_pp_admin.yamlagainst the correctapi.flow.microsoft.comhost (noted as a TODO in INDEX.md by PR flightcheck: add WD-ENV-101 Workday ISU username vs Entra UPN format alignment #87).tests/AGENTS.md(the companion test-authoring file) — cross-references in flightcheck/AGENTS.md make the new rules discoverable from there, but a dedicated mirror could come in a follow-up if reviewers prefer.Related: #87