Skip to content

flightcheck: AGENTS.md guidance updates from PR #87 post-mortem#95

Open
GrahamMcMynn wants to merge 1 commit into
mainfrom
docs/flightcheck-agents-guidance-from-pr87-postmortem
Open

flightcheck: AGENTS.md guidance updates from PR #87 post-mortem#95
GrahamMcMynn wants to merge 1 commit into
mainfrom
docs/flightcheck-agents-guidance-from-pr87-postmortem

Conversation

@GrahamMcMynn

Copy link
Copy Markdown
Collaborator

Summary

Process update prompted by the post-mortem of PR #87 (WD-ENV-101 Workday ISU username vs Entra UPN format alignment). While validating that PR end-to-end against a real tenant, three latent bugs surfaced — all from the original May 7 Python scripts commit, all with passing unit tests:

Bug Symptom Root cause
BAP env id derivation Every BAP-scoped check 404'd on most tenants derive_environment_id() returned the Dataverse OrgId, not the BAP env id (different GUIDs on most tenants)
Flow listing endpoint EXT-001 + Workday block silently gated out get_flows() hit api.powerapps.com when the admin flow API lives on api.flow.microsoft.com with a separate audience token
License SKU matching PRE-001/002/003 FAILED on tenants with valid licenses Case-sensitive in checks against UPPER_CASE patterns; Graph returns mixed case

The bugs themselves are fixed in PR #87. This PR is the corresponding guidance update so future agents don't reintroduce the same classes of mistake.

What this PR changes

Five additions to solutions/ess-maker-skills/scripts/flightcheck/AGENTS.md:

  1. New top-level section "Power Platform identity gotchas" — promotes the BAP-id-vs-Dataverse-OrgId gotcha from the buried PVA subsection to a cross-cutting rule. Adds the host↔audience pairing table for all six API surfaces the kit uses (including the api.flow.microsoft.com + service.flow.microsoft.com//.default pair that bug 2 missed). Also adds "BAP returns 404 (not 403) when handed an unknown env id" to prevent misdiagnosing wrong-id as permission failure.

  2. New subsection "Sanity-checking a capture before pinning it" (under "What counts as the same endpoint") — documents the failure mode that produced bug 2. The cardinal rule was supposed to prevent guessing, but a misdiagnosed capture locks in the wrong contract with full ceremony. Adds four sanity rules: cite vendor docs; re-capture against a second tenant; treat 4xx-on-happy-path as a red flag; don't pin a 404 regression test before verifying the endpoint is correct.

  3. New paragraph "validatable validates shape, not values" (under "If the tier is validatable") — documents the failure mode that produced bug 3. CSDL guarantees field shape, not value vocabulary. Requires vendor-published reference list citation, case-insensitive matching by default, bundle-expansion awareness, and a unit test that loads the reference table.

  4. Annotated runner.env_id table row — was "Power Platform (BAP) environment ID" (true but easy to read past). Now explicitly says NOT the Dataverse OrgId, names the derivation pattern, and cross-references the new identity-gotchas section.

  5. New "step 5a. End-to-end smoke against a real tenant" — required PR-description content for any new check that calls a host/audience pair not covered by an existing runner.<client> method: the exact CLI command, the output line showing the new checkpoint firing with a terminal status (NOT SKIPPED), and confirmation the run was against a real tenant. Mocks validate logic; live runs validate that the endpoint exists. PR flightcheck: add WD-ENV-101 Workday ISU username vs Entra UPN format alignment #87's three bugs all had passing unit tests precisely because the mocks themselves were wrong.

Why this is its own PR

Separation of concerns: PR #87 contains the code fixes plus the test infrastructure that aligns with them (including INDEX.md row corrections — those are tightly coupled to the code change). This PR contains the aspirational rules for future work — distinct review concern, distinct audience (process-minded reviewers vs. code-minded reviewers).

Out of scope

  • Re-capturing flightcheck_pp_admin.yaml against the correct api.flow.microsoft.com host (noted as a TODO in INDEX.md by PR flightcheck: add WD-ENV-101 Workday ISU username vs Entra UPN format alignment #87).
  • Promoting these rules into tests/AGENTS.md (the companion test-authoring file) — cross-references in flightcheck/AGENTS.md make the new rules discoverable from there, but a dedicated mirror could come in a follow-up if reviewers prefer.

Related: #87

While validating PR #87 (WD-ENV-101 ISU username vs Entra UPN format
alignment) end-to-end against a real tenant, three latent bugs were
discovered, all from the original May 7 Python scripts commit:

1. BAP env id derivation used Dataverse OrgId (different GUID on most
   tenants) - gated every BAP-scoped check to a 404.
2. Flow listing endpoint hit api.powerapps.com when the admin flow
   API actually lives on api.flow.microsoft.com with its own
   audience token.
3. License SKU matchers were case-sensitive against UPPER_CASE patterns
   while Graph returns mixed-case values - PRE-001/002/003 falsely
   FAILED on tenants with valid licenses.

Bugs and tests were fixed in PR #87 itself. This PR is the corresponding
process update so future agents don't reintroduce the same classes of
mistake.

Five guidance changes to flightcheck/AGENTS.md:

1. **New top-level section "Power Platform identity gotchas"**
   - Promotes the BAP-id-vs-Dataverse-OrgId gotcha from the buried PVA
     subsection (it had been in the Island Gateway section since the
     file's creation) to a cross-cutting rule that applies to every
     client that needs a BAP env id.
   - Adds "BAP returns 404 (not 403) when handed an unknown env id" -
     prevents misdiagnosing a wrong env id as a permission failure.
   - Adds the host-audience pairing table for the six API surfaces the
     kit uses, including the api.flow.microsoft.com /
     service.flow.microsoft.com//.default pair that bug #2 missed.
   - Removes the now-duplicate gotcha note from the Island Gateway
     section, replacing with a back-reference.

2. **New subsection "Sanity-checking a capture before pinning it"**
   - Lives under "What counts as the same endpoint" in the cardinal
     rule.
   - Documents the failure mode that produced bug #2: a 404 from the
     wrong host was captured and pinned as "expected behavior for
     Dataverse-only environments." The validated tier was supposed to
     prevent guessing, but a misdiagnosed capture locks in the wrong
     contract with full ceremony.
   - Adds four sanity-check rules: cite vendor docs; re-capture against
     a second tenant; treat 4xx-on-happy-path as a red flag; don't pin
     a 404 regression test before verifying the endpoint is correct.

3. **New paragraph "validatable validates shape, not values"**
   - Lives under "If the tier is validatable."
   - Documents the failure mode that produced bug #3: the Graph CSDL
     told us subscribedSkus.skuPartNumber is Edm.String but said nothing
     about case-of-the-day or bundle expansion. The validatable tier
     guarantees shape, not vocabulary.
   - Requires vendor-published reference list citation
     (licensing-service-plan-reference), case-insensitive matching by
     default, bundle-expansion awareness, and a unit test that loads
     the reference table.

4. **Annotated runner.env_id table row**
   - Existing row said "Power Platform (BAP) environment ID" - true but
     easy to read past. Now explicitly says NOT the Dataverse OrgId,
     names the derivation pattern, and cross-references the new
     identity-gotchas section.

5. **New "step 5a. End-to-end smoke against a real tenant"**
   - Required PR-description content for any new check that calls a
     host/audience pair not covered by an existing runner.<client>
     method: the exact cli command, the output line showing the new
     checkpoint firing with a terminal status (NOT SKIPPED), and
     confirmation the run was against a real tenant.
   - Mocks validate logic; live runs validate that the endpoint exists
     and the gate chain feeds the check. PR #87's three bugs all had
     passing unit tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@nehaoss nehaoss left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent post-mortem documentation — this is the kind of institutional knowledge that prevents repeat mistakes. Comments:

  1. Power Platform identity gotchas section: The host ↔ audience table is extremely valuable. The \�pi.flow.microsoft.com\ vs \service.powerapps.com\ token mismatch being called out as 'the classic symptom' is exactly the kind of pattern documentation that saves debugging time.

  2. Cassette sanity-checking section: Strong guidance. The requirement to 'cite the vendor doc URL' and 're-capture against a second tenant' for anomalous statuses directly addresses the root cause of PR #87's issues. The principle '4xx from a happy-path call is a red flag, not a feature' is well stated.

  3. \�alidatable\ validates shape, not values: The SKU matching example (Teams delivered via O365_BUSINESS_PREMIUM, ENTERPRISEPACK, etc.) is concrete and actionable. Good that it explicitly calls out case-insensitive matching.

  4. End-to-end smoke requirement (5a): The requirement that SKIPPED ≠ exercised is an important distinction. One suggestion: consider specifying a minimum bar for what 'real tenant' means — e.g., should it be a non-production tenant to avoid accidental impact, or any tenant with the relevant resources?

  5. runner.env_id clarification: The expanded description in the Runner attributes table is a high-value one-line change that prevents future confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants