Skip to content

flightcheck: port BL-014 network validator + Workday REST diagnostic + contributor guardrails#94

Open
johnguy0 wants to merge 3 commits into
mainfrom
users/johnguy0/port-preflight-validator-may-2026
Open

flightcheck: port BL-014 network validator + Workday REST diagnostic + contributor guardrails#94
johnguy0 wants to merge 3 commits into
mainfrom
users/johnguy0/port-preflight-validator-may-2026

Conversation

@johnguy0

Copy link
Copy Markdown
Collaborator

Description

Port of 5 commits from JohnQuinn-Dev/ess-preflight-validator (since 2026-05-01) into this repo's Python FlightCheck framework.

Source commits → ported as:

Source SHA Source title Lands here as
9ed2055 feat: BL-014 Network Connectivity validator + integration selection Commit 1checks/network.py, checks/firewall_export.py, config/required-endpoints.json, lazy-auth gate in cli.py, new "Vendor TCP/HTTPS reachability" tier-registry row
5eb19bc + 44ac58d Add REST Endpoints Pre-flight Validation Script (PR #3 by @is-goutham) Commit 2scripts/diagnostics/test_workday_rest_endpoints.py (standalone Python OAuth diagnostic) + WD-REST-MANUAL checklist entry in checks/workday.py + tier-registry note
70065b9 Add SSO test flow template (WIP) Commit 2 — vendored to src/reference/workday-sso-test-flow/ with adapted README
2ac16e9 Add contributor guardrails: PR template + pre-PR checklist Commit 3 — merged into existing .github/PULL_REQUEST_TEMPLATE.md + CONTRIBUTING.md, translated to this repo's conventions

Related issue

Refs the BL-014 backlog item from the source repo (no equivalent issue tracked here yet — happy to file one if maintainers want it for changelog purposes).

Type of change

  • New feature (non-breaking change which adds functionality)
  • Documentation update

Testing

pytest tests/ -q     → 133 passed, 8 skipped (baseline was 103; +30 new tests, 0 regressions)

New test modules:

  • tests/flightcheck/checks/test_network.py (15 tests covering the 6 failure modes + selection/config edge cases)
  • tests/flightcheck/checks/test_firewall_export.py (7 tests, golden-file markdown render)
  • tests/flightcheck/test_cli_lazy_auth.py (10 tests pinning per-scope auth requirement)

Smoke-tested in a temp working dir with a minimal .local/config.json:

  • cli.py --export-firewall-requirements → writes workspace/flightcheck/firewall-requirements.md, no Microsoft auth attempted
  • cli.py --scope network → runs with Skipping Microsoft auth (not required for --scope network) banner, executes probes
  • diagnostics/test_workday_rest_endpoints.py --help → parses cleanly

Full E2E run of the Workday REST diagnostic requires a customer Workday tenant + an OAuth API Client, which is the documented design — see scripts/diagnostics/README.md.

Key design decisions

After the rubber-duck pass caught 10 issues in the initial plan, the final design:

  1. Workday REST went to a standalone diagnostic, not an in-runner FlightCheck check. The source uses OAuth 2.0 Authorization Code with a browser-opened login — that's Workaround This repo is missing a LICENSE file #1 explicitly rejected for runtime FlightCheck checks in tests/fixtures/cassettes/INDEX.md's "Workday WQL config-validation pattern" section ("doesn't escape the chicken-and-egg"). The source repo didn't solve this either — it accepts the customer doing the OAuth API Client registration as a prerequisite. Faithful port = same interactive UX, just in Python.
  2. Network reachability is the exception to the cardinal rule. TCP/HTTPS reachability probes don't consume vendor API response contracts, so the validated/validatable/documented tier system doesn't bind. Documented this explicitly with a new "Vendor TCP/HTTPS reachability" row in the API tier registry so future readers don't have to re-derive the exemption.
  3. Lazy-auth gate via _requires_microsoft_auth(scope). Without it, --scope network would trigger an interactive MSAL prompt before any probe ran — defeating the shift-left value. Mirrors the existing PVA scope gate at cli.py:162 (which is the precedent).
  4. Injectable TcpProber / HttpsProber protocol, not responses/respx. Those libraries only mock HTTP-level traffic and can't fake socket.create_connection refusal. A protocol abstraction is the cleanest testable seam.
  5. Uses requests (not httpx). httpx is NOT in solutions/ess-maker-skills/scripts/requirements.txt — verified before coding. requests is, so we stay on it.
  6. Default OAuth UX is paste-the-redirect-URL. Spinning up a TLS-terminating localhost server would require shipping a self-signed cert (trust-store friction). The source PowerShell does paste; we match. Optional --listen flag activates a stdlib http.server loopback only when --redirect-uri is http://localhost... (advanced customers who registered an HTTP redirect URI).
  7. WD-REST-MANUAL is conditional. Only fires when _workday_flows was already detected — non-Workday customers see no noise.
  8. Secrets and PII hygiene. OAuth client secret / authorization code / access token / refresh token are never logged. Token-endpoint errors reduced to status + error class. Employee PII redacted in JSON output by default; --include-pii opt-in.

Pre-PR Checklist (REQUIRED)

  • Rebased on latest main — branched from 887dd41, no rebase needed
  • Files changed tab reviewed — 13 files (8 new, 5 modified); no accidental deletions
  • Local lint / tests passpytest tests/ -q clean; cli.py --help parses
  • No references to files outside the repo — N/A
  • Defaults match repo conventions — output paths default to workspace/flightcheck/; write tests in the Workday REST diagnostic are opt-in via --include-write-tests
  • FlightCheck integration — new network scope wired into SCOPE_MAP and FULL_SCOPE; tests under tests/flightcheck/checks/; new test_cli_lazy_auth.py pins the scope-to-auth mapping
  • API tier registry honored — new "Vendor TCP/HTTPS reachability" row added with rationale; existing Workday WQL/REST row updated with the standalone-diagnostic note
  • Docs updatedtests/fixtures/cassettes/INDEX.md, new scripts/diagnostics/README.md, new src/reference/workday-sso-test-flow/README.md, cli.py module docstring scopes list

Checklist

  • My code follows the existing style
  • I have added/updated tests where applicable
  • I have updated documentation as needed

Notes for reviewer

  • The 3 commits are independently reviewable in order. Commit 1 (network + lazy-auth) is the only one that touches cli.py; commit 2 is the largest by line count (mostly the diagnostic + its README); commit 3 is small docs-only.
  • The new WD-REST-MANUAL NotConfigured checkpoint will show up in any FlightCheck run scope that includes Workday — by design (it's a deployment checklist item). Let me know if you'd rather scope it tighter (e.g. only --scope publishing).
  • Source's CHANGELOG.md / VERSION.txt housekeeping from commit 9ed2055 is out of scope (this repo isn't versioned the same way; no top-level CHANGELOG). Happy to add one if maintainers want it.
  • Source's ESS-Validator-Session.ps1 interactive integration-selection menu is replaced by the config-driven network.integrations / network.servicenow_instance keys in .local/config.json — fits this repo's non-interactive CLI model.

johnguy0 and others added 3 commits May 20, 2026 01:24
…y-auth scope gate

Port of ess-preflight-validator commit 9ed2055 (PowerShell
Test-NetworkConnectivity.ps1 + Export-FirewallRequirements.ps1) into the
Python FlightCheck framework as in-runner checks.

What's new
----------
- NET-001 / NET-002 / NET-003 checkpoints for Workday / ServiceNow / SAP
  SuccessFactors outbound TCP+HTTPS reachability. Catches missing
  firewall allow-list entries, SSL inspection, and proxy interference
  WEEKS before deployment cutover, when corporate firewall change
  requests still have time to land.
- New `--scope network` and standalone `--export-firewall-requirements`
  modes on the FlightCheck CLI.
- Vendored `required-endpoints.json` (Workday + ServiceNow + SAP only;
  Microsoft endpoints intentionally excluded, with links to Microsoft's
  authoritative allowlist docs from the JSON).

Architecture notes
------------------
- New `_requires_microsoft_auth(scope)` gate in `cli.py` mirrors the
  existing PVA scope gate so `--scope network` and
  `--export-firewall-requirements` skip Dataverse / Graph / Power
  Platform Admin auth entirely. Otherwise the new transport-only check
  would still trigger an interactive MSAL prompt before any probe ran,
  defeating the shift-left value of BL-014.
- The check uses an injectable `TcpProber` / `HttpsProber` protocol;
  production uses stdlib `socket.create_connection` + `requests` HEAD,
  tests substitute deterministic fakes. `responses` and `respx` can't
  fake TCP socket refusal, so a probe-shaped abstraction is the
  cleanest seam.
- Concurrent probes via `ThreadPoolExecutor` (default 8) so a single
  silently-dropped packet doesn't serialize the whole check behind 5s
  timeouts.
- Integration selection via `runner.config['network']`:
    network.integrations: ["Workday", "ServiceNow", ...]   # opt-in
    network.servicenow_instance: "<instance>"               # required
                                                              # for SN
  No `workday_tenant` knob (Workday hostnames are data-center based,
  not tenant-prefixed, per the JSON's `_hostingNote`).
- The cardinal rule (every external API call needs a validated /
  validatable / documented mock) does NOT bind transport-level probes
  that don't consume vendor API response contracts. A new "Vendor
  TCP/HTTPS reachability" row in the API tier registry documents this
  exception explicitly so future readers don't have to re-derive it.

Testing
-------
- 22 new tests across the 6 failure branches (REFUSED, TIMEOUT,
  DNS_FAILURE, TLS_ERROR, HTTP_5XX, HTTP_4xx-still-reachable) plus
  selection / config edge cases for `network.py`.
- Golden-file render tests for `firewall_export.py` against a fixed
  timestamp.
- `test_cli_lazy_auth.py` pins `_requires_microsoft_auth` per-scope to
  prevent future scopes from silently regressing the no-auth path.

Verified
--------
- `pytest tests/ -q` → 125 passed, 8 skipped (was 103 baseline; +22 new)
- `cli.py --export-firewall-requirements` writes markdown with no MS auth
- `cli.py --scope network` runs with "Skipping Microsoft auth (not
  required for --scope network)" banner; probes execute end-to-end

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…erence

Port of ess-preflight-validator commits 70065b9 (SSO test flow
template) and 5eb19bc / 44ac58d (Test-WorkdayRESTEndpoints.ps1).

Why a standalone diagnostic instead of an in-runner FlightCheck check
--------------------------------------------------------------------

Workday REST endpoints accept only OAuth 2.0 bearer tokens, and
obtaining one requires a customer-registered Workday OAuth API Client.
That's the same chicken-and-egg problem documented in
`tests/fixtures/cassettes/INDEX.md` under "Workday WQL config-validation
pattern" — validating the customer set up Workday correctly is hard
to automate because the validator itself needs the same kind of setup.

The source PowerShell repo accepts this and ships an interactive
customer-run script that prompts for the API Client credentials and
opens a browser for the Authorization Code flow. We do the same here in
Python so the runner's cassette-backed tier system stays intact: this
diagnostic lives outside the FlightCheck runner, and the runner
surfaces it via a single `WD-REST-MANUAL` `NotConfigured` checkpoint
in `checks/workday.py` that fires ONLY when Workday is configured for
the agent (so non-Workday customers see no noise).

What's new
----------
- `scripts/diagnostics/test_workday_rest_endpoints.py` — Python OAuth
  Authorization Code diagnostic for all 9 ESS Workday REST connector
  actions. Same checkpoint IDs as the source PowerShell
  (`WD-REST-AUTH` / `WD-REST-ME` / `WD-REST-001`..`WD-REST-008`).
  Default UX is paste-the-redirect-URL (works with the conventional
  `https://localhost:8888/callback` redirect URI without shipping a
  self-signed TLS cert). Optional `--listen` flag spins up an HTTP
  loopback server when the redirect URI is `http://localhost` — for
  customers who registered an HTTP redirect URI specifically.
- `scripts/diagnostics/README.md` — full operator-facing setup guide.
- `scripts/flightcheck/checks/workday.py` adds the `WD-REST-MANUAL`
  checklist entry surfacing the diagnostic to operators.
- `src/reference/workday-sso-test-flow/` — vendored Power Automate flow
  template that tests the OAuthUser Entra SSO connection via the
  `Get_Workers` SOAP operation. Reference content; not deployed by any
  script.
- API tier registry note in the existing Workday WQL/REST row
  documenting that the runner deliberately doesn't automate these 9
  endpoints and points readers at the standalone diagnostic.

Secrets and PII hygiene
-----------------------
- OAuth client secret, authorization code, access token, refresh token
  are NEVER logged. Token-endpoint error responses are reduced to
  status + error class so they can't leak the code or secret on
  failure. Mirrors the WS-Security UsernameToken redaction discipline
  in `checks/workday.py:_soap_call` (which has a CodeQL exemption row
  for the same reason — see commit 3ead8d7).
- `state` parameter generated via `secrets.token_urlsafe(24)` and
  verified on the callback. Rejects pasted URLs whose state doesn't
  match (catches CSRF + stale-paste mistakes).
- GetWorkerMe response PII (descriptor, primaryWorkEmail,
  businessTitle, primarySupervisoryOrganization.descriptor, raw WID)
  is redacted in the JSON output by default. `--include-pii` opt-in
  for in-tenant debugging.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Port of ess-preflight-validator commit 2ac16e9, adapted to this repo's
conventions.

What's new
----------
- `.github/PULL_REQUEST_TEMPLATE.md`: new "Pre-PR Checklist (REQUIRED)"
  section before the existing checklist.
- `CONTRIBUTING.md`: same checklist as a top-level subsection,
  cross-linked from / to the PR template.

Translations from the source's PowerShell-shaped checklist:
- "Rebased on latest `master`" → "Rebased on latest `main`"
- "Script runs without errors" → "`pytest tests/ -q` clean and
  `cli.py --help` parses"
- "`-OutputPath` defaults to `Desktop\\ESS-Reports\\…`" → "Output
  paths default to `workspace/flightcheck/…`"
- "Suite integration — new tests are wired into
  `Invoke-*ValidationSuite.ps1` as opt-in switches" → "FlightCheck
  integration — new checks are wired into `cli.py` `SCOPE_MAP` and
  `FULL_SCOPE`; tests live under `tests/flightcheck/checks/`"
- New line: "API tier registry honored — new external API calls
  reference the tier in `tests/fixtures/cassettes/INDEX.md`"
  (specific to this repo's cassette-tier discipline; not present in
  the source).

Why "rebased on latest `main`" matters: stale branches can silently
delete files added after the branch was cut. The source's callout
text is preserved (with the right repo's naming).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

def test_includes_all_endpoint_hosts(self, catalog_path: Path, tmp_path: Path) -> None:
text = _render(catalog_path, tmp_path)
assert "wd2-impl-services1.workday.com" in text
def test_includes_all_endpoint_hosts(self, catalog_path: Path, tmp_path: Path) -> None:
text = _render(catalog_path, tmp_path)
assert "wd2-impl-services1.workday.com" in text
assert "wd5.myworkday.com" in text
) -> None:
text = _render(catalog_path, tmp_path,
config={"network": {"servicenow_instance": "contoso"}})
assert "contoso.service-now.com" in text

# Confirm placeholder substitution happened.
sn_hosts = [call[0] for call in tcp.calls]
assert "contoso.service-now.com" in sn_hosts

@nehaoss nehaoss left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on the network validator port. The OAuth flow, PII redaction, auth gating, firewall export, and test coverage are well-structured. Found one counting bug in the HTTPS failure aggregation.

warning_count = sum(
1 for _, t, h in probe_results
if t.status == ProbeStatus.REACHABLE
and h.status in (ProbeStatus.HTTP_5XX, ProbeStatus.TLS_ERROR)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Incomplete HTTPS failure mode counting causes false 'Passed' status

warning_count only counts HTTP_5XX and TLS_ERROR HTTPS outcomes, but _RequestsHttpsProber.probe() can also return TIMEOUT (from RequestsTimeout, line ~147) and REFUSED (from RequestsConnectionError, line ~150).

When TCP succeeds but HTTPS returns either of these, the endpoint isn't counted in any of the three aggregation categories (reachable_count, warning_count, or failed_count), so the status falls through to Passed. The detail text does print [WARN] for these cases (line ~324), creating contradictory output.

Suggested fix:

and h.status in (ProbeStatus.HTTP_5XX, ProbeStatus.TLS_ERROR, ProbeStatus.TIMEOUT, ProbeStatus.REFUSED)

@nehaoss nehaoss left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR (2603 additions across 16 files) — porting a network validator + Workday REST diagnostic + contributor guardrails. Comments:

  1. Workday REST diagnostic standalone design: The rationale for placing this outside FlightCheck (chicken-and-egg OAuth problem) is well documented in the README. The decision to treat HTTP 400/422 as PASS for write endpoints is pragmatic — endpoint reachability is the goal, not business logic.

  2. Network check (
    etwork.py)
    : At 407 lines this is a substantial new check. Please confirm it's covered by the 381-line test file (\ est_network.py) — ratio looks healthy but verify all branches are exercised.


  3. equired-endpoints.json\ config
    : Externalizing the endpoint list as JSON is a good separation of concerns. Makes it easy for customers to audit/extend without reading Python.

  4. *\ est_cli_lazy_auth.py* (85 additions): Good to test that the CLI doesn't eagerly authenticate when running checks that don't need it — performance and UX improvement.

  5. Scope concern: 16 files across diagnostics, flightcheck checks, tests, fixtures, reference docs, and PR template changes. Consider whether this could be split into 2-3 smaller PRs for easier review (e.g., network validator separate from Workday REST diagnostic separate from contributor guardrails). If not, consider a more detailed PR description mapping each file to its purpose.

  6. PR template changes overlap with PR #105: Both PRs modify .github/PULL_REQUEST_TEMPLATE.md. Whoever merges second will have a conflict — coordinate merge order.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants