Skip to content

Vectorize _sort_and_stride 3D reindex, drop deepcopy (#3381)#3382

Merged
brendancol merged 1 commit into
mainfrom
deep-sweep-performance-zonal-2026-06-17
Jun 18, 2026
Merged

Vectorize _sort_and_stride 3D reindex, drop deepcopy (#3381)#3382
brendancol merged 1 commit into
mainfrom
deep-sweep-performance-zonal-2026-06-17

Conversation

@brendancol

Copy link
Copy Markdown
Contributor

Closes #3381

What changed

  • The 3D branch of _sort_and_stride deep-copied the entire values array and then reindexed each layer by sorted_indices in a Python loop. It now does the reindex with one vectorized fancy-index, values.reshape(n, -1)[:, sorted_indices]. Fancy indexing returns a fresh array, so the input is never mutated and the explicit copy is gone.
  • Removed the now-unused import copy.

This path runs in crosstab() on the numpy backend and once per block on dask+numpy, so the redundant full-array copy was paid per block at scale. The cupy backend rejects 3D input, so it is unaffected.

Backend coverage

numpy, dask+numpy (the only backends that reach the 3D _sort_and_stride path).

Test plan

  • New test_sort_and_stride_3d_no_mutation_and_parity pins input non-mutation and layer-wise sort parity.
  • Existing 3D crosstab tests pass (test_crosstab_3d_count, test_crosstab_3d_agg_method, test_nodata_values_crosstab_3d), all of which already assert the input is unmodified.
  • Full test_zonal.py (194) and test_zonal_backend_coverage_2026_05_27.py (43) pass.

The 3D branch of _sort_and_stride deep-copied the whole values array
then reindexed each layer by sorted_indices in a Python loop. Replace
both with a single vectorized fancy-index that returns a fresh array,
so the input is never mutated and no explicit copy is needed. Removes
the now-unused `import copy`.

Runs in crosstab() on numpy and once per block on dask+numpy; cupy
rejects 3D input so it is unaffected.

Adds test_sort_and_stride_3d_no_mutation_and_parity pinning input
non-mutation and layer-wise sort parity.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label Jun 17, 2026

@brendancol brendancol left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Vectorize _sort_and_stride 3D reindex, drop deepcopy (#3381)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

  • test_zonal.py (new test_sort_and_stride_3d_no_mutation_and_parity): the parity assertion recomputes expected with the same expression the implementation now uses (values.reshape(n, -1)[:, sorted_indices]), so it is not an independent oracle. The non-mutation assertion also would not have failed against the old deepcopy code, since the old path mutated a copy rather than the input. The test still earns its place as a guard against a future in-place-mutation or shape regression, and test_crosstab_3d_count / test_crosstab_3d_agg_method already provide the independent reference-value check. No change needed, just noting the test's scope.

What looks good

  • The replacement is equivalent: reshape returns a view when the array is contiguous, and the fancy-index [:, sorted_indices] always allocates a fresh array, so the input is never touched (confirmed empirically).
  • The deepcopy and the Python row loop both collapse into one vectorized op.
  • Removing the now-unused import copy is the right cleanup.
  • Backend scope is correct: only numpy and dask+numpy reach the 3D branch; cupy rejects 3D upstream, so nothing GPU-side is affected.
  • Existing 3D crosstab tests already assert the input is unmodified and check against reference values.

Checklist

  • Algorithm matches reference: n/a (mechanical refactor, output identical)
  • All implemented backends consistent: yes (numpy + dask+numpy share the helper; cupy unaffected)
  • NaN handling correct: unchanged (NaN filtering happens downstream of this helper)
  • Edge cases covered: yes (existing 3D tests + new unit test)
  • Dask chunk boundaries handled: n/a (per-block helper, no chunk logic changed)
  • No premature materialization or unnecessary copies: improved (one full-array copy removed)
  • Benchmark exists or not needed: existing benchmarks/benchmarks/zonal.py covers stats and regions; crosstab 3D is unbenchmarked but this is a refactor, not a new function
  • Docstrings present and accurate: n/a (private helper)

@brendancol brendancol merged commit 1639da1 into main Jun 18, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

zonal: redundant deepcopy and Python loop in _sort_and_stride 3D branch

1 participant