perf(shapes): faster datashader circle rendering + matplotlib-fidelity fixes #729

Merged
timtreis merged 18 commits into main from perf/shapes-circle-rendering on Jun 21, 2026

@timtreis timtreis commented Jun 20, 2026

Member

No description provided.

@timtreis timtreis force-pushed the perf/shapes-circle-rendering branch 2 times, most recently from 2fbec2c to ce587c6 on June 20, 2026 15:05
Circles (Point+radius) were buffered to polygons at shapely's default resolution=16
(65 vertices/circle) before datashader rasterization. For large circle sets this
coordinate explosion dominates the render (buffer + per-vertex transform + polygon
aggregation), e.g. ~5.9M coords for 91k circles.

Choose the buffer resolution from the largest disc's on-screen pixel radius
(_circle_buffer_quad_segs / _circle_quad_segs): 4 segments/quadrant for small discs
(<=8px, where extra vertices are sub-pixel), 8 (<=32px), and shapely's full 16 once
discs are large enough to show facets. The result stays faithful (IoU >=0.98 vs the
65-vertex circle) and handles per-circle varying radii.

End-to-end on Visium HD (single coordinate system): 91k circles 2.0s->1.5s,
352k circles 8.3s->4.9s.

Note: shifts the datashader-circle visual baselines (17- vs 65-vertex circles);
regenerate those from CI artifacts.
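
The tiering described in this commit message can be sketched in a few lines. This is a minimal illustration, not the PR's `_circle_buffer_quad_segs`: the names `pick_quad_segs` and `ring_vertices` are hypothetical, and the vertex count assumes a buffered point yields a closed ring of 4 * quad_segs + 1 coordinates, which is consistent with the 17- and 65-vertex figures quoted above.

```python
def pick_quad_segs(max_pixel_radius: float) -> int:
    """Segments per quadrant for the largest disc's on-screen radius (pixels).

    Hypothetical helper; thresholds are the ones quoted in the commit message.
    """
    if max_pixel_radius <= 8:
        return 4   # extra vertices would be sub-pixel
    if max_pixel_radius <= 32:
        return 8
    return 16      # shapely's default, kept once facets could become visible


def ring_vertices(quad_segs: int) -> int:
    """Coordinates in the closed exterior ring of a buffered point (assumed)."""
    return 4 * quad_segs + 1


n_circles = 91_000
for radius_px in (0.5, 20, 100):
    qs = pick_quad_segs(radius_px)
    print(radius_px, qs, ring_vertices(qs), n_circles * ring_vertices(qs))
```

Under these assumptions, 91k circles expand to 5,915,000 coordinates at the default resolution (the ~5.9M in the message) and to 1,547,000 at 4 segments per quadrant.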
@timtreis timtreis force-pushed the perf/shapes-circle-rendering branch from ce587c6 to b6927ac on June 20, 2026 15:10
@timtreis timtreis closed this Jun 20, 2026
@timtreis timtreis reopened this Jun 20, 2026
codecov-commenter commented Jun 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.38%. Comparing base (480d9f0) to head (ae5b7fa).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #729      +/-   ##
==========================================
+ Coverage   79.21%   79.38%   +0.17%     
==========================================
  Files          17       17              
  Lines        4566     4604      +38     
  Branches     1026     1031       +5     
==========================================
+ Hits         3617     3655      +38     
  Misses        599      599              
  Partials      350      350              
Files with missing lines Coverage Δ
src/spatialdata_plot/pl/_datashader.py 88.04% <100.00%> (+0.50%) ⬆️
src/spatialdata_plot/pl/basic.py 82.81% <ø> (ø)
src/spatialdata_plot/pl/render.py 89.54% <100.00%> (+0.25%) ⬆️

@timtreis timtreis force-pushed the perf/shapes-circle-rendering branch from b6927ac to bb22094 on June 20, 2026 15:26
On the datashader backend, buffering every circle (Point+radius) to a polygon dominates
the render for large sets. A large (>50k), uniform-radius, outline-free circle element is
a dot-field where a filled disc and a spread point are visually equivalent, so rasterize
centroids as radius-faithful points (_circles_render_as_points gate + radius-aware spread
in _datashader_points) instead of buffering. This is datashader-backend behavior; use
method="matplotlib" for a pixel-exact rendering. Per-circle varying radii, outlines, and
custom shapes keep the polygon path.

`as_points` stays a simple bool (style: dots vs geometry) and is itself a speedup on both
backends; it is orthogonal to this datashader optimization.

Adds a 2x2 visual test (geometry/as_points x matplotlib/datashader) verifying the four
render paths look alike, incl. the datashader fast-path matching exact matplotlib discs.

End-to-end on Visium HD (single coordinate system): 91k circles 6.85s->0.56s vs v0.4.0,
352k circles 22.95s->1.14s, 5.5M circles (002um) impractical->~12.7s; method="matplotlib" stays exact.
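
A rough sketch of the kind of gate this commit describes (large, uniform-radius, outline-free). It is not the PR's `_circles_render_as_points`: the function name, signature, and tolerance are hypothetical, and the custom-shape check mentioned in the message is left out.

```python
import math
from typing import Sequence

MIN_CIRCLES = 50_000  # the ">50k" threshold quoted in the commit message


def can_render_as_points(
    radii: Sequence[float],
    has_outline: bool,
    min_count: int = MIN_CIRCLES,
) -> bool:
    """Hypothetical gate: True only for a large, uniform-radius, outline-free set."""
    if has_outline or len(radii) <= min_count:
        return False  # outlines and small sets keep the polygon path
    first = radii[0]
    if not math.isfinite(first) or first <= 0:
        return False
    # NaN never compares close to anything, so a single NaN radius fails the gate.
    return all(math.isclose(r, first, rel_tol=1e-9) for r in radii)


uniform = [55.0] * 60_000
print(can_render_as_points(uniform, has_outline=False))
print(can_render_as_points(uniform[:-1] + [60.0], has_outline=False))
```

Anything that fails the gate would fall back to buffering, which matches the message's statement that varying radii and outlines keep the polygon path.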
@timtreis timtreis force-pushed the perf/shapes-circle-rendering branch from bb22094 to 68cc388 on June 20, 2026 15:33
timtreis added 15 commits June 20, 2026 17:43
Adds 2x2 visual tests mirroring the circle one, for the other elements whose render
paths split by backend / as_points, so divergence or breakage across paths is visible:

- points:   (no color / continuous) x (matplotlib / datashader)
- labels:   (fill / as_points)       x (matplotlib / datashader)
- polygons: (geometry / as_points)   x (matplotlib / datashader)

Colorbars disabled and marker sizes bumped so panels stay non-degenerate; each panel is
titled with its (mode x backend) combination. New baselines to be generated from CI.

Generated from the py3.11-stable CI artifact. Only the 4 new 2x2 permutation tests
needed baselines; the existing datashader-circle baselines stayed within tolerance under
the adaptive-buffer change.

PlotTester.compare() force-shrinks every figure to a 400x300 / 5x3.75in thumbnail.
matplotlib scatter markers are point-sized (absolute) and don't shrink with the squished
axes, while datashader as_points (a data-coordinate raster) and the geometry do — so the
as_points/matplotlib panel rendered ~1.6x oversized vs the other three. Rendering the 2x2
grids at the harness canvas size/dpi makes compare()'s resize a no-op, so the point-sized
scatter and the data-coordinate paths stay consistent. (Not a library bug: at a native
render size mpl and datashader as_points agree.)
…ender

Re-rendered at the harness canvas size: the as_points grids (circle/polygon/labels) now show
matplotlib and datashader at matching sizes. The points grid retains the known render_points
matplotlib-vs-datashader marker-size difference (looser sqrt(s)*dpi/100 spread calibration).
…plotlib

render_points sized its datashader canvas to the full figure (fig.get_size_inches()*dpi),
so when the axes was a subplot the raster was built at figure resolution then imshow'd into
the smaller axes and the markers shrank — matplotlib-vs-datashader point sizes diverged by
the axes/figure ratio (e.g. ~1.8x in a 2x2 grid; they agreed only when the axes filled the
figure). It also used a looser sqrt(size)*dpi/100 spread vs matplotlib's exact sqrt(size)*
dpi/144 marker radius.

Size the points canvas to the axes box (ax.get_window_extent(), as the as_markers/as_points
path already did) and use the /144 marker-radius formula for both paths. The datashader dot
now matches the matplotlib scatter marker by construction, in any layout. Degenerate-extent
handling (single/coincident points) is preserved.

Permutation-grid sizes set to non-overlapping values (circle 25, points 30) now that the
mpl/datashader match is structural rather than tuned.
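
The /144 factor follows from unit conversion. The sketch below assumes the usual reading of matplotlib's scatter `s` as a marker area in points squared (so a circular marker is sqrt(s) points across) and 72 points per inch; the function names are hypothetical.

```python
import math

POINTS_PER_INCH = 72


def mpl_marker_radius_px(size: float, dpi: float) -> float:
    """On-screen radius of a scatter marker with area `size` in points^2 (assumed)."""
    diameter_pt = math.sqrt(size)
    # Half the diameter, converted from points to pixels: sqrt(size) * dpi / 144.
    return diameter_pt / 2 * dpi / POINTS_PER_INCH


def looser_spread_px(size: float, dpi: float) -> float:
    """The sqrt(size) * dpi / 100 calibration mentioned in the commit message."""
    return math.sqrt(size) * dpi / 100


size, dpi = 36, 100
print(mpl_marker_radius_px(size, dpi), looser_spread_px(size, dpi))
```

The two formulas differ by a constant factor of 144/100 = 1.44, independent of size and dpi.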
…c marker size

The render_points datashader marker-size fix (axes-box canvas + /144 spread) shifts all
render_points datashader baselines (point sizes now match matplotlib in any layout). Also
refreshes the points and circle permutation grids. Generated from py3.11-stable CI.
…ound

Two regressions from the earlier marker-size work, fixed:

1. render_points: sizing the datashader canvas to the axes box lowered its resolution, which
   changed point AGGREGATION (counts/reductions/density) — std/var grew spurious nonzero
   pixels, dots went blocky, colors shifted. Restore the figure-resolution canvas (aggregation
   identical to before) and instead scale only the marker spread by canvas/axes so dot size
   still matches matplotlib in any layout. Colors/aggregation unchanged; size deterministic.

2. circles: the adaptive quad_segs coarsened *visible* discs (they looked octagonal vs the
   matplotlib circles). Only coarsen sub-pixel discs (≤2px, where it's invisible and where the
   Visium HD speedup lives — HD spots are ~0.3-0.6px); any visible disc keeps the round default.

HD spots stay sub-pixel → quad_segs=4 → speedup preserved (91k still ~0.58s/CS).
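
A minimal sketch of the conservative rule in item 2, under stated assumptions: the on-screen radius is taken as the data radius times canvas pixels per data unit, and both function names and the example numbers are made up for illustration.

```python
def pixel_radius(radius_data: float, extent_data: float, canvas_px: int) -> float:
    """Disc radius in canvas pixels for a canvas spanning `extent_data` data units."""
    return radius_data * canvas_px / extent_data


def conservative_quad_segs(max_pixel_radius: float) -> int:
    """Coarsen only sub-pixel discs (<=2px); any visible disc keeps the default 16."""
    return 4 if max_pixel_radius <= 2 else 16


# Illustrative numbers only: a 10,000-unit extent drawn on a 1,000-pixel canvas.
for radius in (4.0, 100.0):
    px = pixel_radius(radius, extent_data=10_000, canvas_px=1_000)
    print(radius, px, conservative_quad_segs(px))
```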
The earlier attempts to make render_points datashader markers match matplotlib in multi-panel
layouts all regressed real rendering: the axes-box canvas changed point aggregation (std/var
gained spurious values, dots went blocky, colors shifted); the canvas/axes spread scaling
overshot when a legend shrank the axes; and the data->display transform isn't valid at render
time (axis limits not yet set). render_points single-panel rendering already matches matplotlib
(~0.95); the multi-panel difference is the figure-vs-axes raster scale, compounded by the test
harness squishing figures to a 400x300 thumbnail.

Per 'be accurate in real plotting; note and ignore harness artifacts': revert render_points to
its original (correct) sizing, restore its baselines, and document the grid caveat. Keep the
circle work (Phase 1/2 + conservative quad_segs so visible circles stay round).
…original sizing

render_points reverted to its original sizing, so the grid baseline (previously the broken
axes-box version) is regenerated. The documented multi-panel/harness size difference between
the matplotlib and datashader columns is expected; single-panel rendering matches.

Datashader markers shrank in multi-panel subplots: the spread radius used
sqrt(size)*dpi/100 on a figure-resolution canvas, so the on-screen size scaled
with axes_window/figure and halved in a 2x2 grid. Rescale the spread by the
ratio of the axes-box factor to the canvas factor so the displayed radius stays at the matplotlib
marker radius (sqrt(size)*dpi/144) in any layout. Unifies the render_points and
as_markers paths (ratio is 1 for the axes-box canvas) and drops the 144-vs-100
split. Aggregation canvas is unchanged, so std/var/count are unaffected.
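
One way to read the rescaling, as a sketch rather than the PR's code: if the raster is built at canvas resolution and then displayed in a smaller axes box, every feature shrinks by axes/canvas, so the spread applied on the canvas has to be enlarged by the inverse. The function names and the use of pixel widths as the scale measure are assumptions.

```python
import math


def marker_radius_px(size: float, dpi: float) -> float:
    """Target on-screen radius: sqrt(size) * dpi / 144."""
    return math.sqrt(size) * dpi / 144


def canvas_spread_px(size: float, dpi: float,
                     canvas_width_px: float, axes_width_px: float) -> float:
    """Spread radius on the aggregation canvas that displays at the target radius."""
    return marker_radius_px(size, dpi) * canvas_width_px / axes_width_px


def displayed_radius_px(spread_px: float,
                        canvas_width_px: float, axes_width_px: float) -> float:
    """Radius after the canvas raster is drawn into the axes box."""
    return spread_px * axes_width_px / canvas_width_px


size, dpi = 36, 100
for canvas_w, axes_w in ((800, 800), (800, 400)):  # single panel vs. half-width axes
    spread = canvas_spread_px(size, dpi, canvas_w, axes_w)
    print(canvas_w, axes_w, spread, displayed_radius_px(spread, canvas_w, axes_w))
```

When the canvas already matches the axes box the ratio is 1 and the spread equals the marker radius, which is the as_markers case the message mentions. The aggregation canvas itself is untouched in this scheme; only the spread radius changes.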
… size

Datashader markers now match matplotlib in any panel layout. Three baselines
shifted (multi-panel grid, multi-panel groups/na_color, and the dpi size-agree
test); all single-panel datashader baselines stayed within tolerance and
shapes/labels centroid baselines are unchanged (axes-box ratio is 1).
…radius

Two pre-existing datashader fidelity issues exposed by the render-permutation
grids:

1. render_points continuous color defaulted to reduction "sum", which inflates
   the normalization range where dots overlap and pushes single points to the
   dark end of the colormap (datashader looked nothing like matplotlib). Switch
   the default to "max" (each pixel shows its own value, matching matplotlib and
   the as_points path). The spread step also has to follow the *resolved*
   reduction: it defaulted to "add" for ds_reduction=None, summing overlapping
   dilated dots and undoing the "max" aggregate. The spread's `how` argument now uses
   `ds_reduction or default_reduction`.

2. as_points=True on uniform-radius circles now sizes the datashader dots to the
   true disc radius, so they match the geometry render (and the circle
   fast-path). The matplotlib backend keeps the marker `size` (scatter markers
   are display-sized, not data-sized) — documented as an expected backend
   difference.
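
A toy model of item 1, not datashader code: pixels are dictionary keys, normalization is a simple divide-by-maximum, and all names are hypothetical. It shows why "sum" pushes an isolated point toward the low end of the colormap while "max" does not, and how the `or` fallback resolves an unset reduction.

```python
from collections import defaultdict

# (pixel, value) pairs: five dots overlap in pixel "a", one dot sits alone in "b".
points = [("a", 0.9)] * 5 + [("b", 0.9)]


def aggregate(points, reduce):
    """Group values per pixel and reduce each group with `reduce` (sum or max)."""
    per_pixel = defaultdict(list)
    for pixel, value in points:
        per_pixel[pixel].append(value)
    return {pixel: reduce(values) for pixel, values in per_pixel.items()}


def normalized(agg):
    """Scale every pixel by the largest aggregated value (simplified normalization)."""
    top = max(agg.values())
    return {pixel: value / top for pixel, value in agg.items()}


def resolve_reduction(ds_reduction, default_reduction="max"):
    """Reduction the spread step should follow when the user passed None."""
    return ds_reduction or default_reduction


by_sum = normalized(aggregate(points, sum))
by_max = normalized(aggregate(points, max))
print(by_sum["b"], by_max["b"], resolve_reduction(None))
```

With "sum", the lone point in pixel "b" lands at roughly a fifth of the normalized range even though its value equals that of every other point; with "max" it sits at the top, like its overlapping neighbours.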
…radius

Continuous datashader points now use the "max" reduction (full colormap range
instead of sum-darkened), and uniform-circle as_points dots are sized to the
true radius on the datashader backend. Regenerate the six affected continuous
point baselines and the shapes as_points datashader baseline; clarify the
as_points test docstrings (matplotlib stays size-based, an expected backend
difference).

Drop the datashader-only circle-radius override for as_points: it made the same
render_shapes(as_points=True, size=...) call diverge between backends (datashader
discs vs matplotlib markers). as_points is a size-controlled speedup; both
backends now use the marker size, matching each other (as the polygon
permutation grid already demonstrates). Restores the pre-override as_points
datashader baseline. Keeps the layout-invariant marker-size fix and the faithful
continuous-color reduction, which are what make the backends agree.

- _datashader_points: change the default_reduction default from "sum" to "max" to match both call
  sites (removes a latent footgun: a future caller omitting it would silently
  re-inflate continuous color).
- Drop the duplicate ax.get_window_extent() in the marker-spread branch; make
  the factor==factor_axesbox (ratio 1) identity explicit for as_markers.
- Trim the one-liner helper docstrings/comments to the load-bearing why.
- Strengthen tests: gate test covers NaN radius; fast-path test spies the
  centroid renderer to prove the fast path actually fired (not just "an image");
  soften the layout-invariance docstring to the real <1px guarantee.
@timtreis timtreis changed the title from "perf(shapes): faster circle rendering on the datashader backend" to "perf(shapes): faster datashader circle rendering + matplotlib-fidelity fixes" on Jun 20, 2026
- Extract _affine_major_scale() for the SVD major-axis stretch duplicated by
  the fast-path and _circle_buffer_quad_segs.
- Fast-path: coerce only the first radius value (gate guarantees uniform+finite)
  instead of re-coercing the whole column — drops an O(n) pass at HD scale.
- Drop a comment that restated the adjacent log line; drop a redundant bool().
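
For intuition on the "SVD major-axis stretch": under an affine transform a circle becomes an ellipse, and its longest semi-axis is the radius times the largest singular value of the transform's 2x2 linear part. The sketch below is not the PR's `_affine_major_scale`; it swaps the SVD call for the equivalent closed form for 2x2 matrices so that it needs only the standard library, and the function name and argument layout are hypothetical.

```python
import math


def affine_major_scale(a: float, b: float, c: float, d: float) -> float:
    """Largest singular value of the linear part [[a, b], [c, d]].

    Computed as the square root of the largest eigenvalue of M @ M.T,
    which for a 2x2 matrix has a closed form.
    """
    p = a * a + b * b          # (M @ M.T)[0, 0]
    q = c * c + d * d          # (M @ M.T)[1, 1]
    r = a * c + b * d          # (M @ M.T)[0, 1]
    largest_eig = (p + q + math.sqrt((p - q) ** 2 + 4 * r * r)) / 2
    return math.sqrt(largest_eig)


theta = math.radians(30)
print(affine_major_scale(2, 0, 0, 3))  # anisotropic scaling
print(affine_major_scale(math.cos(theta), -math.sin(theta),
                         math.sin(theta), math.cos(theta)))  # pure rotation
```

A pure rotation leaves the stretch at 1, while anisotropic scaling returns the larger of the two scale factors, which is the conservative choice when sizing the largest on-screen disc.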
@timtreis timtreis merged commit 078afb1 into main Jun 21, 2026
7 of 8 checks passed
@timtreis timtreis deleted the perf/shapes-circle-rendering branch June 21, 2026 00:08
@timtreis timtreis restored the perf/shapes-circle-rendering branch June 21, 2026 11:19
@timtreis timtreis deleted the perf/shapes-circle-rendering branch June 21, 2026 11:21
2 participants