Stable sort for school escorting#1085
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves determinism and cross-platform consistency in the school escorting model by (1) enforcing stable sorting where ordering affects assignment/processing and (2) standardizing key ID/counter columns to 64-bit integers to avoid Windows 32-bit int pitfalls.
Changes:
- Add
kind="stable"to keysort_valuescalls to make tie-breaking deterministic. - Replace platform-native
astype(int)casts withastype("int64")for tour/person-related integer columns. - Apply stable ordering before downstream computations (e.g., pure escort tour construction and escort bundle ordering).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| activitysim/abm/models/util/school_escort_tours_trips.py | Uses int64 for ID-like columns and applies stable sorting when ordering impacts pure escort tour processing. |
| activitysim/abm/models/school_escorting.py | Applies stable sorts in participant numbering and bundle ordering; switches chauffeur-related numeric fields to int64. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
@americalexander Is this PR motivated by an actual finding of non-reproducibility, or just having discovered the theoretical potential of such? |
|
@jpn-- this is an actual problem, we found this independently when looking into unexpected differences in EET scenarios, let's discuss this in today's engineering meeting. I think it would be good to have a test here that shows how this manifests and how the new code fixes it. |
|
As Jan mentioned, this was something we discovered in practice. CCing @dhensle |
Address multiple locations where the school escorting model does not sort in a stable manner. Also use 64-bit integers for tour and person IDs.
Changes
Stable Sorting
Added
kind=""stable""tosort_valuescalls throughout the school escorting model to ensure deterministic, reproducible results regardless of the initial row ordering of input data. Without stable sorting, ties in the sort key can be broken arbitrarily, leading to non-reproducible outputs across runs or platforms.Affected locations:
determine_escorting_participants: stable sort of chaperones bychaperone_weight(descending) when assigningchaperone_num, and stable sort of escortees byage(ascending) when assigningescortee_numschool_escorting: stable sort of school escort tours byhousehold_idandschool_escort_directionbefore further processingcreate_pure_school_escort_tours: stable sort of pure escort tours byhousehold_id,person_id, andstarttime64-bit Integer Types
Replaced
.astype(int)with.astype(""int64"")for tour IDs, person IDs, and related integer columns. Using the platform-nativeinttype can produce 32-bit integers on Windows, which risks integer overflow for large ID values and inconsistent behavior across operating systems.Affected locations:
chauf_numandchauf_idincreate_school_escorting_bundles_tablejoin_attributestour_idincreate_chauf_escort_tripsnum_escorteesinprocess_tours_after_escorting_model