Optimize int add/sub for wide exact ints #151289

@KRRT7

CPython has a fast path for compact integers (those stored in a single internal digit) in binary add/sub, but wider exact ints still go through the generic long arithmetic path, even when both operands fit in int64_t.

This issue proposes adding a separate fast path for exact PyLong operands that fit in signed 64-bit integers, while preserving the existing compact-int path.
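To make the three categories concrete, here is a rough Python model of the distinction. It is only an illustration: the `classify` helper and its category names are hypothetical, and it assumes that "compact" means the value fits in one internal digit (`sys.int_info.bits_per_digit` bits, typically 30).

```python
import sys

# Bits per internal digit: 30 on typical 64-bit builds, 15 on some others.
DIGIT_BITS = sys.int_info.bits_per_digit

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def classify(value: object) -> str:
    """Return a rough category for a value (hypothetical helper, model only)."""
    if type(value) is not int:
        return "not-exact"   # subclasses such as bool are excluded
    if abs(value) < 2**DIGIT_BITS:
        return "compact"     # assumed: at most one internal digit
    if INT64_MIN <= value <= INT64_MAX:
        return "wide"        # more than one digit, but fits in int64_t
    return "big"             # would stay on the generic long arithmetic path


for sample in (1, 10_000_000_000, 2**63, True):
    print(repr(sample), classify(sample))
```

The "wide" category is the one the proposed fast path would target; everything else would keep its current behavior.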

Suggested implementation:

  • Keep the current compact-int specialization unchanged.
  • Add a separate wide-int path for exact ints that fit in int64_t.
  • Preserve current behavior for overflow, subclasses, and other non-exact-int cases.
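The guard order implied by the list above can be sketched in Python. This is a behavioral model under stated assumptions, not the proposed C code: `wide_add` and `wide_sub` are hypothetical names, and the second element of the returned tuple only reports whether the modeled fast path would have applied. A real implementation would presumably detect overflow in C before producing a result, which the model approximates with a range check on the result.

```python
import operator

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def fits_int64(value: int) -> bool:
    """True if value is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


def _wide_binary_op(a, b, op):
    """Return (result, took_fast_path) for op(a, b). Hypothetical model."""
    # Guard 1: exact ints only, so subclasses fall through to the generic path.
    if type(a) is int and type(b) is int:
        # Guard 2: both operands must fit in int64_t.
        if fits_int64(a) and fits_int64(b):
            result = op(a, b)
            # Guard 3: a result outside int64_t counts as overflow.
            if fits_int64(result):
                return result, True
    # Fallback: unchanged generic behavior.
    return op(a, b), False


def wide_add(a, b):
    return _wide_binary_op(a, b, operator.add)


def wide_sub(a, b):
    return _wide_binary_op(a, b, operator.sub)


print(wide_add(10_000_000_000, 1))
print(wide_add(INT64_MAX, 1))
print(wide_sub(True, 1))
```

In every case the numeric result matches plain `a + b` or `a - b`; only the path taken differs, which is the property the proposal needs to preserve.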

Motivation:

  • Improve performance for wide integer add/sub without affecting the common compact-int hot path.
  • Avoid adding new opcodes in the compact-int path.
  • Fit within the current interpreter and specialization structure.

Benchmark evidence:

  • I prototyped this locally with a benchmark covering compact and wide add/sub cases.
  • Wide cases improved substantially, while compact cases remained effectively flat.
  • Representative interpreter-only results with JIT disabled:
    • add_wide: about 25% faster
    • sub_wide: about 35% faster
    • add_compact/sub_compact: effectively unchanged

Benchmark script used locally:

"""Microbenchmark compact vs wide int add/sub with pyperf.

Use this with PYTHON_JIT=0 and -S if you want a stable interpreter-only run:

    PYTHON_JIT=0 ./python.exe -S Tools/scripts/bench_wide_int_pyperf.py
"""

from __future__ import annotations

import pyperf


def bench_add_compact() -> int:
    a = 1
    b = 2
    return a + b


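def bench_add_wide() -> int:
    # 10_000_000_000 exceeds 2**30, so it is not a compact (single-digit) int
    # on builds with 30-bit digits, yet it still fits in int64_t.
    a = 10_000_000_000
    b = 1
    return a + b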


def bench_sub_compact() -> int:
    a = 1
    b = 2
    return a - b


def bench_sub_wide() -> int:
    # Same non-compact operand as bench_add_wide; the result stays in int64_t.
    a = 10_000_000_000
    b = 1
    return a - b


def main() -> None:
    runner = pyperf.Runner()
    runner.bench_func("add_compact", bench_add_compact)
    runner.bench_func("add_wide", bench_add_wide)
    runner.bench_func("sub_compact", bench_sub_compact)
    runner.bench_func("sub_wide", bench_sub_wide)


if __name__ == "__main__":
    main()

Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage), type-feature (A feature request or enhancement)
