feat: Migrate bucket module #71

Merged: leaves12138 merged 1 commit into apache:main from lxy-9602:core-bucket-id on Jun 10, 2026
@lxy-9602

Contributor

Purpose

No linked issue.

Migrate bucket ID calculation, bucket functions, and bucket select converter:

Bucket interfaces (include/paimon/bucket/):

  • BucketIdCalculator — computes bucket ID for a given row (bucket_id_calculator.h)
  • BucketFunctionType — enum for bucket function types (bucket_function_type.h)

Bucket functions (src/paimon/core/bucket/):

  • BucketFunction — abstract bucket function interface (bucket_function.h)
  • BucketIdCalculator — bucket ID calculation with hash-based routing (bucket_id_calculator.cpp)
  • HiveBucketFunction — Hive-compatible bucket hash function (hive_bucket_function.h/cpp)
  • HiveHasher — Hive ObjectInspector-compatible hashers for all data types (hive_hasher.h)
  • ModBucketFunction — simple modulo-based bucket function (mod_bucket_function.h/cpp)
  • DefaultBucketFunction — default bucket function using MurmurHash (default_bucket_function.h)
  • BucketSelectConverter — converts predicates to bucket filter for scan pruning (bucket_select_converter.h/cpp)
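
For readers new to bucketing, the following is a minimal sketch of the two ideas in the list above: a modulo-based bucket function and pruning buckets from an `IN` predicate. It is not the migrated code. `ModBucket` and `BucketsForInList` are hypothetical names, and the floor-style modulo on a single integer key is an assumption about how a modulo bucket function would typically behave.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical helper, not the paimon-cpp API: maps an integer bucket key to a
// bucket in [0, num_buckets). C++ '%' keeps the sign of the dividend, so a
// negative remainder is shifted back into range (floor-style modulo).
inline int32_t ModBucket(int64_t key, int32_t num_buckets) {
    assert(num_buckets > 0);
    int64_t r = key % num_buckets;
    if (r < 0) {
        r += num_buckets;
    }
    return static_cast<int32_t>(r);
}

// Hypothetical helper: for a predicate of the form `key IN (values...)`,
// returns the only buckets that can contain matching rows. A scan can skip
// every bucket that is not in the returned set.
inline std::set<int32_t> BucketsForInList(const std::vector<int64_t>& values,
                                          int32_t num_buckets) {
    std::set<int32_t> buckets;
    for (int64_t v : values) {
        buckets.insert(ModBucket(v, num_buckets));
    }
    return buckets;
}
```

Pruning is only sound if the scan side uses exactly the same bucket function as the write side, which is presumably why the Hive variant is tested for Java compatibility.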

Tests

  • bucket_id_calculator_test.cpp — bucket ID calculation correctness
  • hive_bucket_function_test.cpp — Hive bucket hash Java compatibility
  • mod_bucket_function_test.cpp — modulo bucket function
  • default_bucket_function_test.cpp — default bucket function
  • bucket_select_converter_test.cpp — predicate to bucket filter conversion

API and Format

Documentation

Generative AI tooling

Migrate-by: Aone Copilot (Claude)

@lxy-9602

Contributor Author

Thank you @ChaomingZhangCN for contributing the bucket function framework, and @liangjie3138 for the BucketSelectConverter — migrated as part of this batch. 🎉

@leaves12138 left a comment

Contributor

Thanks for migrating the bucket module. I reviewed PR #71 at head ba8560c200f768354b795290dd017225c1a671db.

Treating the remaining prerequisite dependency state as expected migration-period drift, I did not find a blocking correctness issue in the migrated bucket id calculator, default/mod/hive bucket functions, bucket select converter, Hive hasher, and related tests. I also checked the PR-local Decimal128 reconstruction change in bucket_id_calculator.cpp; it avoids shifting a signed negative high word and looks like a correct safety fix rather than a behavior regression.

Please make sure the dependent migration PRs are merged or rebased in, and run the relevant C++ build/tests from a clean checkout before final merge.
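
To illustrate the kind of safety fix the review describes for the Decimal128 reconstruction, here is a half-width analog using only standard integer types. It is a sketch under assumptions, not the code in bucket_id_calculator.cpp: `CombineWords` is a hypothetical name, and the signed-high-word, unsigned-low-word layout is assumed.

```cpp
#include <cstdint>

// Hypothetical helper: combines a signed high word and an unsigned low word
// into one signed 64-bit value.
inline int64_t CombineWords(int32_t high, uint32_t low) {
    // Left-shifting a negative signed value (e.g. `int64_t(high) << 32`) is
    // undefined behavior before C++20. Converting to an unsigned type first
    // makes the shift well defined, because unsigned arithmetic is modular.
    const uint64_t bits =
        (static_cast<uint64_t>(high) << 32) | static_cast<uint64_t>(low);
    // Converting back to a signed type is implementation-defined before C++20
    // for out-of-range values; mainstream compilers wrap (two's complement).
    return static_cast<int64_t>(bits);
}
```

For example, `CombineWords(-1, 0xFFFFFFFFu)` yields `-1` and `CombineWords(1, 2u)` yields `4294967298` on two's complement targets, the same values a signed shift would give where that shift happens to work.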

@leaves12138 merged commit 3ced20a into apache:main on Jun 10, 2026