feat: add BTree global index core components #74

Open
lszskye wants to merge 1 commit into apache:main from lszskye:p7-3

Conversation

@lszskye lszskye commented Jun 10, 2026

Contributor

feat: add BTree global index core components

Changes

BtreeDefs

  • Constants struct defining BTree index configuration keys and default values: compression algorithm, compression level, block size, cache size, high-priority pool ratio, and read buffer size. Also defines the "btree" identifier.

BTreeFileFooter

  • Footer structure for BTree index files containing bloom filter handle, index block handle, and null bitmap handle. Supports versioned serialization/deserialization via Read()/Write() with a magic number (0x50425449) for validation. Fixed encoding length of 52 bytes.
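As a rough illustration of the validation step described above, the following sketch checks only the two facts the description states: the 52-byte encoded length and the magic number 0x50425449 (whose bytes spell ASCII "PBTI"). Everything else is an assumption made for the example: the `footer_sketch` namespace and its function names are hypothetical, and placing the magic in the last four bytes in little-endian order is a guess, not something the PR description confirms.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

namespace footer_sketch {

constexpr std::uint32_t kMagic = 0x50425449;  // value taken from the PR description
constexpr std::size_t kEncodedLength = 52;    // value taken from the PR description

// Assumption: the magic occupies the last 4 bytes, little-endian.
// The caller must pass a buffer of at least 4 bytes.
inline std::uint32_t ReadTrailingMagic(const std::string& footer) {
    std::uint32_t value = 0;
    const std::size_t base = footer.size() - 4;
    for (int i = 0; i < 4; ++i) {
        value |= static_cast<std::uint32_t>(
                     static_cast<unsigned char>(footer[base + i]))
                 << (8 * i);
    }
    return value;
}

// True when the buffer has the fixed footer length and carries the magic.
// The length check runs first, so ReadTrailingMagic never sees a short buffer.
inline bool HasValidFooterShape(const std::string& footer) {
    return footer.size() == kEncodedLength && ReadTrailingMagic(footer) == kMagic;
}

// Builds a zero-filled footer with only the magic set, for demonstration.
inline std::string MakeEmptyFooter() {
    std::string footer(kEncodedLength, '\0');
    for (int i = 0; i < 4; ++i) {
        footer[kEncodedLength - 4 + i] =
            static_cast<char>((kMagic >> (8 * i)) & 0xFF);
    }
    return footer;
}

}  // namespace footer_sketch
```

Checking the length before the magic rejects truncated input cheaply; the three block handles and the version field would be decoded only after both checks pass.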

BTreeFileMetaSelector

  • Implements FunctionVisitor<std::vector<GlobalIndexIOMeta>> to select candidate BTree index files based on filter predicates.
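One plausible way such a selector could prune files is by comparing the predicate literal against each file's key range. The sketch below assumes that approach; the description does not say how the selector decides. `FileRange`, `SelectEqual`, and `SelectLessThan` are hypothetical names, keys are plain strings compared bytewise, and null handling is left out.

```cpp
#include <string>
#include <vector>

namespace selector_sketch {

// Hypothetical stand-in for a file's IO meta plus its first/last key.
struct FileRange {
    std::string file_name;
    std::string first_key;
    std::string last_key;
};

// Predicate `key == target`: keep files whose [first_key, last_key] covers target.
inline std::vector<std::string> SelectEqual(const std::vector<FileRange>& files,
                                            const std::string& target) {
    std::vector<std::string> selected;
    for (const auto& file : files) {
        if (file.first_key <= target && target <= file.last_key) {
            selected.push_back(file.file_name);
        }
    }
    return selected;
}

// Predicate `key < bound`: keep files that contain at least one key below bound.
inline std::vector<std::string> SelectLessThan(const std::vector<FileRange>& files,
                                               const std::string& bound) {
    std::vector<std::string> selected;
    for (const auto& file : files) {
        if (file.first_key < bound) {
            selected.push_back(file.file_name);
        }
    }
    return selected;
}

}  // namespace selector_sketch
```

With one function per predicate kind, the shape resembles a visitor that returns the candidate list from each visit method, which is how the `FunctionVisitor<std::vector<GlobalIndexIOMeta>>` base reads.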

BTreeIndexMeta

  • Per-file metadata holding serialized first key, last key, and a has_nulls flag. Keys can be null when the entire file contains only null values (OnlyNulls()). Provides Serialize()/Deserialize() for binary persistence with a 9-byte fixed header (two 4-byte key lengths + 1-byte null flag).
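To make the 9-byte header concrete, here is a minimal round-trip sketch. Only the header composition (two 4-byte key lengths plus a 1-byte null flag) comes from the description. The field order, little-endian encoding, the `-1` length used to mark a null key, and all names (`IndexMetaSketch`, `sketch::Serialize`, `sketch::Deserialize`) are assumptions for illustration and may differ from the PR's actual format.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for BTreeIndexMeta; not the PR's class.
struct IndexMetaSketch {
    std::optional<std::string> first_key;  // empty when the file holds only nulls
    std::optional<std::string> last_key;
    bool has_nulls = false;

    bool OnlyNulls() const { return !first_key && !last_key; }
};

namespace sketch {

constexpr std::size_t kHeaderSize = 9;  // 4 + 4 + 1, as stated in the description
constexpr std::int32_t kNullKey = -1;   // assumed sentinel for a missing key

inline void PutInt32(std::string& out, std::int32_t v) {
    const auto u = static_cast<std::uint32_t>(v);
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<char>((u >> (8 * i)) & 0xFF));
}

inline std::int32_t GetInt32(const std::string& in, std::size_t pos) {
    std::uint32_t u = 0;
    for (int i = 0; i < 4; ++i) {
        u |= static_cast<std::uint32_t>(static_cast<unsigned char>(in[pos + i])) << (8 * i);
    }
    return static_cast<std::int32_t>(u);
}

// Assumed layout: [first_len][last_len][has_nulls][first_key bytes][last_key bytes].
inline std::string Serialize(const IndexMetaSketch& m) {
    std::string out;
    PutInt32(out, m.first_key ? static_cast<std::int32_t>(m.first_key->size()) : kNullKey);
    PutInt32(out, m.last_key ? static_cast<std::int32_t>(m.last_key->size()) : kNullKey);
    out.push_back(m.has_nulls ? '\1' : '\0');
    if (m.first_key) out += *m.first_key;
    if (m.last_key) out += *m.last_key;
    return out;
}

inline IndexMetaSketch Deserialize(const std::string& in) {
    if (in.size() < kHeaderSize) throw std::invalid_argument("truncated header");
    const std::int32_t first_len = GetInt32(in, 0);
    const std::int32_t last_len = GetInt32(in, 4);
    IndexMetaSketch m;
    m.has_nulls = in[8] != 0;
    std::size_t pos = kHeaderSize;
    // Reads one key payload, validating its length against the remaining bytes.
    auto read_key = [&](std::int32_t len) -> std::optional<std::string> {
        if (len == kNullKey) return std::nullopt;
        if (len < 0 || in.size() - pos < static_cast<std::size_t>(len)) {
            throw std::invalid_argument("bad key length");
        }
        std::string key = in.substr(pos, static_cast<std::size_t>(len));
        pos += static_cast<std::size_t>(len);
        return key;
    };
    m.first_key = read_key(first_len);
    m.last_key = read_key(last_len);
    return m;
}

}  // namespace sketch
```

Under these assumptions a file containing only nulls serializes to exactly the 9 header bytes, and `OnlyNulls()` holds again after deserialization.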

KeySerializer

  • Static utility for serializing Literal values to Bytes and deserializing MemorySlice back to Literal, supporting all Paimon data types. Provides CreateComparator() to build a MemorySlice::SliceComparator for binary key comparison based on Arrow data type.
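A type-aware comparator matters because raw bytes do not always sort like the values they encode: a two's-complement `-1` is all `0xFF` bytes and would compare greater than `1` under unsigned bytewise comparison. The sketch below shows the idea with a hypothetical `KeyType` enum and `CreateComparatorSketch` factory. The PR's `CreateComparator()` dispatches on Arrow data types and works on `MemorySlice`, neither of which is modeled here, and the native-byte-order integer encoding is an assumption made only for this example.

```cpp
#include <cstdint>
#include <cstring>
#include <functional>
#include <stdexcept>
#include <string>

namespace key_sketch {

enum class KeyType { kInt32, kString };  // hypothetical; the PR uses Arrow types

using Comparator = std::function<int(const std::string&, const std::string&)>;

// Encodes an int32 in native byte order (sketch only, not the PR's format).
inline std::string EncodeInt32(std::int32_t v) {
    std::string out(sizeof(v), '\0');
    std::memcpy(&out[0], &v, sizeof(v));
    return out;
}

inline std::int32_t DecodeInt32(const std::string& s) {
    if (s.size() != sizeof(std::int32_t)) throw std::invalid_argument("expected 4 bytes");
    std::int32_t v = 0;
    std::memcpy(&v, s.data(), sizeof(v));
    return v;
}

// Returns a three-way comparator (-1, 0, 1) suited to the key type.
inline Comparator CreateComparatorSketch(KeyType type) {
    switch (type) {
        case KeyType::kInt32:
            // Decode first so that ordering follows numeric value, not byte order.
            return [](const std::string& a, const std::string& b) {
                const std::int32_t x = DecodeInt32(a);
                const std::int32_t y = DecodeInt32(b);
                return (x > y) - (x < y);
            };
        case KeyType::kString:
            // Strings can be compared bytewise directly.
            return [](const std::string& a, const std::string& b) {
                const int c = a.compare(b);
                return (c > 0) - (c < 0);
            };
    }
    throw std::invalid_argument("unsupported key type");
}

}  // namespace key_sketch
```

Building the comparator once per index, rather than branching on the type for every comparison, keeps the per-key cost down during block lookups.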

Tests

  • BTreeFileFooterTest
  • BTreeFileMetaSelectorTest
  • BTreeIndexMetaTest
  • KeySerializerTest

@leaves12138 leaves12138 left a comment

Contributor

Thanks for migrating the BTree global index core components. I reviewed PR #74 at head 5130412cd9e90563f79f7836060b6f817a845636.

Treating the remaining prerequisite dependency state as expected migration-period drift, I did not find a blocking correctness issue in the migrated BTree definitions, file footer, index metadata, file meta selector, key serializer, and related tests. The migrated files match the existing C++ source snapshot aside from Apache license header updates.

Please make sure the dependent migration PRs are merged or rebased in, and run the relevant C++ build/tests from a clean checkout before final merge.

2 participants