Fix zstd decompression of replays with trailing bytes by Brutus5000 · Pull Request #281 · FAForever/faf-java-commons

Brutus5000 · 2026-06-26T20:28:23Z

Problem

Updating zstd-jni past 1.5.2-5 broke replay parsing: LoadReplayLoaderTest.parseBinary03 (and our production scenario) failed with com.github.luben.zstd.ZstdIOException: Unknown frame descriptor. Reported upstream as luben/zstd-jni#301.

Root cause

The zstd_reference.fafreplay payload (and matching older production files) is a complete, valid zstd frame followed by one stray trailing byte (0x0a, a newline) that is not part of the zstd stream.

zstd-jni ≤ 1.5.2 silently ignored bytes left over after a finished frame.
libzstd 1.5.4+ treats any post-frame bytes as the start of a second frame, reads the 0x0a, can't match a frame magic, and throws Unknown frame descriptor.

This is a behavior change in upstream libzstd's streaming decoder (the zstd-jni Java wrapper is byte-identical across these versions, and one-shot/CLI decoding has always rejected trailing data). The zstd CLI of every version rejects the same file with unknown header.

setContinuous(true) does not help — the frame is complete; the failure is parsing the (non-existent) next frame.

Fix

In the ZSTD branch of decompress(...), read exactly the frame's advertised content size so the decompressor stops at the frame boundary and never touches the trailing byte:

long contentSize = Zstd.getFrameContentSize(inputArray);
if (contentSize > 0) {
  IOUtils.copyLarge(compressorInputStream, out, 0, contentSize);
} else {
  IOUtils.copy(compressorInputStream, out);
}

The fallback covers frames without a stored content size (e.g. streamed files from the Rust replay writer, which carry no trailing junk).

The same decompress method is duplicated in ReplayLoader and ReplayDataParser; both are fixed to avoid drift.

Bumped zstd-jni to 1.5.6-10.

Testing

All 51 :data module tests pass, including parseBinary03 (zstd reference replay) which previously failed on every version after 1.5.2.

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes
- Improved handling of replay files compressed with Zstandard, including files that contain extra trailing bytes.
- Decompression now uses the expected frame size when available, helping avoid incomplete or over-read replay data.
- Updated the bundled Zstandard library to a newer version for better compatibility and reliability.

Some (older) replay files contain a stray trailing byte (a newline) after the zstd frame. zstd-jni <= 1.5.2 silently ignored bytes after a finished frame, but newer libzstd treats any post-frame bytes as the start of a second frame and fails with "Unknown frame descriptor". This is a streaming decoder behavior change in upstream libzstd; the zstd-jni Java wrapper is unchanged and one-shot/CLI decoding always rejected trailing data. Read exactly the frame's advertised content size so the decompressor stops at the frame boundary and never touches the trailing bytes. Falls back to a full read when the content size is not stored in the header (e.g. streamed files, which carry no trailing junk). This unblocks updating zstd-jni past 1.5.2; bumped to 1.5.6-10. Refs: luben/zstd-jni#301 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

coderabbitai · 2026-06-26T20:28:36Z

📝 Walkthrough

Walkthrough

The PR updates replay Zstandard decompression in two code paths to use frame content size when available and fall back to stream copying otherwise. It also bumps the zstd-jni dependency version.

Changes

Zstd replay decompression

Layer / File(s)	Summary
Bounded Zstd copy `data/src/main/java/com/faforever/commons/replay/ReplayDataParser.java`, `data/src/main/java/com/faforever/commons/replay/ReplayLoader.java`, `gradle/libs.versions.toml`	Both replay decompression paths now check `Zstd.getFrameContentSize(...)` and copy a bounded number of bytes when the frame size is known, otherwise they copy until EOF; the `zstd-jni` version is updated.

🎯 2 (Simple) | ⏱️ ~10 minutes

🐇 I hopped through bytes with careful flair,
Measured each frame with tiny rabbit care.
When zstd trails left crumbs in the stream,
I copied just right—what a tidy dream!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly matches the main change: fixing Zstd replay decompression when trailing bytes follow the frame.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/zstd-trailing-bytes-decompression

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

🧹 Nitpick comments (1)

data/src/main/java/com/faforever/commons/replay/ReplayLoader.java (1)
109-140: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚖️ Poor tradeoff

Consider extracting the shared decompress logic.

This decompress method (including the ZSTD bounded-copy block) is duplicated verbatim in ReplayDataParser. The PR had to edit both copies in lockstep to keep them aligned, which is exactly the maintenance hazard duplication creates. Extracting it into a shared helper (e.g. a ReplayCompression utility) would prevent the two paths from drifting.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@data/src/main/java/com/faforever/commons/replay/ReplayLoader.java` around
lines 109 - 140, The decompression logic is duplicated between
ReplayLoader.decompress and ReplayDataParser, including the ZSTD bounded-copy
handling, so changes must be made in two places. Extract this logic into a
shared helper utility (for example, a ReplayCompression class/method) and have
both callers delegate to it. Keep the existing behavior for QTCOMPRESS, ZSTD,
and UNKNOWN in the shared implementation so both paths stay aligned.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@data/src/main/java/com/faforever/commons/replay/ReplayLoader.java`:
- Around line 109-140: The decompression logic is duplicated between
ReplayLoader.decompress and ReplayDataParser, including the ZSTD bounded-copy
handling, so changes must be made in two places. Extract this logic into a
shared helper utility (for example, a ReplayCompression class/method) and have
both callers delegate to it. Keep the existing behavior for QTCOMPRESS, ZSTD,
and UNKNOWN in the shared implementation so both paths stay aligned.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 57d87661-444d-499b-99e0-ddd1753c1f00

📥 Commits

Reviewing files that changed from the base of the PR and between 2e8a5a8 and 3a662ca.

📒 Files selected for processing (3)

data/src/main/java/com/faforever/commons/replay/ReplayDataParser.java
data/src/main/java/com/faforever/commons/replay/ReplayLoader.java
gradle/libs.versions.toml

coderabbitai Bot reviewed Jun 26, 2026

View reviewed changes

Brutus5000 merged commit 70f1098 into develop Jun 26, 2026
2 checks passed

Brutus5000 deleted the fix/zstd-trailing-bytes-decompression branch June 26, 2026 20:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix zstd decompression of replays with trailing bytes#281

Fix zstd decompression of replays with trailing bytes#281
Brutus5000 merged 1 commit into
developfrom
fix/zstd-trailing-bytes-decompression

Brutus5000 commented Jun 26, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Brutus5000 commented Jun 26, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Root cause

Fix

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Brutus5000 commented Jun 26, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading