fix(skills): preserve non-ASCII characters in skill frontmatter #2917

Open
seiya-koji wants to merge 2 commits into github:main from seiya-koji:fix/skill-frontmatter-allow-unicode

@seiya-koji

Contributor

Description

Skill SKILL.md frontmatter description values with non-ASCII characters were
escaped to \uXXXX / \xXX, because the yaml.safe_dump(...) calls that render
skill frontmatter were missing allow_unicode=True.

# Before
description: "Pr\xFCfe Konformit\xE4t der Implementierung"
# After
description: "Prüfe Konformität der Implementierung"

#1936 fixed this for the YAML I/O that existed then; the skill frontmatter paths
added later (native skills migration) regressed it.

This PR adds allow_unicode=True to the 7 skill/command frontmatter safe_dump
sites (extensions.py, presets.py, integrations/claude/__init__.py) and adds
regression tests for the render and extension-install paths.
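
The escaping described above can be reproduced outside the project. The snippet below is a minimal sketch, assuming PyYAML is installed; it is not taken from the diff, and the `data` dict is only illustrative.

```python
import yaml

# Illustrative frontmatter; the description matches the Before/After example.
data = {"description": "Prüfe Konformität der Implementierung"}

# Default: PyYAML writes non-ASCII characters as escapes in a quoted scalar.
escaped = yaml.safe_dump(data, sort_keys=False)

# With allow_unicode=True the characters are written through unchanged.
preserved = yaml.safe_dump(data, sort_keys=False, allow_unicode=True)

print(escaped, end="")
print(preserved, end="")

# Both forms parse back to the same value; only the serialized text differs.
assert yaml.safe_load(escaped) == yaml.safe_load(preserved) == data
```

Because both outputs load to the same value, the regression is invisible to parsers and only shows up when a person or a tool reads SKILL.md as text.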

Testing

Added regression tests for both frontmatter paths (render and extension-install),
ran the full suite (3718 passed, 45 skipped), and confirmed the fix end-to-end on a
sample project — a unicode extension's description (CJK + accented Latin + emoji)
survived intact in the generated SKILL.md.

  • Tested locally with uv run specify --help
  • Ran existing tests with uv sync && uv run pytest
  • Tested with a sample project (if applicable)
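
A regression check for this kind of bug can be sketched as a scan of the generated frontmatter for YAML escape sequences. This is an assumption about how such a check might look, not the tests added in this PR; `find_escapes` is a hypothetical helper name.

```python
import re

# Escape forms a YAML double-quoted scalar uses for non-ASCII characters:
# \xXX, \uXXXX and \UXXXXXXXX.
_YAML_ESCAPE = re.compile(r"\\(?:x[0-9A-Fa-f]{2}|u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})")


def find_escapes(skill_md: str) -> list[str]:
    """Return escape sequences found in the frontmatter of a SKILL.md text."""
    # Frontmatter sits between the first two '---' delimiters.
    parts = skill_md.split("---", 2)
    frontmatter = parts[1] if len(parts) >= 3 else ""
    return _YAML_ESCAPE.findall(frontmatter)


# Hand-written samples standing in for generated SKILL.md files.
broken = '---\ndescription: "Pr\\xFCfe Konformit\\xE4t"\n---\n# Skill\n'
fixed = '---\ndescription: "Prüfe Konformität 日本語 🚀"\n---\n# Skill\n'

assert find_escapes(broken) == ["\\xFC", "\\xE4"]
assert find_escapes(fixed) == []
```

Scanning only the frontmatter keeps the check from flagging backslash sequences that legitimately appear in the skill body.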

AI Disclosure

  • I did not use AI assistance for this contribution
  • I did use AI assistance (describe below)

Implemented with Claude Code (Opus 4.8).

Skill SKILL.md frontmatter descriptions containing non-ASCII
characters were escaped to \uXXXX / \xXX sequences because
yaml.safe_dump() was called without allow_unicode=True.

- Add allow_unicode=True to the 7 skill/command frontmatter
  safe_dump sites (extensions, presets, claude integration)
- Add regression tests for the render and extension-install paths

Follows the approach of github#1936; encoding="utf-8" is already set on
the affected write paths, so no encoding change is needed here.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@seiya-koji seiya-koji requested a review from mnriem as a code owner June 10, 2026 15:37
Copilot AI review requested due to automatic review settings June 10, 2026 15:37

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR ensures non-ASCII (Unicode) characters are preserved when generating/rendering SKILL.md YAML frontmatter by enabling Unicode output in YAML serialization, and adds regression tests to prevent reintroducing escaped output.

Changes:

  • Enable allow_unicode=True in multiple yaml.safe_dump(...) calls that generate SKILL.md frontmatter.
  • Add a new extension-skills test that installs an extension with a Unicode description and asserts SKILL.md preserves it.
  • Add an integration test ensuring Claude skill rendering preserves Unicode characters.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/test_extension_skills.py | Adds a Unicode-focused extension fixture + SKILL.md assertion to prevent escaped descriptions. |
| tests/integrations/test_integration_claude.py | Adds a regression test for Unicode preservation in rendered skills. |
| src/specify_cli/presets.py | Enables Unicode output when dumping YAML frontmatter for preset/skill reconciliation flows. |
| src/specify_cli/integrations/claude/__init__.py | Enables Unicode output when dumping YAML frontmatter during skill rendering. |
| src/specify_cli/extensions.py | Enables Unicode output when dumping YAML frontmatter for extension skill registration. |

Comment thread tests/test_extension_skills.py Outdated
Comment on lines +148 to +149
    with open(ext_dir / "extension.yml", "w", encoding="utf-8") as f:
        yaml.dump(manifest_data, f, allow_unicode=True)

Comment thread src/specify_cli/presets.py Outdated
      f"override:{cmd_name}",
  )
- fm_text = yaml.safe_dump(fm_data, sort_keys=False).strip()
+ fm_text = yaml.safe_dump(fm_data, sort_keys=False, allow_unicode=True).strip()

Centralize skill/command frontmatter YAML serialization into a single
_utils.dump_frontmatter helper so no call site can drop allow_unicode or
diverge on formatting. Route the 7 existing sites through it and drop a
now-unused local yaml import.

Switch the extension test fixtures to yaml.safe_dump for parity with the
production safe-dump/safe-load codepaths.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
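
The centralized helper described in this commit might look roughly like the sketch below. Only the name `dump_frontmatter` and the `safe_dump(..., sort_keys=False, allow_unicode=True).strip()` call come from this PR; the signature, the type hints, and the `render_skill` caller are assumptions made for illustration, and PyYAML is assumed to be installed.

```python
from typing import Any

import yaml


def dump_frontmatter(data: dict[str, Any]) -> str:
    """Serialize frontmatter to YAML text, keeping non-ASCII characters as-is.

    Hypothetical signature: the real helper in _utils may differ.
    """
    return yaml.safe_dump(data, sort_keys=False, allow_unicode=True).strip()


def render_skill(frontmatter: dict[str, Any], body: str) -> str:
    """Illustrative caller: wrap the frontmatter in '---' delimiters."""
    return f"---\n{dump_frontmatter(frontmatter)}\n---\n\n{body}"


skill_md = render_skill(
    {"name": "check", "description": "Prüfe Konformität 日本語"},
    "# Check\n",
)
print(skill_md)
```

Keeping the dump options in one function means a new call site gets Unicode-safe output by default instead of having to remember the flag.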
@seiya-koji

Copy link
Copy Markdown
Contributor Author

Addressed Copilot's refactoring suggestion in ae74205. It reaches a few more call sites than the original fix, so the diff is a bit larger.


3 participants