docs(roadmap): Reflect Phase progress and add new Phases #249
Open
send wants to merge 1 commit into
- Phase 1 (Rewriter): reflect the 5 implemented rewriters + run_rewriters() ✅
- Phase 2 (POS bunsetsu segmentation + structure_cost): reflect what was implemented, note remaining issues ✅
- Phase 2.5 (snippets): newly add SnippetStore / VariableResolver ✅
- Phase 3: add the motivation and approach for building the IT/business dictionary
- Phase 4: add karukan Adaptive Strategy as a reference

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pull request overview
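This is a documentation change that updates the roadmap section of docs/ime-research.md to match the current implementation status (what Phases 1/2/2.5 have reached and the direction for Phases 3/4). There are no code changes; as a research/design memo, it reorganizes "how far the work has gotten and what comes next" so that it is easy to read.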
Changes:
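- Phase 1 (Rewriter pipeline): now described as implemented, spelling out the concrete rewriters and the execution flow
- Phase 2 (POS bunsetsu segmentation + structure_cost): marked "nearly complete", with implemented elements and remaining issues made explicit
- Phase 2.5 (snippet feature): added as a new Phase, listing the related components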
Comment on lines +148 to +152
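- NumericRewriter: hiragana numerals → kanji numerals / half-width / full-width digits (up to ~10^16)
- KatakanaRewriter: full-text katakana candidate
- HiraganaVariantRewriter: kanji segments → hiragana replacement
- PartialHiraganaRewriter: hiragana replacement per individual segment of the top-5 paths
- KanjiVariantRewriter: hiragana segments (2 characters) → kanji alternatives

- HiraganaVariantRewriter: kanji segments → hiragana replacement
- PartialHiraganaRewriter: hiragana replacement per individual segment of the top-5 paths
- KanjiVariantRewriter: hiragana segments (2 characters) → kanji alternatives
- run_rewriters() runs them sequentially and deduplicates the results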
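
For readers unfamiliar with the flow, "run sequentially, then deduplicate" can be sketched roughly as follows. This is only an illustration in Python, not the project's code: the rewriter signature `(reading, candidates) -> list`, the two toy rewriters, and the first-seen-wins ordering are all assumptions made for the example.

```python
from typing import Callable, List

# Hypothetical signature: a rewriter sees the reading and the current
# candidates, and returns extra candidates to append.
Rewriter = Callable[[str, List[str]], List[str]]


def to_katakana(text: str) -> str:
    # Hiragana U+3041..U+3096 maps onto katakana by adding 0x60.
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch for ch in text
    )


def katakana_rewriter(reading: str, candidates: List[str]) -> List[str]:
    # Toy stand-in for a full-text katakana candidate.
    return [to_katakana(reading)]


def hiragana_rewriter(reading: str, candidates: List[str]) -> List[str]:
    # Toy stand-in that proposes the raw reading itself.
    return [reading]


def run_rewriters(
    reading: str, candidates: List[str], rewriters: List[Rewriter]
) -> List[str]:
    result: List[str] = []
    seen = set()
    # Keep the original candidates first, dropping duplicates.
    for cand in candidates:
        if cand not in seen:
            seen.add(cand)
            result.append(cand)
    # Run each rewriter in order; the first occurrence of a surface form wins.
    for rewrite in rewriters:
        for cand in rewrite(reading, list(result)):
            if cand not in seen:
                seen.add(cand)
                result.append(cand)
    return result


merged = run_rewriters("かな", ["仮名", "かな"], [katakana_rewriter, hiragana_rewriter])
```

Here `hiragana_rewriter` proposes `かな` again, and the deduplication step drops it because that form is already in the list.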
Comment on lines +161 to +165
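- structure_cost: transition-cost aggregation + hard filter + soft penalty (Mozc-inspired)
- group_segments(): POS-role-based grouping of morphemes into phrases (suffix/function-word merging, prefix handling)
- resegment(): generates alternative segmentations from lattice nodes (up to 10 paths)
- per-segment penalties: non-independent kanji, pronoun bonus, te-form kanji, personal names, single-kanji content words
- length_variance: penalizes uneven segmentation on paths with 3 or more segments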
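
As a rough illustration of the length_variance idea only: the sketch below assumes the penalty is the population variance of segment lengths (in characters) scaled by a weight, and that paths with fewer than 3 segments are exempt. The exact formula, the `weight` parameter, and the function name are assumptions; the roadmap text does not specify them.

```python
from statistics import pvariance
from typing import List


def length_variance_penalty(
    segments: List[str], weight: float = 1.0, min_segments: int = 3
) -> float:
    # Paths with fewer than `min_segments` segments are not penalized.
    if len(segments) < min_segments:
        return 0.0
    lengths = [len(seg) for seg in segments]
    # Evenly sized segments have zero variance; uneven splits cost more.
    return weight * float(pvariance(lengths))


even = length_variance_penalty(["ab", "cd", "ef"])
uneven = length_variance_penalty(["a", "bcdef", "g"])
short_path = length_variance_penalty(["a", "bcdef"])
```

With these inputs the evenly split path and the 2-segment path are not penalized, while the 1/5/1 split receives a positive penalty.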
Comment on lines +169 to +170
- SnippetStore: HashMap-based, prefix_search, TOML configuration
- VariableResolver: expands $varname / ${varname}, validates undefined variables
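
To make the two snippet components concrete, here is a minimal sketch of a prefix lookup over a dictionary-backed store and of `$varname` / `${varname}` expansion that rejects undefined variables. It is written in Python purely for illustration; the function names, the ASCII-only variable-name rule, the sorted result order, and the choice to raise `KeyError` are assumptions, not the project's actual API.

```python
import re
from typing import Dict, List

# Matches ${name} or $name, with ASCII identifier-style names (an assumption).
_VAR = re.compile(r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))")


def prefix_search(store: Dict[str, str], prefix: str) -> List[str]:
    # Linear scan over the keys; returns matching snippet keys in sorted order.
    return sorted(key for key in store if key.startswith(prefix))


def resolve(template: str, variables: Dict[str, str]) -> str:
    missing: List[str] = []

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1) or match.group(2)
        if name not in variables:
            missing.append(name)
            return match.group(0)  # leave the placeholder untouched
        return variables[name]

    expanded = _VAR.sub(substitute, template)
    if missing:
        # Report every undefined variable at once rather than the first only.
        raise KeyError("undefined variables: " + ", ".join(missing))
    return expanded


store = {"mail": "$user@example.com", "mailsig": "Regards, ${user}", "addr": "Tokyo"}
keys = prefix_search(store, "mail")
signature = resolve(store["mailsig"], {"user": "send"})
```

A hash map has no ordering, so a prefix search over it is a scan of all keys; a trie or sorted map would be the usual alternative if the store grows large.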
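Overview
Updates the roadmap section of docs/ime-research.md to match the implementation status. Documentation-only change (no code changes).
Changes
- Reflects run_rewriters()
- Reflects group_segments(), resegment(), per-segment penalties, and length_variance. Notes the remaining issue (split-point correction equivalent to segmenter.def)
Test plan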
🤖 Generated with Claude Code