feat(editor): tree-sitter syntax highlighting for .qmd in Monaco #296
Merged
I particularly like the calls to the R and Python tree-sitter grammars for their own highlighting!
I don't think we should highlight comments like that, though; they should be, well, comments!
905407d to b2b1292
I assume you mean comments within the qmd - those are now fixed in b2dbd3c and appear grey+italic. We weren't emitting any …
New quarto-lsp-core::tokens extractor (qmd structure + frontmatter YAML + code-cell interiors) exposed via lsp_get_semantic_tokens, wired into a Monaco DocumentSemanticTokensProvider with a Monarch base and quarto-{light,dark} themes. Phase 0 switches the render producer and both user-grammar paths (native + JS) onto a shared Query.captures()+flatten_spans resolver, fixing bd-98k6.
Snapshots: 4 quarto-highlight goldens regenerated (bash/julia/python/user_grammar_toml) — nested->flat reshape; per-byte colour preserved for bash/julia/python, user_grammar_toml shows the bd-98k6 type=[0,4] fix. 1 new quarto-lsp-core structural_corpus snapshot.
…ncoloured) Only the opening [ and closing ) of a link/image are queryable; the closing ] and opening ( are fused into the anonymous ]( token in target, so colouring just [/) left the closing ] a different colour. Drop the bracket captures so all link/image bracket punctuation uses the default foreground. qmd.punctuation.bracket stays a reserved legend/theme entry.
HTML `<!-- -->` and editorial `[>> ...]` comments now emit a qmd.markup.comment semantic token (muted italic) via a new highlights.scm capture, legend entry (Rust + TS), and theme rule. Snapshots: 1 modified (structural_corpus, +2 lines) — the two corpus comments now resolve to qmd.markup.comment; no other tokens changed.
…file disposeIntelligenceProviders() was wired into an effect keyed on a currentFile-dependent callback, so it fired on every currentFile change rather than on unmount as its comment claimed. Move the register/dispose lifecycle into useIntelligenceProviders (register on mount, dispose on unmount). Latent cleanup — no confirmed user-visible change.
The semantic-tokens provider read the VFS image by path while Monaco renders the model; when the two drift (e.g. a remote edit deferred while the tab is hidden), token positions shift and colours smear onto adjacent characters — an image-opener token landing on a link's [ , etc. Tokenise model.getValue() instead, and drop the Monarch base rule that coloured the opening [ but not ] so link brackets render uniformly where the base shows through.
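The fix in the last commit amounts to tokenising the same text Monaco is rendering. A minimal sketch of that idea, under stated assumptions: `TextModelLike`, `Legend`, `TokeniseFn` and `makeSemanticTokensProvider` are hypothetical names, and the tokeniser is assumed to return already-encoded token data. Only the method names (`getLegend`, `provideDocumentSemanticTokens`, `releaseDocumentSemanticTokens`) follow Monaco's `DocumentSemanticTokensProvider` interface.

```typescript
// Structural stand-ins so the sketch runs without importing monaco-editor.
interface TextModelLike {
  getValue(): string;
}

interface Legend {
  tokenTypes: string[];
  tokenModifiers: string[];
}

// Hypothetical: wraps the WASM call and returns Monaco-encoded token data.
type TokeniseFn = (text: string) => Uint32Array;

function makeSemanticTokensProvider(legend: Legend, tokenise: TokeniseFn) {
  return {
    getLegend(): Legend {
      return legend;
    },
    provideDocumentSemanticTokens(model: TextModelLike): { data: Uint32Array } {
      // Read the text from the model Monaco is rendering, not from a
      // path-keyed store that may lag behind it.
      return { data: tokenise(model.getValue()) };
    },
    releaseDocumentSemanticTokens(): void {
      // Nothing to release: this sketch caches no result ids.
    },
  };
}
```

Because the provider only ever sees `model.getValue()`, token positions cannot drift from the rendered text, whatever state the path-keyed copy is in.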
b2dbd3c to 0b6352f
There are some follow-ups, but they are not blocking for this PR: 1. {.python} chunks, 2. yaml options in code chunks etc.


Fixes #10.
`.qmd` files were mapped to Monaco's built-in `markdown` language, which mis-colours links, ignores YAML frontmatter, and treats Quarto's `{r}`/`{python}` cells and Pandoc constructs (attributes, shortcodes, math) as plain text.

This drives `.qmd` editor highlighting from tree-sitter instead.

The editor requests semantic tokens from a Rust→WASM module. Rust parses the buffer with the `tree-sitter-qmd` grammar and returns `(line, char, length, type)` tokens for three regions: Markdown structure, YAML frontmatter, and code-cell interiors (R/Python/… via the same grammars the renderer uses). Monaco then colours them.

A synchronous Monarch base paints instantly while the async tokens arrive, and fills anything the tokens don't cover.
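The `(line, char, length, type)` tuples map onto the relative encoding that Monaco's semantic-token data uses: five integers per token (delta line, delta start character, length, token type index, modifier bitset). A minimal sketch of that conversion follows. `AbsToken` and `encodeSemanticTokens` are hypothetical names, and where this step actually happens in the PR (Rust or TypeScript) is not stated here, so treat it as an illustration only.

```typescript
// Hypothetical shape for one token as described above.
interface AbsToken {
  line: number;   // 0-based line
  char: number;   // 0-based UTF-16 column
  length: number; // length in UTF-16 code units
  type: number;   // index into the token legend
}

// Convert absolute tokens into the 5-integer relative encoding used by
// Monaco (and LSP) semantic tokens.
function encodeSemanticTokens(tokens: AbsToken[]): Uint32Array {
  // The encoding is relative, so tokens must be in document order.
  const sorted = tokens.slice().sort((a, b) => a.line - b.line || a.char - b.char);
  const data = new Uint32Array(sorted.length * 5);
  let prevLine = 0;
  let prevChar = 0;
  for (let i = 0; i < sorted.length; i++) {
    const t = sorted[i];
    const deltaLine = t.line - prevLine;
    // The start column is relative only when the token stays on the same line.
    const deltaChar = deltaLine === 0 ? t.char - prevChar : t.char;
    data[i * 5] = deltaLine;
    data[i * 5 + 1] = deltaChar;
    data[i * 5 + 2] = t.length;
    data[i * 5 + 3] = t.type;
    data[i * 5 + 4] = 0; // no modifiers in this sketch
    prevLine = t.line;
    prevChar = t.char;
  }
  return data;
}
```

Columns and lengths are in UTF-16 code units because that is how Monaco addresses positions within a line.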
Key files:
- `crates/quarto-lsp-core/src/tokens.rs`: extractor (parse tree → per-region, per-line UTF-16 tokens); `types.rs` holds the token legend + capture-name → colour-name translator.
- `crates/quarto-highlight/src/captures.rs`: resolver (`Query.captures()` + innermost-wins `flatten_spans`) now shared by the editor and the HTML renderer, so code cells colour identically in both. Also fixes bd-98k6; see the 4 regenerated `*.snap` goldens (nested → flat).
- `crates/wasm-quarto-hub-client/src/lib.rs`: `lsp_get_semantic_tokens` / `lsp_get_token_legend` exports, modelled on the existing `lsp_get_symbols`.
- `hub-client/src/services/monacoProviders.ts`, `components/quartoTheme.ts`: Monaco provider, `quarto-{light,dark}` themes, Monarch base.

Code-cell colours mirror the rendered `hl-*` CSS (a test locks the two tables together); every theme rule carries a `qmd.` prefix so it can't recolour other languages in the editor.

Design + phase notes: `claude-notes/plans/2026-06-10-monaco-tree-sitter-highlighting.md`.
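
To illustrate the nested → flat reshape, here is a minimal sketch of an innermost-wins flattening over capture spans. It is not the `flatten_spans` in `captures.rs`: `Span` and `flattenSpans` are hypothetical names, and the sketch assumes half-open ranges where the shortest covering span counts as innermost, with ties going to the later capture.

```typescript
// Hypothetical capture span: half-open range [start, end) plus a capture name.
interface Span {
  start: number;
  end: number;
  name: string;
}

// Flatten possibly nested spans into non-overlapping ones, letting the
// innermost (shortest) covering span decide each segment's name.
function flattenSpans(spans: Span[]): Span[] {
  // Every span edge is a point where the winning capture may change.
  const cuts: number[] = [];
  for (const s of spans) {
    if (cuts.indexOf(s.start) < 0) cuts.push(s.start);
    if (cuts.indexOf(s.end) < 0) cuts.push(s.end);
  }
  cuts.sort((a, b) => a - b);

  const out: Span[] = [];
  for (let i = 0; i + 1 < cuts.length; i++) {
    const start = cuts[i];
    const end = cuts[i + 1];
    let best: Span | undefined;
    for (const s of spans) {
      const covers = s.start <= start && end <= s.end;
      // "<=" lets a later capture of equal size win the tie.
      if (covers && (best === undefined || s.end - s.start <= best.end - best.start)) {
        best = s;
      }
    }
    if (best === undefined) continue; // gap: no capture covers this segment
    const last = out[out.length - 1];
    if (last !== undefined && last.end === start && last.name === best.name) {
      last.end = end; // merge adjacent segments with the same capture
    } else {
      out.push({ start: start, end: end, name: best.name });
    }
  }
  return out;
}
```

A flat, non-overlapping list gives each position exactly one colour, which is one way the editor and the HTML renderer can agree per position.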