[ExecuTorch][WebGPU] Add clone op (aten.clone.default) #20463
JulianCloudNTH wants to merge 1 commit into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20463
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV
There is 1 currently active SEV. If your PR is affected, please view it below:
❌ 3 New Failures, 3 Unrelated Failures
As of commit a923c16 with merge base 1b726b2:
NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Stack from ghstack (oldest at bottom):
`aten.clone.default` is a pure flat copy on the buffer-only WebGPU backend, identical to `view_copy`: `clone_impl` reuses the existing `add_flat_copy` helper (`output[i] = input[i]`) and registers a handler under `aten.clone.default`. No new shader, generated WGSL header, or CMake source is needed, because it shares the `view_copy` flat-copy compute pipeline.

Required for end-to-end Llama 3.2 1B (4-bit, KV cache): the exported model serializes 2 `aten.clone.default` ops into its runtime operator chain (the RoPE-frequency clones reused across all 16 transformer layers), so without a handler the partition graph-breaks at those nodes. Mirrors the Vulkan delegate, which registers the same op and routes a buffer clone to a flat view-copy.

Differential Revision: D109477717
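
For readers unfamiliar with the pattern, here is a minimal Python sketch of the idea in the description above: two op names share one flat-copy helper, and an op with no registered handler splits the delegated graph. It is a toy model under stated assumptions, not the ExecuTorch or WebGPU backend API. The names `clone_impl`, `add_flat_copy`, `view_copy`, and `aten.clone.default` come from the description; `OP_REGISTRY`, `register`, `view_copy_impl`, `partition`, and the `aten.mul.Tensor` / `aten.add.Tensor` entries in the example chain are hypothetical, and the real partitioner is likely more involved.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical toy registry (op name -> handler). This is NOT the ExecuTorch API.
Handler = Callable[[List[float]], List[float]]
OP_REGISTRY: Dict[str, Handler] = {}


def add_flat_copy(src: List[float]) -> List[float]:
    """Flat element-wise copy into a fresh buffer: output[i] = input[i]."""
    out = [0.0] * len(src)
    for i in range(len(src)):
        out[i] = src[i]
    return out


def register(op_name: str, handler: Handler) -> None:
    """Record the handler for an op name (hypothetical helper)."""
    OP_REGISTRY[op_name] = handler


def view_copy_impl(src: List[float]) -> List[float]:
    # Treated as a flat copy in this toy model.
    return add_flat_copy(src)


def clone_impl(src: List[float]) -> List[float]:
    # clone reuses the same helper; no new "shader" is needed.
    return add_flat_copy(src)


register("aten.view_copy.default", view_copy_impl)
register("aten.clone.default", clone_impl)


def partition(ops: Iterable[str], supported: Iterable[str]) -> List[List[str]]:
    """Split an op chain into maximal runs of supported ops.

    An unsupported op ends the current run, which models a graph break.
    """
    supported_set = set(supported)
    segments: List[List[str]] = []
    current: List[str] = []
    for op in ops:
        if op in supported_set:
            current.append(op)
        else:
            if current:
                segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    # Hypothetical three-op chain with a clone in the middle.
    chain = ["aten.mul.Tensor", "aten.clone.default", "aten.add.Tensor"]
    without_clone = {"aten.mul.Tensor", "aten.add.Tensor"}
    with_clone = without_clone | {"aten.clone.default"}
    print(len(partition(chain, without_clone)), "segments without a clone handler")
    print(len(partition(chain, with_clone)), "segments with a clone handler")
```

In this sketch the clone handler adds no new copy logic of its own, which is the design point the description makes: registering the op name is enough to keep the chain in one delegated segment.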