feat: add installer ISO + reorganize image modules#21

Open
phorcys420 wants to merge 16 commits into main from phorcys/installer-iso

phorcys420 commented Jun 10, 2026

Member

Closes #4

Summary

Reorganizes the prebuilt-image modules under nixos/_images/ (with a shared _base/ of primitives) and adds a new installer ISO host.

nixos/ layout

nixos/_images/
  box-turnkey.nix      # moved from nixos/_appliance/; shared turn-key Coder box
  _base/
    hardware.nix       # all-hardware (single source)
    iso.nix            # shared ISO mechanics — no image identity
  _appliance/iso.nix   # was live-iso.nix; appliance identity only
  _installer/iso.nix   # NEW; installer identity (mirrors appliance for now)
  • _base/ holds primitives only; box-turnkey.nix is the shared payload one level up, and imports _base/hardware.nix so the disk host (no iso.nix) still gets all-hardware.

New _installer-iso host

  • hosts/_installer-iso/ builds ISO only (no qcow2/raw).
  • GUI on for now: it keeps the full Coder box (imports configuration.nix via the unchanged mkHost, plus box-turnkey for the coderbox autologin). Differs from the appliance only in image identity (CODER_BOX_INSTALLER, coder-box-installer-<arch>.iso).
  • The minimal/bash-only installer environment and the actual "install onto hardware" workflow are deferred.

Other

  • Appliance isoImage.volumeID changed CODER_BOX_LIVE → CODER_BOX_APPLIANCE.
  • Makefile: added installer/iso + installer/iso/<arch> (reuses box_build).
  • flake.nix: comment-only.
  • Docs: README.md + agents.md updated for the _images/ layout and installer ISO.

Validation (real Nix)

  • nix flake show: 5 hosts incl. _installer-iso.
  • Installer ISO built end-to-end → out/installer-iso/iso/coder-box-installer-x86_64-linux.iso (4.1 GB); asserts: volumeID=CODER_BOX_INSTALLER, xserver.enable=true, coder admin env present, autologin coderbox, hostname coder-box.
  • Appliance ISO/qcow2/raw still evaluate (volumeID=CODER_BOX_APPLIANCE).
  • Install-host regression: coder-thinkcentre and qemu-arm64 toplevel drv hashes byte-identical to origin/main — no install-flow impact.
  • make -n for all five targets resolves to the correct hosts.

@phorcys420 phorcys420 marked this pull request as draft June 10, 2026 13:33
@phorcys420

Member Author
docker run --rm -it --privileged \
  -v "$PWD":/work -w /work \
  -e TMPDIR=/work/tmp \
  -e NIX_CONFIG="experimental-features = nix-command flakes" \
  nixos/nix nix-shell -p gnumake

…nstaller}; add installer ISO

Reorganize the prebuilt-image modules and add a new installer ISO host.

nixos/ layout:
- nixos/_images/_base/hardware.nix  — all-hardware (single source)
- nixos/_images/_base/iso.nix       — shared ISO mechanics (iso-image.nix import,
  EFI/BIOS/USB bootable, bootloader overrides); NO image identity
- nixos/_images/box-turnkey.nix     — moved from nixos/_appliance/; shared turn-key
  Coder box (bake /etc/nixos-repo, registry, coderbox autologin, coder admin env);
  now imports _base/hardware.nix and is shared by all image hosts
- nixos/_images/_appliance/iso.nix  — was nixos/_appliance/live-iso.nix; now only
  sets appliance image identity (volumeID CODER_BOX_APPLIANCE, menu label, baseName)
- nixos/_images/_installer/iso.nix  — NEW; mirrors the appliance ISO (full GUI box +
  turn-key) but with installer identity (CODER_BOX_INSTALLER / coder-box-installer-<arch>)

hosts/:
- _appliance-iso, _appliance-disk: import paths updated to nixos/_images/...
- _installer-iso: NEW host (ISO only)

Makefile: add installer/iso[/<arch>] targets (reuses box_build).
flake.nix: comment-only (installer keeps configuration.nix → GUI on).
Docs: README + agents.md updated for the _images/ layout and installer ISO.

The installer is intentionally GUI-on for now; the minimal/bash-only installer
environment is deferred.

Verified (nix): all 5 hosts eval; installer ISO builds end-to-end (4.1G,
out/installer-iso/iso/coder-box-installer-x86_64-linux.iso) with volumeID
CODER_BOX_INSTALLER, xserver enabled, coder admin env, autologin coderbox,
hostname coder-box; appliance ISO/qcow2/raw still eval (volumeID
CODER_BOX_APPLIANCE); coder-thinkcentre & qemu-arm64 toplevel drv hashes
byte-identical to origin/main (no install-flow regression).
…ance/_installer -> base/appliance/installer)

The _images parent already signals 'not a host'; the subfolders don't need the
underscore. Rename the three subdirs and update all import paths and doc/comment
references. No behavior change.

Verified: all 5 hosts eval; installer + appliance ISO derivations still build.

box-turnkey autologins into the Plasma desktop; for the installer ISO add a
system-wide XDG autostart entry that opens Konsole full-screen on session start
with --workdir /etc/nixos-repo (the baked coder/box repo). Installer-only;
the appliance keeps a normal desktop.

Login is already passwordless (coderbox autologin + passwordless sudo); the
only remaining password gate was KDE's screen locker (idle auto-lock /
lock-on-resume). Disable it system-wide for the installer via kscreenlockerrc
(Autolock=false, LockOnResume=false) so the installer is never locked. Appliance
keeps the default locker.
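
For illustration, the locker settings named above correspond to a small INI-style file. The sketch below writes one to a directory of your choice; `write_no_lock_config` is a hypothetical helper, and placing both keys under the `[Daemon]` group is an assumption based on how KDE normally lays out kscreenlockerrc. The PR itself sets this declaratively in the installer's Nix module, not via a script.

```shell
# Sketch only: write a system-wide kscreenlockerrc that turns off the KDE
# screen locker. write_no_lock_config is a hypothetical name, and the
# [Daemon] group is an assumption (the commit only names the two keys).
write_no_lock_config() {
  local xdg_dir="${1:-/etc/xdg}"
  mkdir -p "$xdg_dir" || return 1
  cat > "$xdg_dir/kscreenlockerrc" <<'EOF'
[Daemon]
Autolock=false
LockOnResume=false
EOF
}
```

Calling `write_no_lock_config /tmp/xdg-test` lets you inspect the generated file without touching /etc/xdg.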
…into writable git repo)

On the installer/appliance ISO, /etc/nixos-repo is a symlink into the read-only
Nix store with no .git, so install.sh's writes (mkdir hosts/<host>) failed with
'Read-only file system', and there was no git repo to leave on the installed
box. Fix:

- install.sh: when REPO_DIR is read-only or not a git repo, clone the upstream
  repo (CODER_BOX_REPO_URL / CODER_BOX_REPO_REF, default public coder/box main)
  into a writable tmpdir (tmpfs/RAM on the live ISO) and re-exec from there. The
  clone is a real git repo (origin + tracking branch), so after cp -a to
  /mnt the installed /etc/nixos-repo supports 'git pull'. The normal live-USB
  flow (already a writable git clone) is unchanged. git add --intent-to-add is
  now unconditional since REPO_DIR is always a git repo.
- nixos/_images/installer/iso.nix: set vm.overcommit_memory=1 (mirrors nixpkgs
  installation-device.nix) for low-RAM installs; document why install.sh works
  on the read-only ISO (writable tmpfs /nix/store overlay; closure builds into
  the target /mnt/nix/store).

Verified: clone-relocation triggers on a read-only no-.git dir and re-execs;
normal writable git repo does not clone; installer ISO still evaluates.
@phorcys420 phorcys420 force-pushed the phorcys/installer-iso branch from 0eaafcf to 22ecf1e June 10, 2026 18:16
… git pull

Per request, don't re-clone from the network when install.sh runs on the ISO —
the full repo is already baked into the image. When REPO_DIR is read-only / not
a git repo, copy the baked tree into a writable tmpdir, 'git init' it locally
(commit a snapshot labeled with the baked upstream rev), and set origin +
branch.main.merge so the installed /etc/nixos-repo can 'git pull' future updates.
No network needed at install time. CODER_BOX_REPO_URL only sets the origin URL.

Record provenance: box-turnkey.nix writes /etc/coder-box-rev from self.rev (or
dirtyRev), which install.sh uses to label the initial commit.

Verified: offline copy+init path materializes a git repo with origin and
tracking branch (no clone); installer ISO still evaluates; rev file populated.
…ll), offline fallback

Mirror the manual 'copy /etc/nixos-repo somewhere writable, make it git, run
install.sh there' workflow precisely: copy the baked repo (no full re-download),
git init + add origin, then — if online — 'git fetch' upstream and 'reset --hard'
to it so the installed /etc/nixos-repo has real upstream history and future
'git pull' is clean (no unrelated-histories reconcile). Offline / fetch failure
falls back to the baked snapshot commit (install still works; first pull may
need --rebase). A fetch transfers deltas, not a full clone.

Verified both paths locally against a fake upstream (online anchors + sets
tracking branch; offline keeps snapshot); installer ISO still evaluates.
…wizardry

Per feedback:
- Drop the .coder-box-write-test probe; detect read-only REPO_DIR with [[ ! -w ]].
  On NixOS /etc/nixos-repo is a read-only *mount*, so access(W_OK)=EROFS makes
  -w false even for root.
- Remove all git init/commit/remote-add/fetch logic (it failed: the baked repo
  already has a .git with origin, so 'git remote add origin' errored 'already
  exists'). Now we just 'cp -a' the baked repo to a writable tmpdir and re-exec.
  cp -a keeps the baked .git (if present), so the installed /etc/nixos-repo can
  still 'git pull'; if absent, install still works as a path flake.
- Guard the intent-to-add (skip when the copy has no .git).
- Remove the now-unused /etc/coder-box-rev from box-turnkey.nix.

Verified: relocation copies + re-execs cleanly with and without a baked .git
(no 'origin already exists'/'re-init' noise); installer ISO still evaluates.
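
A minimal sketch of the relocation described above, assuming helper names (`needs_relocation`, `relocate_repo`) that do not appear in install.sh. It uses the POSIX `[ ! -w ]` form of the `[[ ! -w ]]` test, copies `SRC/.` so that a symlinked /etc/nixos-repo is followed instead of copied as a link, and adds a `chmod -R u+w` that the commit does not mention.

```shell
# needs_relocation DIR: succeed when DIR is not writable (for example a
# read-only mount, where the check fails even for root).
needs_relocation() {
  [ ! -w "$1" ]
}

# relocate_repo SRC: copy SRC (including any .git) into a fresh temp dir
# and print the path of the copy.
relocate_repo() {
  local from="$1" tmp to
  tmp="$(mktemp -d)" || return 1
  to="$tmp/box"
  mkdir -p "$to" || return 1
  cp -a "$from/." "$to/" || return 1
  # Assumption, not from the PR: store paths are read-only, so restore
  # owner write permission on the copy.
  chmod -R u+w "$to" || return 1
  printf '%s\n' "$to"
}

# Hypothetical use near the top of an installer script:
#   if needs_relocation "$REPO_DIR"; then
#     copy="$(relocate_repo "$REPO_DIR")" && cd "$copy" && exec ./install.sh "$@"
#   fi
```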
…ole; drop to bash on failure

- install.sh: list_disks now skips /dev/zram* (and dm-/md/loop) — zram reports
  TYPE=disk RM=0 so it was wrongly offered as an install target.
- installer iso.nix: the preopened full-screen Konsole now runs a launcher
  (konsole -e) that cd's to /etc/nixos-repo and runs 'sudo ./install.sh'
  (passwordless sudo), then execs an interactive bash shell if install.sh
  fails — so a failed install leaves the user at a usable prompt instead of a
  dead terminal. On success install.sh reboots.

Verified: zram filter keeps vda/sda, drops zram/loop/rom; installer ISO
evaluates; Konsole Exec runs the launcher; bash -n clean.
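
The disk filter can be sketched as a pure function over `lsblk` rows. This is an illustration, not the actual `list_disks`: the name `filter_install_targets`, the three-column input (`lsblk -dn -o NAME,TYPE,RM`), and the TYPE=disk / RM=0 criteria are assumptions inferred from the commit message.

```shell
# filter_install_targets: read "NAME TYPE RM" rows on stdin and print the
# device paths that look like real install targets.
filter_install_targets() {
  local name type rm
  while read -r name type rm; do
    [ "$type" = "disk" ] || continue      # drop rom, loop, part, ...
    [ "$rm" = "0" ] || continue           # drop removable media
    case "$name" in
      zram*|dm-*|md*|loop*) continue ;;   # virtual or stacked devices
    esac
    printf '/dev/%s\n' "$name"
  done
}

# Hypothetical use: lsblk -dn -o NAME,TYPE,RM | filter_install_targets
```

Keeping the filter separate from the `lsblk` call makes it easy to check against canned rows, which is how the zram case can be verified without real hardware.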
…ate failure on the box ISO)

When installing FROM a Coder box image (installer/appliance ISO), the target's
system closure already exists in the live /nix/store — but in the read-only
squashfs lower layer of the overlay. nixos-install builds the flake with
'nix build --store /mnt --extra-substituters <host-store>', i.e. it substitutes
the closure into /mnt from the host store; the squashfs paths lack the
signatures/narinfo substitution needs, so the copy silently yields nothing.
/mnt is left empty (no bash, system profile points at nixos-repo-src) and the
chroot 'activate' fails: 'No such file or directory'. (Confirmed on-device:
/mnt/nix/store had no bash, system-1-link -> ...-nixos-repo-src, /mnt 34M used.)

Fix: when the target closure is already present in the live store, copy it into
/mnt explicitly with 'nix copy --no-check-sigs' (bypasses the
signature/substituter machinery); nixos-install then finds it and just
activates. On a stock NixOS live ISO the closure isn't present, so we skip the
copy and let nixos-install build straight into /mnt as before (no extra
tmpfs/RAM use). Pure 'nix eval' (no build) gets the path so the stock-ISO case
isn't forced to realise the closure in the host store.

Verified: bash -n clean; nix eval --raw returns the toplevel out path for the
presence check. Needs on-device confirmation on the installer ISO.
@phorcys420

Member Author

TODO: handle Wi-Fi

…al fix for chroot activate failure)

My previous attempt gated the nix-copy on the target toplevel already being
realised in the live store. But install.sh installs host 'coder-nixos', which
is a DIFFERENT system than the image's own host (_installer-iso), so its
toplevel is never pre-realised — the check failed, the copy was skipped, and
nixos-install fell back to the broken squashfs substitution. That's why the
auto-relocated install failed while the user's manual 'copy to /tmp/box and run'
worked.

Fix: on a baked image (relocation now exports CODER_BOX_FROM_IMAGE=1),
explicitly build the coder-nixos toplevel (cheap — KDE/Coder/k3s/etc. are reused
from the squashfs; only host-specific derivations are new), 'nix copy
--no-check-sigs' its full closure into /mnt, then 'nixos-install --system'
activates it. The squashfs-substitution-into-/mnt path (which silently no-ops on
signature-less store paths) is no longer used. Stock live-USB installs keep the
original 'nixos-install --flake' (direct build into /mnt; no tmpfs OOM).

Verified: bash -n clean; nixos-install --system/--no-channel-copy/--no-root-passwd
all valid in the pinned nixpkgs. Needs on-device confirmation.
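
The two install paths can be summarized as a dry-run planner that only prints the commands it would run. This is a sketch under assumptions: `plan_install` is a hypothetical name, the flake attribute path and the `local?root=/mnt` store URL are the conventional forms and are not quoted from install.sh, and the real script may pass additional flags.

```shell
# plan_install HOST: print (do not run) the commands for the chosen path.
# CODER_BOX_FROM_IMAGE=1 selects the baked-image path described above.
plan_install() {
  local host="${1:-coder-nixos}" attr
  attr=".#nixosConfigurations.${host}.config.system.build.toplevel"
  if [ "${CODER_BOX_FROM_IMAGE:-0}" = "1" ]; then
    # Build in the live store, copy the closure without signature checks,
    # then let nixos-install activate the already-copied system.
    printf '%s\n' "toplevel=\$(nix build --no-link --print-out-paths '$attr')"
    printf '%s\n' "nix copy --no-check-sigs --to 'local?root=/mnt' \"\$toplevel\""
    printf '%s\n' "nixos-install --root /mnt --system \"\$toplevel\" --no-channel-copy --no-root-passwd"
  else
    # Stock live USB: build straight into /mnt.
    printf '%s\n' "nixos-install --root /mnt --flake '.#${host}'"
  fi
}
```

Running `CODER_BOX_FROM_IMAGE=1 plan_install coder-nixos` prints the three-step image path; without the variable it prints the single stock command.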

- nixos/_images/installer/iso.nix: take 'self', derive boxRev (self.rev /
  dirtyRev / unknown) and a 12-char short form; append the short rev to the
  boot-menu label -> 'NixOS <ver> - Coder Box Installer (<rev>)'. Also write the
  full rev to /etc/coder-box-rev (the baked /etc/nixos-repo has no .git, so
  install.sh can't get it from git on the image).
- install.sh: print '  revision: <rev>' under the banner. Resolves via git
  (live-USB clone / fork checkout), else /etc/coder-box-rev, else 'unknown'.

Verified: installer ISO evaluates; menu label = ' - Coder Box Installer
(7ac3deb)'; /etc/coder-box-rev populated; bash -n clean.
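
The lookup order for the banner revision (git, then /etc/coder-box-rev, then 'unknown') could look like the sketch below. `resolve_rev` is a hypothetical name, and checking for `.git` directly inside the repo directory is an added guard so that a parent repository is never picked up by accident.

```shell
# resolve_rev REPO_DIR [REV_FILE]: print the revision to show in the banner.
resolve_rev() {
  local repo_dir="$1" rev_file="${2:-/etc/coder-box-rev}" rev
  # 1. A real checkout: ask git (only when REPO_DIR itself holds .git).
  if [ -e "$repo_dir/.git" ] &&
     rev="$(git -C "$repo_dir" rev-parse HEAD 2>/dev/null)" &&
     [ -n "$rev" ]; then
    printf '%s\n' "$rev"
  # 2. A baked image: fall back to the revision file written at build time.
  elif [ -s "$rev_file" ]; then
    cat "$rev_file"
  # 3. Nothing available.
  else
    printf 'unknown\n'
  fi
}

# Hypothetical use: echo "  revision: $(resolve_rev "$REPO_DIR")"
```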

The installer ISO inherits the full box config but its only job is to install
coder/box onto a disk — running the Coder server, k3s, PostgreSQL, Podman, and
the bootstrap/redirect/reaper/logstream units + template-sync activation in the
live environment is dead weight (slow boot, wasted RAM/CPU during install).
Disable them in the installer module:
- services.coder-nixos.k3s-sysbox / postgresql / podman -> mkForce false
- systemd coder, coder-init-admin, coder-redirect, coder-logstream-kube,
  coder-workspace-reaper (+ timer), coder-sync-ssh-keys -> enable = false
- coder-template-sync activation script -> mkForce ""
The INSTALLED system still gets everything; this only affects the live ISO.

Verified: installer ISO evaluates; all listed services enable=false (k3s off);
appliance still has them all enabled; coder-thinkcentre toplevel drv unchanged.

The boot menu showed '(unknown)' because the Makefile builds through a *path*
flakeref — getFlake (toString ./.) — which carries NO git metadata, so
self.rev/dirtyRev were empty (a previous _module.args attempt also threw
'attribute coderBoxRev missing' under .# builds because the formal-arg default
isn't honored).

Add a proper NixOS option 'coderBox.rev' in the shared box-turnkey.nix
(defined there so it exists for every image host the Makefile builds), defaulting
to self.rev/dirtyRev for .# (git+file) builds. The Makefile computes the git rev
(GIT_REV, full hash + -dirty) and overrides it via 'coderBox.rev' on every
target. box-turnkey's config is now wrapped in 'config = { }' as required when a
module declares options.

Verified: installer boot label shows (78ac95f) via both Makefile-style and
.# builds; appliance ISO + disk still evaluate with coderBox.rev set; all 5
hosts visible.
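
One way to compute a "full hash + -dirty" value like GIT_REV is sketched here. The commit does not show the Makefile recipe, so `git_rev_with_dirty` and the use of `git status --porcelain` are assumptions; note that this check also treats untracked files as dirty.

```shell
# git_rev_with_dirty [DIR]: print the full commit hash of DIR, with a
# "-dirty" suffix when the work tree has uncommitted changes, or "unknown"
# when DIR is not a git repository.
git_rev_with_dirty() {
  local dir="${1:-.}" rev
  rev="$(git -C "$dir" rev-parse HEAD 2>/dev/null)" || {
    printf 'unknown\n'
    return 0
  }
  if [ -n "$(git -C "$dir" status --porcelain 2>/dev/null)" ]; then
    rev="${rev}-dirty"
  fi
  printf '%s\n' "$rev"
}
```

Because a path flakeref carries no git metadata, computing the value outside Nix and passing it in through an option such as coderBox.rev keeps the boot label accurate for both build styles.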

Revert the *.tar.gz ignore added in cc5670d: the only tarball in the tree is the
build-critical pkgs/sysbox-runc-vendor-*.tar.gz vendor source, and tarballs here
are intentionally committed, so a blanket ignore is wrong. The hosts/ ignore
(user hosts vs centrally-managed) stays.
@phorcys420 phorcys420 changed the title from "Add installer ISO + reorganize image modules under nixos/_images/" to "feat: add installer ISO + reorganize image modules" Jun 11, 2026
@phorcys420 phorcys420 marked this pull request as ready for review June 11, 2026 10:03

@bpmct bpmct left a comment

Member

Code changes LGTM. Building the installer ISO now and will try on my machine when I get home :)


Development

Successfully merging this pull request may close these issues.

feat: add Live installer ISO

2 participants