nix-config

Author	SHA1	Message	Date
jasmine	d21b36a1b0	fix(borgbackup): remove persistent timers to prevent post-resume failures Removes persistentTimer from all borgbackup services and unnecessary network-online.target dependencies. Changes fuchsia offsite to 14:00 fixed schedule when system is reliably awake. Persistent timer catch-ups immediately after system resume caused failures due to services starting before network/system fully stabilized: - Onsite: DNS resolution failures (viridian.home.arpa) - Offsite: BorgBase connection refusals during SSH/borg handshake Fixed schedules provide reliable backups without catch-up complexity: - fuchsia offsite: 14:00 daily (typical awake time for desktop) - viridian offsite: midnight daily (always-on server) - All onsite: hourly (no catch-up needed) Offsite services retain wants/after dependencies on onsite completion to prevent race conditions on shared /btrfs-subvolumes snapshot paths. Network dependencies removed as fixed schedules run when system is already stable, eliminating timing issues with network-online.target.	2025-10-14 09:59:27 +08:00
jasmine	4389500ccc	fix(borgbackup): add network dependencies to onsite services Fixes DNS resolution failures when persistent timers trigger backups after system wake. The NixOS borgbackup module adds network-online.target dependencies to the timer when persistentTimer=true, but systemd timers don't pass their dependencies to the services they trigger. This caused onsite backups to start before the network was ready, resulting in "Could not resolve hostname" errors. Adding after/wants network-online.target directly to the service ensures the backup waits for network availability regardless of how it's triggered (timer or offsite's Wants= dependency). Example failure (Oct 11, 07:43): - Backup started at 07:43:43 (persistent timer caught up) - DNS lookup failed: "Could not resolve hostname viridian.home.arpa" - WiFi connected at 07:43:47 (4 seconds too late) Applied to both fuchsia and viridian onsite backups.	2025-10-11 08:06:00 +08:00
jasmine	a276fdf53a	fix(borgbackup): prevent race conditions and improve reliability Fixes multiple issues with borgbackup service coordination: 1. Race condition between onsite/offsite backups - Set Type=oneshot to ensure services wait for completion - Added Wants= dependency to trigger onsite when offsite runs - Prevents snapshot path collision at /btrfs-subvolumes 2. Network unavailability after sleep/wake - Added persistentTimer=true to onsite backups - NixOS module now auto-adds network-online.target dependencies - Fixes DNS resolution failures for SSH repos 3. Data loss risk from missed backups - Persistent timers ensure backups run on wake if missed - Protects work done before sleep from being left without a backup 4. Duplicate onsite runs at midnight - Removed 15-minute stagger (00:15 -> 00:00) - Systemd deduplicates services in same transaction - Onsite now runs once, not twice Applied to both fuchsia and viridian for consistency.	2025-10-09 11:30:00 +08:00
jasmine	15b4851e8e	refactor(borgbackup): implement shared staging with defense-in-depth Major improvements to borgbackup configuration for better reliability and maintainability: Shared staging directory: - Use single /btrfs-subvolumes directory (was /subvolumes-{onsite,offsite}) - Eliminates redundant path suffixes in archive structure - Archive paths now semantic: /btrfs-subvolumes/srv-forgejo clearly indicates BTRFS subvolume content without redundant backup job metadata Defense-in-depth protection: - Layer 1: Systemd ordering - offsite waits for onsite completion - Layer 2: Self-healing preHook - auto-cleanup orphaned snapshots from crashes/power loss - Prevents cascading failures from race conditions or abnormal terminations Code quality improvements: - Extract subvolume lists to reduce duplication (DRY principle) - Add /* sh */ syntax hints for proper editor highlighting - Silent operation for consistency with existing hooks - Improved readability with clearer comments and formatting - All lines ≤ 100 characters Timing: - Offsite: *-*-* 00:15:00 (daily at 12:15 AM, waits for onsite) - Onsite: hourly (unchanged)	2025-10-08 18:46:50 +08:00
jasmine	37924375a2	refactor(borgbackup): backup from /persist paths instead of bind mounts Update backup paths to use actual persistent storage locations (/persist/*) rather than bind-mounted paths, making it clear where data truly resides and simplifying restore operations.	2025-10-08 15:58:23 +08:00
jasmine	26c08000a0	refactor(borgbackup): use visible directories with semantic subvolume names Changes staging directories from hidden to visible and aligns backup paths with actual BTRFS subvolume naming conventions for better clarity when browsing archives.	2025-10-08 15:53:15 +08:00
jasmine	f24a7476a7	feat(viridian): add explicit persist data to backup strategy Add critical system state from persist.nix to borgbackup jobs: - SSH host keys (required for borg authentication) - machine-id and nixos state - Network and bluetooth configurations Paths mirror persist.nix configuration for maintainability. Service-specific persist data (traefik, crowdsec) excluded - will create dedicated subvolumes if/when needed.	2025-10-07 17:06:45 +08:00
jasmine	7833d89d86	fix(viridian): resolve backup system initialization issues Fix snapper and borgbackup jobs to work with ephemeral-btrfs setup: Snapper fixes: - Remove global /.snapshots mount (use nested subvolumes instead) - Remove unused hostname variable - Snapshots now stored in .snapshots subvolumes within each service Borgbackup fixes: - Add systemd.tmpfiles.rules to create staging directories at boot - Add readWritePaths for staging directories (systemd sandboxing) - Staging directories survive ephemeral root wipes Architecture notes: - Nested .snapshots subvolumes don't require separate mounts - systemd tmpfiles ensures directories exist before services start - ProtectSystem=strict requires explicit ReadWritePaths allowlist	2025-10-07 09:38:07 +08:00
jasmine	c05598d9e0	feat(viridian): implement comprehensive 3-2-1 backup strategy Add automated snapshot and backup system with three independent tiers: Snapper (hourly local snapshots): - Configure snapper for all srv-* subvolumes - Tiered retention: 24 hourly, 7 daily, 4 weekly, 12 monthly - Snapshots stored at /.snapshots on viridian drive - Provides fast operational rollback for user errors Borgbackup onsite (hourly local backups): - Independent staging snapshots at /.staging-onsite - Repository on data drive at /srv/borg-repo - Unencrypted (physical security assumed) - Matches snapper retention policy - Fast local disaster recovery Borgbackup offsite (daily remote backups): - Independent staging snapshots at /.staging-offsite - Encrypted backups to borgbase repository - Retention: 7 daily, 4 weekly, 12 monthly - Remote disaster recovery with prune policy Architecture decisions: - Separate staging directories prevent job conflicts - Staging snapshots decouple borg jobs from snapper - Consistent zstd,9 compression across both borg jobs - Special case handling for containers subvolume path	2025-10-06 20:59:26 +08:00

9 commits
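
The state reached by the most recent commit (d21b36a1b0) can be sketched as a NixOS module fragment. This is only an illustration under stated assumptions, not the repository's actual code: the job names `onsite` and `offsite`, both repository URLs, the encryption modes, and the passphrase path are placeholders that do not appear in the log. The option names (`services.borgbackup.jobs.<name>.startAt`, `persistentTimer`, and the generated `borgbackup-job-<name>` units) are those of the NixOS borgbackup module.

```nix
{ ... }:
{
  services.borgbackup.jobs = {
    # Hypothetical job name; hourly on a fixed schedule, no catch-up after resume.
    onsite = {
      paths = [ "/btrfs-subvolumes" ];
      repo = "ssh://borg@viridian.home.arpa/./placeholder-repo"; # placeholder path
      encryption.mode = "none";        # assumption for this sketch
      compression = "zstd,9";
      startAt = "hourly";
      persistentTimer = false;         # module default, shown for emphasis
    };

    # Hypothetical job name; fixed 14:00 slot as described for fuchsia.
    offsite = {
      paths = [ "/btrfs-subvolumes" ];
      repo = "ssh://user@example.invalid/./repo";   # placeholder, not a real endpoint
      encryption = {
        mode = "repokey-blake2";                    # assumption for this sketch
        passCommand = "cat /path/to/passphrase";    # placeholder
      };
      compression = "zstd,9";
      startAt = "*-*-* 14:00:00";
      persistentTimer = false;
    };
  };

  # Offsite pulls in onsite and waits for it, so the two jobs never
  # touch the shared /btrfs-subvolumes snapshot paths at the same time.
  systemd.services."borgbackup-job-offsite" = {
    wants = [ "borgbackup-job-onsite.service" ];
    after = [ "borgbackup-job-onsite.service" ];
  };
}
```

The ordering only serialises the jobs if the onsite unit is `Type=oneshot`, which a276fdf53a sets explicitly. With that type, `after` holds the offsite job back until the onsite run has finished rather than merely started.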