Commit graph

182 commits

4389500ccc
fix(borgbackup): add network dependencies to onsite services
Fixes DNS resolution failures when persistent timers trigger backups
after system wake.

The NixOS borgbackup module adds network-online.target dependencies
to the timer when persistentTimer=true, but systemd timers don't pass
their dependencies to the services they trigger. This caused onsite
backups to start before the network was ready, resulting in "Could not
resolve hostname" errors.

Adding after/wants network-online.target directly to the service
ensures the backup waits for network availability regardless of how
it's triggered (timer or offsite's Wants= dependency).
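
A minimal sketch of the service-level dependency described above. It assumes the job is named `onsite`; the NixOS borgbackup module names the generated unit `borgbackup-job-<name>`, but the actual job names used in this repository are not shown in this log:

```nix
{
  # Assumption: job name "onsite", hence unit borgbackup-job-onsite.service.
  systemd.services."borgbackup-job-onsite" = {
    # Order the service itself (not only its timer) after the network is online.
    after = [ "network-online.target" ];
    # Pull network-online.target into the transaction so After= has something to wait for.
    wants = [ "network-online.target" ];
  };
}
```

Because the ordering lives on the service, it should apply whether the unit is started by its timer or pulled in through another unit's `Wants=`.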

Example failure (Oct 11, 07:43):
- Backup started at 07:43:43 (persistent timer caught up)
- DNS lookup failed: "Could not resolve hostname viridian.home.arpa"
- WiFi connected at 07:43:47 (4 seconds too late)

Applied to both fuchsia and viridian onsite backups.
2025-10-11 08:06:00 +08:00
a276fdf53a
fix(borgbackup): prevent race conditions and improve reliability
Fixes multiple issues with borgbackup service coordination:

1. Race condition between onsite/offsite backups
   - Set Type=oneshot to ensure services wait for completion
   - Added Wants= dependency to trigger onsite when offsite runs
   - Prevents snapshot path collision at /btrfs-subvolumes

2. Network unavailability after sleep/wake
   - Added persistentTimer=true to onsite backups
   - NixOS module now auto-adds network-online.target dependencies
   - Fixes DNS resolution failures for SSH repos

3. Data loss risk from missed backups
   - Persistent timers ensure backups run on wake if missed
   - Protects work done before sleep from being left without a backup

4. Duplicate onsite runs at midnight
   - Removed 15-minute stagger (00:15 -> 00:00)
   - systemd deduplicates services in the same transaction
   - Onsite now runs once, not twice
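
The coordination in points 1 to 4 could look roughly like the following. This is a sketch under assumptions, not the committed configuration: the job names `onsite` and `offsite` are hypothetical, and the `after` ordering is carried over from the earlier shared-staging refactor rather than stated in this message:

```nix
{
  # Point 2/3: catch up on a missed run after sleep or downtime.
  services.borgbackup.jobs.onsite.persistentTimer = true;

  # Point 1: with Type=oneshot, a unit ordered After= this one waits
  # until the backup process has exited, not merely started.
  systemd.services."borgbackup-job-onsite".serviceConfig.Type = "oneshot";

  systemd.services."borgbackup-job-offsite" = {
    serviceConfig.Type = "oneshot";
    # Starting offsite also requests an onsite run ...
    wants = [ "borgbackup-job-onsite.service" ];
    # ... and offsite only begins once that run has finished (assumed ordering).
    after = [ "borgbackup-job-onsite.service" ];
  };
}
```

With both timers firing at 00:00, the onsite start requested by the timer and the one requested through `wants` refer to the same unit, which is why a single run results.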

Applied to both fuchsia and viridian for consistency.
2025-10-09 11:30:00 +08:00
15b4851e8e
refactor(borgbackup): implement shared staging with defense-in-depth
Major improvements to borgbackup configuration for better reliability and
maintainability:

**Shared staging directory:**
- Use single /btrfs-subvolumes directory (was /subvolumes-{onsite,offsite})
- Eliminates redundant path suffixes in archive structure
- Archive paths now semantic: /btrfs-subvolumes/srv-forgejo clearly indicates
  BTRFS subvolume content without redundant backup job metadata

**Defense-in-depth protection:**
- Layer 1: Systemd ordering - offsite waits for onsite completion
- Layer 2: Self-healing preHook - auto-cleanup orphaned snapshots from
  crashes/power loss
- Prevents cascading failures from race conditions or abnormal terminations
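
One way the Layer 2 self-healing hook might be written is sketched below. The job name `onsite`, the loop, and the use of `btrfs subvolume delete` are assumptions for illustration; the actual hook is not shown in this log. It also assumes no other job is using the staging directory at that moment, which Layer 1 is meant to guarantee:

```nix
{ pkgs, ... }:
{
  # Assumption: job name "onsite"; staging path taken from the message above.
  services.borgbackup.jobs.onsite.preHook = /* sh */ ''
    # Delete staging snapshots left behind by a crash or power loss.
    for snapshot in /btrfs-subvolumes/*; do
      [ -e "$snapshot" ] || continue
      # Silent on purpose; a failed delete must not abort the backup.
      ${pkgs.btrfs-progs}/bin/btrfs subvolume delete "$snapshot" > /dev/null 2>&1 || true
    done
  '';
}
```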

**Code quality improvements:**
- Extract subvolume lists to reduce duplication (DRY principle)
- Add /* sh */ syntax hints for proper editor highlighting
- Silent operation for consistency with existing hooks
- Improved readability with clearer comments and formatting
- All lines ≤ 100 characters

**Timing:**
- Offsite: *-*-* 00:15:00 (daily at 12:15 AM, waits for onsite)
- Onsite: hourly (unchanged)
2025-10-08 18:46:50 +08:00
37924375a2
refactor(borgbackup): backup from /persist paths instead of bind mounts
Update backup paths to use actual persistent storage locations (/persist/*) rather than bind-mounted paths, making it clear where data truly resides and simplifying restore operations.
2025-10-08 15:58:23 +08:00
26c08000a0
refactor(borgbackup): use visible directories with semantic subvolume names
Changes staging directories from hidden to visible and aligns backup paths with actual BTRFS subvolume naming conventions for better clarity when browsing archives.
2025-10-08 15:53:15 +08:00
359d01c407
fix(borgbackup): enable persistent timers for offsite backups
Adds persistentTimer=true to both fuchsia and viridian offsite backup configurations to ensure backups run on next boot if the system was asleep at the scheduled time. Without this, daily backups would be skipped entirely until the next scheduled run.
2025-10-08 08:04:57 +08:00
8874c88fbc
fix(ssh): enable key-based root login and use FQDNs for system services
Fixes backup system authentication and hostname resolution issues.

Changes:
- Change PermitRootLogin from "no" to "prohibit-password" in global SSH config
  (allows key-based root login for host-to-host backups while blocking passwords)
- Update fuchsia onsite backup to use viridian.home.arpa FQDN instead of shortname
- Update SSH knownHosts to use FQDNs (fuchsia.home.arpa, viridian.home.arpa)
  (system-level config uses FQDNs, user shortcuts remain in home-manager)
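
As a rough illustration of the first two changes, assuming a borgbackup job named `onsite` on fuchsia (hypothetical name) and the repository path from the earlier fuchsia commit with the FQDN substituted:

```nix
{
  # Key-based root login only; password logins for root stay blocked.
  services.openssh.settings.PermitRootLogin = "prohibit-password";

  # Assumed result of switching from the short name to the FQDN.
  services.borgbackup.jobs.onsite.repo =
    "ssh://viridian.home.arpa/srv/borg-repo/fuchsia";
}
```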

This enables the complete 3-2-1 backup strategy with automated backups working
correctly between fuchsia and viridian, and fuchsia to BorgBase.
2025-10-07 23:11:31 +08:00
85dc419349
refactor(ssh): decentralize SSH configuration to per-host services
Restructures SSH trust relationships from global to host-specific configuration
for better locality of concern and principle of least privilege.

Changes:
- Collapse nixos/common/global/ssh/ back to ssh.nix (single-file module)
- Move internal host trust (fuchsia/viridian) to per-host services/ssh/
- Split BorgBase known hosts by repository (li9kg944 for fuchsia, r7ag7x1w for viridian)
- Add viridian SSH server config to accept backup connections from fuchsia
- Add fuchsia borgbackup passphrase for offsite backups
- Configure viridian to create /srv/borg-repo/fuchsia for remote backups

This enables the 3-2-1 backup strategy with fuchsia backing up to both viridian
(onsite) and BorgBase (offsite) with proper SSH authentication.
2025-10-07 22:33:20 +08:00
acab920858
WIP: SSH configuration restructure
Backup of SSH reorganization changes for future reference.
2025-10-07 20:58:09 +08:00
a6fa8866ac
feat(fuchsia): implement backup strategy with explicit home paths
Add snapper and borgbackup for fuchsia home directory backups:

Snapper Configuration:
- Hourly snapshots of /home/sajenim
- Retention: 24 hourly, 7 daily, 4 weekly, 12 monthly
- Stored in nested .snapshots subvolume

Borgbackup Onsite:
- Backup to viridian over SSH (local network)
- Target: ssh://viridian/srv/borg-repo/fuchsia
- Hourly backups, unencrypted, deduplicated
- Same retention as snapper

Borgbackup Offsite:
- Backup to borgbase (internet)
- Target: li9kg944@li9kg944.repo.borgbase.com:repo
- Daily backups, encrypted (repokey-blake2), deduplicated
- Retention: 7 daily, 4 weekly, 12 monthly

Explicit Home Paths (valuable user data only):
- Documents, Pictures, Videos, Music, Downloads, Academics, Notes
- Dotfiles: .ssh, .gnupg

System Persist Data:
- SSH host keys, machine-id, nixos state
- Bluetooth, NetworkManager configurations

Intentionally Excluded:
- .config (managed declaratively via home-manager)
- .repositories (cloneable from GitHub)
- .cache and build artifacts

Treats viridian as central backup server, maintaining 3-2-1 strategy
(3 copies, 2 different storage media, 1 offsite).

chore(viridian): remove unused inputs parameter from borgbackup offsite
2025-10-07 19:14:11 +08:00
f24a7476a7
feat(viridian): add explicit persist data to backup strategy
Add critical system state from persist.nix to borgbackup jobs:
- SSH host keys (required for borg authentication)
- machine-id and nixos state
- Network and bluetooth configurations

Paths mirror persist.nix configuration for maintainability.
Service-specific persist data (traefik, crowdsec) excluded -
will create dedicated subvolumes if/when needed.
2025-10-07 17:06:45 +08:00
7833d89d86
fix(viridian): resolve backup system initialization issues
Fix snapper and borgbackup jobs to work with ephemeral-btrfs setup:

Snapper fixes:
- Remove global /.snapshots mount (use nested subvolumes instead)
- Remove unused hostname variable
- Snapshots now stored in .snapshots subvolumes within each service

Borgbackup fixes:
- Add systemd.tmpfiles.rules to create staging directories at boot
- Add readWritePaths for staging directories (systemd sandboxing)
- Staging directories survive ephemeral root wipes

Architecture notes:
- Nested .snapshots subvolumes don't require separate mounts
- systemd tmpfiles ensures directories exist before services start
- ProtectSystem=strict requires explicit ReadWritePaths allowlist
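
The borgbackup fixes above might be expressed along these lines. The staging paths come from the earlier 3-2-1 commit; the job names, mode, and ownership are assumptions:

```nix
{
  # Create the staging directories at boot, since the ephemeral root is wiped.
  systemd.tmpfiles.rules = [
    "d /.staging-onsite 0755 root root -"
    "d /.staging-offsite 0755 root root -"
  ];

  # The module sandboxes jobs with ProtectSystem=strict, so writable
  # locations have to be listed explicitly.
  services.borgbackup.jobs.onsite.readWritePaths = [ "/.staging-onsite" ];
  services.borgbackup.jobs.offsite.readWritePaths = [ "/.staging-offsite" ];
}
```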
2025-10-07 09:38:07 +08:00
c05598d9e0
feat(viridian): implement comprehensive 3-2-1 backup strategy
Add automated snapshot and backup system with three independent tiers:

Snapper (hourly local snapshots):
- Configure snapper for all srv-* subvolumes
- Tiered retention: 24 hourly, 7 daily, 4 weekly, 12 monthly
- Snapshots stored at /.snapshots on viridian drive
- Provides fast operational rollback for user errors

Borgbackup onsite (hourly local backups):
- Independent staging snapshots at /.staging-onsite
- Repository on data drive at /srv/borg-repo
- Unencrypted (physical security assumed)
- Matches snapper retention policy
- Fast local disaster recovery

Borgbackup offsite (daily remote backups):
- Independent staging snapshots at /.staging-offsite
- Encrypted backups to borgbase repository
- Retention: 7 daily, 4 weekly, 12 monthly
- Remote disaster recovery with prune policy

Architecture decisions:
- Separate staging directories prevent job conflicts
- Staging snapshots decouple borg jobs from snapper
- Consistent zstd,9 compression across both borg jobs
- Special case handling for containers subvolume path
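
For reference, the onsite tier described above maps fairly directly onto the NixOS borgbackup job options. This sketch assumes a job named `onsite` and that the staging directory itself is what gets archived; neither detail is confirmed by the message:

```nix
{
  services.borgbackup.jobs.onsite = {
    paths = [ "/.staging-onsite" ];   # assumption: archive the staging snapshots
    repo = "/srv/borg-repo";
    encryption.mode = "none";         # unencrypted, physical security assumed
    compression = "zstd,9";
    startAt = "hourly";
    # Matches the snapper retention policy.
    prune.keep = {
      hourly = 24;
      daily = 7;
      weekly = 4;
      monthly = 12;
    };
  };
}
```

The offsite job would differ mainly in `repo`, an encrypted `encryption.mode`, a daily `startAt`, and a `prune.keep` without the `hourly` entry.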
2025-10-06 20:59:26 +08:00
b0bfb37d3c
refactor(viridian): migrate service data to dedicated BTRFS subvolumes
Migrate from path-based persistence (/persist/var/lib/*) to dedicated
BTRFS subvolumes for better data isolation and snapshot capabilities.

- Move valuable user-facing services to /srv/* with srv-* subvolumes:
  - forgejo: git repositories and database
  - opengist: paste data
  - minecraft: game world data
  - lighttpd: static web content
  - containers: OCI container volumes

- Update home directory to use hm-sajenim subvolume on viridian disk
- Remove jupyterhub service (no longer in use)
- Update borgbackup paths to match new service locations
- Follow upstream service defaults where possible for maintainability

Services kept on /persist (disposable state):
- traefik, crowdsec, murmur
2025-10-06 13:07:46 +08:00
591346600f
refactor: centralize unfree package allowlists
Move all allowUnfreePredicate declarations to global configs to prevent
the "last definition wins" merging issue. Unfree packages are now managed
in two central locations:
- NixOS system packages: nixos/common/global/default.nix
- Home Manager packages: home-manager/sajenim/global/default.nix
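
A centralized allowlist of this kind is commonly written as a single predicate over package names. The sketch below shows that pattern for the NixOS side; the package names are placeholders, not the ones used in this repository:

```nix
{ lib, ... }:
{
  # Single definition, so no other module can silently replace it.
  nixpkgs.config.allowUnfreePredicate = pkg:
    builtins.elem (lib.getName pkg) [
      "example-unfree-package"        # hypothetical
      "another-unfree-package"        # hypothetical
    ];
}
```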
2025-10-01 10:23:20 +08:00
e5d1ba38d4
remove ollama service from fuchsia host
- Remove ollama service configuration and dependencies
- Clean up traefik routing for ollama web interface
- Comment out traefik service examples for clarity
2025-09-29 18:30:59 +08:00
969075a5de
refactor traefik + add open-webui service
2025-09-27 10:16:18 +08:00
bce8012209
chore: add all
2025-09-01 01:48:47 +08:00
7f5baabb23
remove project send
2025-09-01 01:47:13 +08:00
323820f797
fix: mariadb uses id 999 by default kinda suss
2025-08-07 22:26:18 +08:00
03a597ae6d
feat: setup projectsend docker containers
2025-08-07 21:41:33 +08:00
ffe0850ac9
backup opengist directory
2025-07-05 20:13:42 +08:00
961bfc2afb
setup opengist
2025-07-05 20:07:57 +08:00
2e635ce32f
update modpack
2025-06-15 08:48:29 +08:00
f4ac9c1753
chore: update server + refactor
2025-06-13 20:39:36 +08:00
23d1a07f26
fix: wrong port
2025-06-12 16:32:52 +08:00
af2fccb12f
update minecraft
2025-06-06 22:22:51 +08:00
ed9a836d2d
refactor
2025-06-06 18:35:13 +08:00
7b981cc126
setup irc network
2025-06-06 18:31:08 +08:00
f7fcccac4a
install murmur
2025-06-04 23:38:48 +08:00
18396e3ad4
remove allowlist
2025-05-30 15:46:35 +08:00
0e27c72344
setup jupyterhub
2025-05-05 08:12:27 +08:00
c38f58067a
opt in unfree
2025-04-30 12:18:32 +08:00
d611a670c5
chore: fix crowdsec
2025-04-03 19:03:15 +08:00
34c586aa9b
chore: update borg repo and passphrase
2025-03-26 13:34:07 +08:00
f26c63e3d8
chore: update backup directories
2025-03-23 23:27:00 +08:00
beb87db0bc
chore: migrate minecraft datadir
2025-03-23 23:16:43 +08:00
bb20d6c5f0
chore: remove redundant settings
2025-03-23 23:13:55 +08:00
8a66dfcaea
chore: remove unused services
2025-03-23 21:23:42 +08:00
e6b6325ba6
chore: refactor
2025-03-08 14:18:46 +08:00
822e6cdf9f
fix: Update NFS export IP address format
2025-02-18 21:56:49 +08:00
977fe7b608
bump multimedia tags
2025-02-16 11:58:25 +08:00
f0330126f9
migrate to github
2025-02-16 10:51:33 +08:00
2e37cefe3e
persist /var/private globally
2024-12-21 21:55:23 +08:00
579bf1a5db
migrate middlewares to entrypoint + refactor
2024-11-28 22:24:27 +08:00
205f85271b
enable whitelist for ipv4 ranges
2024-11-28 22:23:04 +08:00
12d1bd94a3
remove immich from borgbackups
2024-11-24 09:37:46 +08:00
3df22f9eb0
fix crowdsec/traefik
2024-11-24 09:36:36 +08:00
376627ba84
bump tags
2024-11-22 07:24:36 +08:00
1ecf47b006
migrate to 24.11
2024-11-22 07:17:23 +08:00