Currently, I run Unraid and have all of my services set up there as Docker containers. While this is nice and easy to set up initially, it has some major downsides:

  • It’s fragile. Unraid is prone to bugs/crashes with docker that take down my containers. It’s also not resilient so when things break I have to log in and fiddle.
  • It’s mutable. I can’t use any infrastructure-as-code tools like Terraform, and configuration sort of just exists in the UI. I can’t really roll back or recover easily.
  • It’s single-node. Everything is tied to my one big server that runs the NAS, but I’d rather have the NAS as a separate fairly low-power appliance and then have a separate machine to handle things like VMs and containers.

So I’m looking ahead and thinking about what the next iteration of my homelab will look like. While I like Unraid for the storage stuff, I’m a little tired of wrangling it into a container orchestrator and hypervisor, and I think this year I’ll split that job out to a dedicated machine. I’m comfortable with, and in fact prefer, IaC over fancy UIs, so I would love to be able to use Terraform or Pulumi or something like that. I would prefer something multi-node, as I want to be able to tie multiple machines together. And I want something that is fault-tolerant, as I host services for friends and family that currently require a lot of manual intervention to fix when they go down.

So the question is: how do you all do this? Kubernetes, docker-compose, HashiCorp Nomad? Do you run k3s, Harvester, or what? I’d love to get an idea of what people are doing and why, so I can get some ideas as to what I might do.

  • nico@r.dcotta.eu
    10 months ago

    I see no one else has mentioned my stack, so I suggest:

    Nomad for managing containers if you want something high availability. Essentially the same as k8s but much much much simpler to deploy, learn, and maintain. Perfect for homelabs imo. Most of the concepts of Nomad translate well to k8s if you do want to learn it later. It integrates really well with Terraform too if you are also hoping to learn that, but it’s not a requirement.

    NixOS for managing the bare metal. It’s a lot more work to learn than say, Debian, but it is just as stable, and all configuration will be defined as code, down to the bootloader config (no bash scripts!). This makes it super robust. You can also deploy it remotely. Once you grow beyond a handful of nodes it’s important to use a config management tool, and Nix has been by far my favourite so far.

    If you really want everything to be infra-as-code, you can manage cloud providers via Terraform too.

    For networking I use WireGuard, and configure it with NixOS. Specifically, I have a mesh network where every node can reach every other node without extra hops. This is a requirement if you don’t want a single point of failure (hub and spoke) to disconnect your entire cluster.
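
    As a rough sketch of what one node of such a mesh could look like in NixOS (not my exact config): every node lists every other node as a direct peer, so no traffic has to go through a hub. The `10.10.0.x` addresses, the key file path, the `node-b`/`node-c` hostnames and the placeholder public keys are all made up for illustration.

    ```nix
    { ... }: {
      # WireGuard listens on UDP; the port number here is an arbitrary choice
      networking.firewall.allowedUDPPorts = [ 51820 ];

      networking.wireguard.interfaces.wg0 = {
        ips = [ "10.10.0.1/24" ];                        # this node's address inside the mesh (assumed)
        listenPort = 51820;
        privateKeyFile = "/etc/wireguard/private.key";   # hypothetical path, keep the key out of the Nix store

        # One entry per other node: a full mesh means N-1 peers on each of N nodes
        peers = [
          {
            publicKey = "<public key of node-b>";        # placeholder
            allowedIPs = [ "10.10.0.2/32" ];
            endpoint = "node-b.example.com:51820";       # hypothetical hostname
            persistentKeepalive = 25;
          }
          {
            publicKey = "<public key of node-c>";        # placeholder
            allowedIPs = [ "10.10.0.3/32" ];
            endpoint = "node-c.example.com:51820";       # hypothetical hostname
            persistentKeepalive = 25;
          }
        ];
      };
    }
    ```

    Each of the other nodes would carry the mirror image of this, with its own address in `ips` and the remaining nodes in `peers`.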

    Everything in my setup is defined ‘as-code’, immutable, and multi-node (I have 7 machines) which seems to be what you want, from what you say in your post. I’ll leave my repo here, and I’m happy to answer questions!

    My opinions on the alternatives:

    Docker compose is great but doesn’t scale if you want high availability (i.e., have a container be rescheduled on node failure). If you don’t want high availability, anything more than Docker might be overkill.

    Ansible and Puppet are alright but are super stateful, and require scripting. If you want immutability you will love Nix/NixOS.

    k8s works (I use it at work) but is extremely hard to get right, even for well-resourced infra teams. Nomad achieves the same but with the learnings of having come afterwards, and without the history.

    • jkrtn@lemmy.ml
      10 months ago

      Could you give a quick example of using NixOS configuration to launch a machine or deploy something remotely? I’m just starting to move beyond a single machine at home. I’d really like to transition to infra as code.

      • nico@r.dcotta.eu
        10 months ago

        I recommend starting with ZeroToNix’s docs and then moving on to nixos.wiki, but here is a minimal, working example that I could deploy to a Hetzner VPS that only has nix and ssh installed:

        { config, pkgs, ... }: {
          # generated, this will set up partitions and bootloader in a separate file
          imports = [ ./hardware-configuration.nix ];
          zramSwap.enable = true;
          networking.hostName = "miki";
          # configures SSH daemon with a public key so we can ssh in again
          services.openssh.enable = true;
          users.users.root.openssh.authorizedKeys.keys = [ ''ssh-ed25519 AAAAC3NzaC1lNDI1NTE5AAAAIPJ7FM3wEuWoVuxRkWnh9PNEtG+HOcwcZIt6Qg/Y1jka'' ];
          # creates a timmy user with sudo access and wget installed
          users.users.timmy = {
            isNormalUser = true;
            extraGroups = [ "networkmanager" "wheel" "sudo" ];
            packages = with pkgs; [ wget ];
          };
          # open up SSH plus the HTTP/HTTPS ports nginx listens on
          networking.firewall.allowedTCPPorts = [ 22 80 443 ];
          # start nginx, assumes HTML is present at `/var/www`
          services.nginx = {
            enable = true;
            virtualHosts."default" = {
              forceSSL = true;            # Redirect HTTP clients to HTTPS (needs a certificate, e.g. via enableACME)
              default = true;             # Always use this host, no matter the host name
              root = "/var/www";          # Web root to serve; a string, so Nix does not copy the directory into the store
            };
          };
          system.stateVersion = "22.11";
        }
        

        This sets up a machine, configures the usual stuff like the ssh daemon, creates a user, and sets up an nginx server. To deploy it you would run nixos-rebuild --target-host root@10.0.0.1 switch. Other tools exist (I use colmena but the idea is the same). Note how easy it was to set up nginx! If I was setting Nomad up, I would just do services.nomad.enable = true.
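
        To make the Nomad remark a bit more concrete, here is a hedged sketch of a single node acting as both Nomad server and client. The datacenter name and the choice to run server and client on the same machine are assumptions for a small lab, not a recommendation for production.

        ```nix
        { ... }: {
          services.nomad = {
            enable = true;
            enableDocker = true;       # make Docker available to the Nomad agent
            dropPrivileges = false;    # the client side needs root to run the Docker driver

            # Rendered into Nomad's agent configuration file
            settings = {
              datacenter = "homelab";  # hypothetical name
              server = {
                enabled = true;
                bootstrap_expect = 1;  # a single server; use 3 or 5 for real fault tolerance
              };
              client.enabled = true;
            };
          };
        }
        ```

        With only one server there is no failover for the scheduler itself, so for high availability you would spread the server role over an odd number of machines.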

        As you can see some things you will have to learn (the nix language, what the configs are…) but I think it is worth it.
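
        For the colmena route, a minimal sketch of a `hive.nix` could look like the following. It assumes the configuration above is saved as `./configuration.nix` next to it and reuses the `10.0.0.1` address from the `nixos-rebuild` command; adjust both to your setup.

        ```nix
        # hive.nix
        {
          # Which nixpkgs to build every node against
          meta.nixpkgs = import <nixpkgs> { };

          # One attribute per machine; the attribute name is the node name
          miki = { ... }: {
            deployment.targetHost = "10.0.0.1";  # where to SSH to
            deployment.targetUser = "root";
            imports = [ ./configuration.nix ];   # the machine config shown earlier (assumed file name)
          };
        }
        ```

        Running `colmena apply` in the same directory builds and activates every node in the hive, and `colmena apply --on miki` limits it to one. Adding a second machine is just another attribute next to `miki`.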

  • sabreW4K3@lemmy.tf
    10 months ago

    I can’t remember what I was watching, but I remember something where they said Kubernetes is designed for something so large in scale that the only reason people have heard about it is that some product manager asked what Google uses and then demanded that they use it too, to replicate the success of Google. Hobbyists subsequently followed, and now a bunch of people are using stuff that’s poorly optimized for such small-scale systems.

  • monkeyman512@lemmy.world
    10 months ago

    I would stay away from Kubernetes (k8s/k3s). Unless you want to learn it for work purposes, it’s so overkill you can spend a month before you get things running. I know from experience. My current setup gives you options and has been reliable for me.

    NAS Box: TrueNAS Scale - You can have Unraid fill this role.

    Services Hosting: Proxmox - I can spin up any VMs I need, and there is lots of info online for things like hardware passthrough to VMs.

    Containers: Debian VM - Debian makes a great server environment as it’s stable and well supported. I just make this VM a Docker Swarm host. I manage things with Portainer for a web interface.

    I keep data on the NAS and have containers access it over the network, usually via an NFS share.

    • nopersonalspace@lemmy.worldOP
      10 months ago

      How do you manage your services on that, docker compose files? I’m really trying to get away from the workflow of clicking around in some UI to configure everything, only for it to glitch out and disappear, leaving me to try and remember what things to click to get it back. That was my main problem with Portainer and what caused me to move away from it (I have separate issues with docker-compose but that’s another thing).

      • khorak@lemmy.dbzer0.com
        10 months ago

        I personally stepped away from compose. You mentioned that you want a more declarative setup. Give Ansible a try. It is primarily for config management, but you can easily deploy containerized apps and correlate configs, hosts etc.

        I usually write roles for some more specialized setups like my HTTP reverse proxy, the arrs etc. Then I keep everything in my inventory and var files. I’m really happy and I really can tear things down and rebuild quickly. One thing to point out is that the compose module for Ansible is basically unusable. I use the docker container module instead. Works well so far and it keeps my containers running without restarting them unnecessarily.