Ron from Flox here, woke up to feed a brand new 3-day-old to see this here! On about 3 hours of sleep (over the last 48 hours) but excited to try and answer some questions! Feel free to also drop any below <3
We did just launch this last week after a good bit of work from the team. Steve wrote up a deeper technical dive here if anyone is interested - https://flox.dev/blog/kubernetes-uncontained-explained-unloc...
congrats on the little one, here’s to many wonderful moments.
online community love was not in my cards going into day 3 of a newborn but I'll take it + definitely needed! thank you!
I used to love both Kubernetes and Nix. But after a few years of using both, I felt the abstraction levels were a bit too deep.
Sure, it's easy to stand up a mail server in NixOS, or to just use docker/kubernetes to deploy stuff. But after a few years it felt like I don't have a single understanding of the stack. When shit hits the fan, it makes it very difficult to troubleshoot.
I am now back on running my servers on FreeBSD/OpenBSD and jails or VMM respectively. And also dumbing the stack down to just "run it in a jail, but set it up manually".
The only outlier is Immich. For some reason they only officially support the docker images but not a single clear instruction on how to set it up manually. Sure, I could look at the Dockerfiles, but many of the scripts also expect docker to be present.
And now that FreeBSD also has reproducible builds, that takes one more advantage away from Nix.
Going to sound weird, but with both my hats on I super appreciate this perspective. I can only speak to some areas of Nix and Flox, obviously, and I know folks are looking into doing this, to your point, a whole lot better. Zooming in way more on solving for those of us who just want to run it and fix it fast when it breaks.
Also, I think it's a huge ecosystem win that FreeBSD is pushing on reproducibility too. I think we are trending in a direction where this just becomes a critical principle for certain stacks. (It's also needed when you dive into AI stacks/infra...)
Yes, but I also think that the BSDs will be the last bastions in which you'll find any AI usage. And I for one am grateful for that.
I like it when my system comes with a complete set of manpages and good docs.
But you mentioned Flox, which I didn't even know about. First I thought that's what they renamed the Nix fork to after the schism, but now I see it's a paid product and yuck... it just further deepens my belief in going with more bare-bones manual control, even if it's sometimes bothersome.
Kubernetes can be a godsend at larger orgs.
We have six dev teams and are just about done with migrating to k8s. It's an immense improvement over what we had before.
It's a version of Greenspun's tenth rule: "Any sufficiently complicated distributed system contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Kubernetes."
I think six dev teams is small in terms of kube. I wouldn’t be surprised if that’s close to the perfect size to move onto kube and create and adopt a standard set of platform idioms.
At orgs significantly larger than that, the kube team has to aggressively spin out platform functions that enable further layering, or risk getting overwhelmed trying to support and configure kube features to cover diverse team needs (that is, storage software doesn't have the same needs or concerns as middleware or the frontend). This incubator model isn't easy in practice. Adopting kube at that scale is very challenging because it requires the kube team to spin up and spin out sub-teams at a very high rate, or risk slowing the migration to a crawl, or outright failure and purchasing something off the shelf (e.g. AWS) because teams need to offboard their previous platform.
When I worked on an enterprise data analytics platform, a big problem was Docker image growth. People were using different Python versions, different CUDA versions, all kinds of libraries. With CUDA being over a gigabyte, this all explodes.
The solution is to decompose the Docker images and make sure that every layer is hash-equivalent, so if people update their CUDA version, it results in a change only within the CUDA layer, not the Python layers.
But it looks like Flox now simplifies this via Nix. Every Nix package already has a hash and you can combine packages however you would like.
Yes, this hits the nail on the head. We’ve seen the same explosion in image size and rebuild complexity, especially with AI/ML workloads where Python + CUDA + random pip wheels + system libs = image bloat and massive rebuilds.
With the Kubernetes shim, you can run the hash-pinned environments without building or pulling an image at all. It starts the pod with a stub, then activates the exact runtime from a node-local store.
I was an early and enthusiastic adopter of docker. I really liked how it would let me use layers to keep track of dependency between files.
After spending a few years using Nix, the Docker image situation looks pretty bonkers. If two files end up in separate layers, the system assumes a dependency, so if the lower file changes you need to build a separate copy of the higher one just in case there's an actual dependency there.
Within Nix you can be more precise about what depends on what, which is nice, but you do have to be thoughtful about it or you can summon the same footgun that got you with Docker, just in smaller form. A Nix derivation, while a box with nicely labeled inputs and outputs, is still a black box: if you add a readme as an input to a derivation that does a build, Nix will assume that the compiled binary depends on it, and when you fix a typo in the readme and rebuild, you'll end up with a duplicate binary build in the Nix store despite the contents of the binary not actually depending on the text of the readme.
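The readme footgun can be modeled with a toy content-addressed store in a few lines of Python (a sketch of the principle only, not of how Nix actually hashes derivations):

```python
import hashlib

def store_path(inputs: dict) -> str:
    """Toy model: the output path is keyed by a hash over *all* declared
    inputs, whether or not the build actually reads them -- just like a
    derivation treats its inputs."""
    h = hashlib.sha256()
    for name in sorted(inputs):
        h.update(name.encode() + b"\0" + inputs[name] + b"\0")
    return "/nix/store/" + h.hexdigest()[:12] + "-app"

v1 = store_path({"main.c": b"int main(){return 0;}", "README": b"Helo"})
v2 = store_path({"main.c": b"int main(){return 0;}", "README": b"Hello"})

# Fixing a typo in the README yields a new store path -- and therefore a
# duplicate binary build -- even though the compiled output is unchanged.
assert v1 != v2
```

The fix in practice is to keep incidental files out of the build derivation's inputs (or split them into their own derivation) so the binary's hash only depends on things that actually affect it.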
> you can combine packages however you would like
So this is true, more or less, but be aware that while nix lets you do this in ways that don't force needless duplication, it doesn't force you to avoid that duplication. Things carelessly packaged with nix can easily recreate the problem you mentioned with docker.
The problem is that whiteouts are not commutative. If the layers you build turn out to be bit-for-bit identical, the layers will be shared anyway, but it's much more complex than Nix, where the composition operation is commutative.
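The commutativity point can be made concrete with a toy model: OCI-style layer application with whiteouts is order-sensitive, while a union of disjoint Nix store paths is not (an illustrative sketch with hypothetical paths):

```python
def apply_layer(fs: dict, layer: dict) -> dict:
    """OCI-style layer application: a layer maps paths to file content,
    with None acting as a whiteout that deletes the path below."""
    out = dict(fs)
    for path, content in layer.items():
        if content is None:
            out.pop(path, None)
        else:
            out[path] = content
    return out

add = {"/etc/conf": "v1"}    # layer that creates a file
wipe = {"/etc/conf": None}   # layer with a whiteout for the same file

# Order matters once whiteouts are involved: add-then-wipe deletes the
# file, wipe-then-add keeps it.
assert apply_layer(apply_layer({}, add), wipe) == {}
assert apply_layer(apply_layer({}, wipe), add) == {"/etc/conf": "v1"}

# Nix-style composition: store paths are disjoint, so plain union is
# commutative and the result is order-independent.
glibc = {"/nix/store/aaa-glibc/lib/libc.so": "..."}
python = {"/nix/store/bbb-python/bin/python": "..."}
assert {**glibc, **python} == {**python, **glibc}
```

Because store paths never collide, a Nix closure is just a set union, which is why store paths can be shared freely where image layers cannot.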
Yes, there were various attempts to do this in the container ecosystem, but there is a hard limit on the number of layers in Docker images (because there are hard limits on overlay mounts; you don't really need to overlay all the Nix store mounts, of course, as they have different paths, but the code is for the general case). So there were various ways of bundling sets of packages into layers, but managing it directly through the Nix store is much simpler.
https://github.com/pdtpartners/nix-snapshotter/blob/main/doc...
What constraints/coordination exists with this, in terms of host driver support? What enforces that Nix does not attempt to use a newer cuda toolkit on a host with an older cuda driver?
Too bad this isn't open source, I'm 3/4ths of the way through building pretty much this exact product in order to support my actual products.
Is it not GPL?
The license file in their github seems to indicate that it is. https://github.com/flox/flox?tab=GPL-2.0-1-ov-file
Cool if so, I didn't see it prominently linked or mentioned on the landing page. Maintainers: being open source is a big feature, mention it prominently and have your repo links front and center.
How does this differ from the tooling that lets you build containers from nix?
Jotting down a few quick thoughts here, but we can totally go deep. This is something Michael Brantley started working on a few months ago to test out how to make it super easy to use and leverage the existing Nix & Flox architecture. One of the core differences, from my quick perspective, is that it specifically leverages the unique way that Flox environments are rendered without performing a Nix evaluation, making it safe and optimally performant for the k8s node to realize the packages directly on the node, outside of a container.
I read this a few times but there's no info.
Wrong. If you know nix then you know "leverages the unique way that Flox environments are rendered without performing a nix evaluation" is a very significant statement.
> leverages the unique way that Flox environments are rendered without performing a nix evaluation
I'm curious! and ignorant! help!
Is that via (centrally?) cached eval? or what? there's only so much room for magic in this arena.
seems similar to this
https://github.com/pdtpartners/nix-snapshotter
So, kind of: pulling images from the Nix store, mounting a shared host Nix store per node into each container, fast incremental rebuilds, and generating basic pod configs are good things.
And local, CI, and remote runs share the same flows and envs.
There was also Nixery paving the way
Jeremy from Flox here. I want to chime in so Ron can be with his family, even though he will no doubt be right back on here:
Re: Relationship to nix-snapshotter and prior art
This is original work, though very much built on prior innovations. Our approach hooks into the upstream containerd runc shim to pull the FloxHub-managed environment and bind-mount the closure at startup. The key distinction is that we use how Flox environments are rendered to avoid Nix evaluation entirely, making it safe and fast for a k8s node to realize packages directly on the node. It's less about images and containers, per se, and more about bringing the power of Flox and Nix from the build-time end of the SDLC to the runtime end.
The cache story is surprisingly strong: nix store paths effectively behave like layers in the node’s registry, but with dramatically higher hit rates -- often across entirely unrelated pod deployments. Because all pods rely on the same underlying system libraries drawn from the “quantized” Flox catalog, different environments naturally share glibc, core utilities, and common dependencies, where traditional containers typically share nothing.
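A toy model of why hit rates run higher than with image layers: the node's store behaves like a set of paths, so unrelated environments share whatever base dependencies they have in common, whereas image layers are shared only when the entire ancestor chain matches (illustrative package names, not the real catalog):

```python
# Per-node cache modeled as a set of store paths.
env_a = {"glibc-2.40", "coreutils-9.5", "python-3.12", "numpy-2.1"}
env_b = {"glibc-2.40", "coreutils-9.5", "nodejs-22"}

node_store = set(env_a)          # env A is already realized on the node
to_fetch = env_b - node_store    # now deploying unrelated env B

# Only the genuinely new path is fetched; glibc and coreutils are cache
# hits even though the two environments have nothing else in common.
assert to_fetch == {"nodejs-22"}
```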
Tools like nix-snapshotter, Nixery, and others have pioneered this space and we're grateful for that work. This rising "post-Docker" tide raises all ships.
Re: Open Source
The software is brand new -- only slightly older than Ron’s baby -- and currently in alpha. KubeCon was our first opportunity for broad feedback, and we uncovered a few issues we’re still addressing. Our intent is to open-source the project once we’ve fully vetted the approach, ideally in the coming weeks.
Yes, we launched early and the product is imperfect, but we’re doing so transparently and with a commitment to getting it right and releasing it to the community; we will continue to release early and often.
Re: Abstraction depth concerns
I appreciate @rootnod3’s point about deeper abstractions complicating debugging. We’re thinking hard about how to keep things simple for people who need to run and fix systems quickly. It’s encouraging to see the broader ecosystem—like FreeBSD—lean further into reproducibility, especially as AI-centric stacks make this increasingly important.
Re: Nix vs traditional approaches
Skilled Dockerfile authors can achieve great caching results -- and you can pin, and you can prune registries, etc. -- but our goal is to make these best practices the default. Nix enables finer-grained caching and a universal packaging format for building and consuming open source software.
We see intrinsic value in Flox environments -- whether on the CLI, k8s, Nomad down the road, or other platforms. Our aim is for Flox environments to be as universal and natural as Nix packages themselves -- essentially extending “flox activate” into the k8s world.
We likewise got a ton of valuable feedback at KubeCon, most of which was validating, and all of which was very much in line with this conversation.
So, nix-snapshotter? Also, Flox going all in on "environments" seems like such a choice. I'm sure that Flox is not encouraging shipping a binary-in-a-devshell to Prod, so it seems an interesting branding decision.
It's hard for me to understand whether I should be excited about this. I think companies do themselves a huge disservice by not being transparent with the nerds who WILL be the ones helping choose/implement these things. Instead of the current feeling I have, there could be three sentences that explain what Flox is offering here beyond what *anyone* can go do right now with nix-snapshotter.
If it's ecosystem stuff (you get Flox's CI, or CLI, or whatever else), that's not very well sold to me on the landing page. Otherwise I'm feeling left empty-handed.
Totally valid - we buried the lede here. Quick version:
Not nix-snapshotter, because we skip Nix eval entirely and get way better cache sharing across unrelated workloads (the quantized catalog means everything shares base deps). On "environments": these aren't devshells-as-prod, they're the actual runtime; the same 'flox activate' works everywhere. You're shipping a declarative, hash-pinned runtime that happens to also work great in dev/CI.
And yeah, we should have been upfront that this is alpha and we're planning to open source it after vetting at KubeCon.
You're right that we're doing ourselves a disservice not being transparent with the technical crowd. What specific technical details would help you evaluate this?
You can say what you want about Kube (it's a bit of a necessary evil for the people that need it), but keep Nix's name out yo damn mouth. It's for real.