On-demand Ollama in the homelab on tomleb's blog
Here’s how I run Ollama on demand in my homelab without sending my electricity bill through the roof.
High-level overview
The core of the solution lies in configuring a reverse proxy to handle requests to Ollama. If Ollama is not reachable, a special handler kicks in and forwards the request to ollama-wake-on-lan (referred to as the WoL service in the rest of this post).
The WoL service attempts to wake up the Ollama host and then redirects the original request back to Ollama. To achieve this, the WoL service holds the request open while it sends Wake-on-LAN packets to wake up the host.
Implementation
I am using Caddy as a reverse proxy. Here’s an example Caddyfile configuration that implements the above solution.
*.example.tld {
    @ollama host ollama.example.tld ollama2.example.tld
    handle @ollama {
        reverse_proxy http://<ollama IP>:11434
    }
This first part simply configures Caddy to act as a reverse proxy for requests coming with a Host header matching either ollama.example.tld or ollama2.example.tld. The latter hostname is used as the redirect target in the ollama-wake-on-lan service, as browsers may not automatically follow redirects to the original hostname.
The configuration continues like so:
    handle_errors {
        @ollama_main host ollama.example.tld
        handle @ollama_main {
            reverse_proxy http://<ollama-wake-on-lan ip>:4000
        }
    }
}
This configures a special handler for when Caddy fails to reach the Ollama upstream configured in the first snippet. In that case, Caddy will reverse proxy to the WoL service, but only if the Host header matches ollama.example.tld. This prevents a loop: the WoL service redirects to ollama2.example.tld, and a failed request on that hostname is not sent to the WoL service again.
The only thing left is running the WoL service.
$ go install git.sr.ht/~tomleb/ollama-wake-on-lan@master
$ ollama-wake-on-lan -broadcast <broadcast address> \
-mac <mac address> \
-url <ollama2 url>
That’s it! Now, if your Ollama host is down and you try to use its API, Caddy will proxy the request to the WoL service, which will wake up the host and then redirect the client to Ollama again.
In my case, I run open-webui, and simply accessing it triggers the above solution, making it even more seamless from a user's point of view.
Contribute to the discussion in my public inbox by sending an email to ~tomleb/public-inbox@lists.sr.ht.