More auto-configuration in Mollymawk
2026-02-13
In a previous article we described our work on automatic configuration of MirageOS unikernels. Since then we have continued working on this, and we now have automatic configuration of the log server, the metrics sink and certificates provisioned through the ACME (Let's Encrypt) DNS-01 challenge. The same mechanism can be extended to configure things that we would otherwise use boot parameters for.
Passing the options through DHCP has a number of advantages. Options that are the same for many unikernels don't need to be repeated for each unikernel. The very obvious example is DNS servers - something that hitherto could not be configured from DHCP in Mirage(!) Another example: often the same InfluxDB sink is used for all unikernels on the same network. And for certificates we avoid each unikernel having to do ACME (Let's Encrypt) challenges and deal with account private keys or DNS TSIG keys. Read on for more details on how it works and the challenges we encountered.
Cutting across the layers
The challenge with these "options" is that they have less to do with the IPv4 stack and are really only relevant higher up in the stack. I say challenge because the Mirage tool and the network libraries are designed in a layered manner similar to the OSI model. In practice the layers are not always as neatly cut. One such example is DHCP. More examples are listed in this issue on Mirage.
The Mirage TCP/IP stack
All network traffic in Mirage starts at the network device layer.
This device has an interface whose main points of interest are the functions write and listen.
Calling listen installs your callback which then receives a buffer each time an ethernet frame arrives.
Inside Tcpip_stack_direct we have all the applied functors of the different layers of the stack and construct a callback that ensures each bit of the ethernet frame arrives at the relevant layer - and it is here that listen is called on the network device.
So far so good.
One issue arises with this design.
For DHCP we need to send and receive some packets over the network before we get IPv4 configuration.
Previously, we did this by installing a listener on the network device waiting for a DHCP lease and then cancelling the listener.
In practice you can only ever have one listener installed at a time, so we needed to uninstall the listener before Tcpip_stack_direct calls listen.
Already here we had an issue.
When we needed to renew the lease we could send out the request, but we were no longer listening for replies, and Tcpip_stack_direct didn't know what to do with incoming DHCP packets!
To work around this we made a Dhcp_ipv4 module that takes a network device as input, does DHCP and then pretends to be a network device - with a listen function that filters DHCP packets instead of passing them on to the supplied listener.
With this change it becomes transparent to Tcpip_stack_direct that we do DHCP.
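The wrapping idea can be sketched in a few lines. This is a simplified, synchronous model and not the actual Dhcp_ipv4 code: the real Mirage network interface is Lwt-based and works on buffers, and the names NET, Dhcp_filter, Fake_net, is_dhcp and on_dhcp are made up for this illustration.

```ocaml
(* Simplified device interface; an assumption for this sketch only. *)
module type NET = sig
  type t
  val write : t -> string -> unit
  val listen : t -> (string -> unit) -> unit
end

(* Hypothetical wrapper: it looks like a NET to the layer above,
   but keeps DHCP frames for itself instead of passing them on. *)
module Dhcp_filter (N : NET) = struct
  type t = { net : N.t; is_dhcp : string -> bool; on_dhcp : string -> unit }

  let connect ~is_dhcp ~on_dhcp net = { net; is_dhcp; on_dhcp }

  let write t frame = N.write t.net frame

  (* The only listener installed on the underlying device is this one. *)
  let listen t callback =
    N.listen t.net (fun frame ->
        if t.is_dhcp frame then t.on_dhcp frame else callback frame)
end

(* In-memory fake device, used only to exercise the wrapper. *)
module Fake_net = struct
  type t = {
    mutable listener : (string -> unit) option;
    mutable sent : string list;
  }
  let create () = { listener = None; sent = [] }
  let write t frame = t.sent <- frame :: t.sent
  let listen t cb = t.listener <- Some cb
  let inject t frame =
    match t.listener with Some cb -> cb frame | None -> ()
end

module Filtered = Dhcp_filter (Fake_net)
```

Because the wrapper has the same interface as the device it wraps, the layer above calls listen exactly once and never sees a DHCP frame, while lease renewals keep receiving their replies.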
How do we get the lease options?
The next question is how we get the lease options relevant for configuring application-layer things such as the log server. This means we somehow need to get the DHCP options "higher up", process them and pass them on to e.g. the syslog task. To do that we added a function:
val lease : t -> Dhcp_wire.dhcp_option list option Lwt.t
But not only that!
We also need to tell Dhcp_ipv4 what DHCP options we are interested in!
In functional programming these cyclic dependencies can be a bit challenging.
To solve this a new type Dhcp_requests.t is introduced to the Mirage tool.
Essentially, it's a mutable set of DHCP option codes - except they are represented as integers in order to not add a dependency on the DHCP library just for that.
With this we can pass a Dhcp_requests.t value to Mirage.ipv4_of_dhcp and then later pass it, together with a promise of the DHCP options, to Mirage.generic_dns_client or Mirage.syslog so they can insert the relevant DHCP option code.
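A minimal sketch of such a mutable set of option codes is shown here. The Requests module and its functions are hypothetical stand-ins, not the actual Dhcp_requests API; the option codes 6 (DNS servers) and 7 (log server) are the standard ones from RFC 2132.

```ocaml
module Int_set = Set.Make (Int)

(* Hypothetical stand-in for Dhcp_requests.t: a mutable set of option
   codes, kept as plain integers so no DHCP library is needed. *)
module Requests = struct
  type t = Int_set.t ref
  let create () : t = ref Int_set.empty
  let add (t : t) code = t := Int_set.add code !t
  let to_list (t : t) = Int_set.elements !t
end

(* Standard DHCP option codes (RFC 2132). *)
let dns_servers = 6
let log_server = 7

(* One shared value is created first and handed to every device. *)
let requests = Requests.create ()

(* Each interested device registers the code it cares about. *)
let () = Requests.add requests log_server   (* e.g. the syslog device *)
let () = Requests.add requests dns_servers  (* e.g. the DNS client device *)
let () = Requests.add requests dns_servers  (* duplicates are harmless *)

(* The DHCP client reads the set when it builds its request. *)
let parameter_request_list = Requests.to_list requests
```

The mutation is what sidesteps the cycle: the DHCP device only needs the shared value to exist when it is constructed, and the consumers fill it in before the request is actually sent.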
I am simplifying a bit here. In the Mirage tool we have an embedded domain-specific language (DSL) called Functoria[1]. Essentially, this is a language for constructing "devices" and describing how a unikernel is started. Or in other words, it's a DSL for generating code applying functors and passing values to functions. Passing around values representing DHCP options and DHCP option codes is stretching a bit what the language is designed for. As such it's been a big challenge to figure out how to do that - especially as we usually only have to write a little Functoria code. In fact, we at Robur are working on getting rid of the Mirage tool, also for other reasons. More to come on that in another blog post.
With all this we can configure DNS, syslog and metrics sink using DHCP!
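On the consuming side, a device such as syslog then has to pick its option out of the lease. The sketch below leaves out Lwt and uses a small local dhcp_option type as a stand-in for Dhcp_wire.dhcp_option, with addresses as plain strings; the helper first_log_server is hypothetical, and treating None as "no lease" is an assumption.

```ocaml
(* Local stand-in for a few DHCP options; the real type lives in
   Dhcp_wire and carries proper IP address values. *)
type dhcp_option =
  | Dns_servers of string list
  | Log_servers of string list
  | Other of int

(* Same shape as the result of [lease], minus the Lwt promise.
   Assumption: None means no lease is available. *)
let first_log_server (lease : dhcp_option list option) : string option =
  match lease with
  | None -> None
  | Some options ->
    List.find_map
      (function Log_servers (ip :: _) -> Some ip | _ -> None)
      options
```

A caller would fall back to a boot parameter or disable remote logging when the result is None.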
Provisioning certificates
We already implemented updating DNS A-records when requested by DHCP clients in DNSvizor (see the previous article). Wouldn't it be nice if we could provision an X.509 certificate with just a DHCP request? We've worked hard to implement this, too! It works by DNSvizor working with a dns-letsencrypt-secondary that does the DNS-01 challenge and puts the resulting certificate in a TLSA record.
Here we had a new challenge.
Not only do we need to say what DHCP option code we are interested in - we also need to send the DHCP server the certificate signing request (CSR) so it can notify the dns-letsencrypt-secondary!
This was done using vendor-identifying vendor options (RFC 3925), a mechanism for vendors to extend DHCP with custom options.
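For readers unfamiliar with RFC 3925, here is a rough sketch of the wire layout of its vendor-specific information option (option 125): a 4-byte enterprise number, a 1-byte length, and then code/length/data sub-options. It is not the charrua implementation, and whether option 125 is the one used here is an assumption. The enterprise number 32473 is the one reserved for documentation, and sub-option code 1 and the payload are placeholders.

```ocaml
(* One encapsulated sub-option: 1-byte code, 1-byte length, data. *)
let sub_option ~code data =
  if code < 0 || code > 255 || String.length data > 255 then
    invalid_arg "sub_option: code or data out of range";
  let b = Buffer.create (2 + String.length data) in
  Buffer.add_uint8 b code;
  Buffer.add_uint8 b (String.length data);
  Buffer.add_string b data;
  Buffer.contents b

(* One vendor block: 4-byte enterprise number (big endian),
   1-byte data length, then the concatenated sub-options. *)
let vendor_block ~enterprise sub_options =
  let data = String.concat "" sub_options in
  if String.length data > 255 then invalid_arg "vendor_block: too long";
  let b = Buffer.create (5 + String.length data) in
  Buffer.add_int32_be b enterprise;
  Buffer.add_uint8 b (String.length data);
  Buffer.add_string b data;
  Buffer.contents b

(* The complete option: code 125, 1-byte length, vendor blocks. *)
let option_125 blocks =
  let payload = String.concat "" blocks in
  if String.length payload > 255 then invalid_arg "option_125: too long";
  let b = Buffer.create (2 + String.length payload) in
  Buffer.add_uint8 b 125;
  Buffer.add_uint8 b (String.length payload);
  Buffer.add_string b payload;
  Buffer.contents b

(* Placeholder values: documentation enterprise number, made-up code. *)
let encoded =
  option_125 [ vendor_block ~enterprise:32473l [ sub_option ~code:1 "csr" ] ]
```

The single-byte length fields are why the size of the CSR matters: whatever goes into a sub-option has to fit in 255 bytes, minus the surrounding headers.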
On the client side I opted against fighting with Functoria again, and in the sample applied the relevant functors by hand and wrote or copied the otherwise usually generated code.
Otherwise, I would have had to create an X.509 private key device, a CSR device and a certificate device and glue it all together.
Check out the Unikernel'.ml file in this unipi PR to see the code that is usually generated by the Mirage tool.
I took this shortcut because the task was still not done after four months, although I had estimated it would take only one.
To make the CSR fit in the DHCP request I opted for an Ed25519 key at first as it's short - only to find out that it's not supported by Let's Encrypt. Thankfully, P256 keys are supported and also short. I opted not to return the certificate directly, and instead tell the client which name server to ask, and for which domain name, to get the TLSA record. This was done partly because certificate chains easily become too large for a DHCP message, and partly because DHCP clients tend to be really impatient and can't wait for a DNS-01 challenge to be solved.
Initially, I only sent the client the domain name and let the client use either the DNS resolvers given by DHCP or the default resolver. However, this turned out to be a bad idea as the client could easily ask for the TLSA record before the challenge was solved. This is a problem because the Mirage DNS client caches replies, and even if it is configured not to, resolvers usually cache replies. With a TTL of 3600 seconds we would easily end up in a situation where getting the certificate would take an hour!
Relevant PRs:
- mirage/charrua#142: dhcp rework incl. exposing lease dhcp options
- mirage/mirage#1628: Add dhcp ipv4, leases etc. to the Mirage tool
- mirage/mirage-skeleton#418: Adapt the mirage-skeleton example unikernels to use the new DHCP features - in particular DNS servers.
- robur-coop/traceroute#7: Example unikernel with syslog using log server from DHCP. Note that config.ml is overly complex due to it using a special implementation of ICMPv4, and Mirage.direct_stackv4v6 doesn't allow plugging in a different ICMPv4 implementation.
- mirage/charrua#144: add RFC 3925 features. This is what we use for options or arguments that don't fit into the standard DHCP options such as Influx metrics sink or certificate requests. It can be used for arbitrary options within the data size limits of DHCP.
- mirage/charrua#145, #147, #148: fix mistakes in #144