Discoveries from implementing MirageVPN, an OpenVPN™-compatible VPN library

Reynir Björnsson, reynir@reynir.dk

Who am I?

  • Software developer at Robur, a small collective working on Mirage applications in OCaml.
  • Previously also a system administrator
  • BSc in computer science from Aarhus University
  • Not a security professional as such

MirageVPN

https://github.com/robur-coop/miragevpn

  • Open source (BSD 2-clause) licensed reimplementation of (parts of) the OpenVPN™ protocol
  • Started in 2019
  • Written in OCaml, a performant, statically typed functional programming language, with Mirage OS in mind as a target
  • First funded in 2019 by Prototype Fund, then in 2023 by NGI Assure via NLnet
  • Clients and servers for *nix and Mirage unikernels, and client unikernel for Qubes OS

OpenVPN™

  • Virtual private network (VPN) software
  • More than two decades old
  • Written in C
  • Available on many platforms and embedded in router firmware

Mirage OS (or why do this at all?!)

  • Mirage OS is a library operating system for writing applications that are single purpose operating systems (unikernels)
  • Written in OCaml
  • Same code can target Xen, KVM, bhyve, Linux process with seccomp sandboxing, or a regular *nix process
  • Goes very well with Qubes OS
    • A highly compartmentalized Linux/Xen distribution
    • MirageVPN uses tens of MB of memory where a Linux-based implementation tends to use hundreds of MB
  • VPNs are useful!

Challenges reimplementing the OpenVPN protocol

  • Software and protocol have evolved a lot over the decades
  • OpenVPN™ has a plethora of modes of operation and hundreds of configuration directives
  • Very little documentation on the protocol. What exists is mostly scattered around the source tree, plus the new and very incomplete OpenVPN RFC. MirageVPN has pushed documentation efforts forward through me annoyingly asking many questions in #openvpn-devel
  • Long history of backwards compatibility and legacy

The main mode that we are interested in is the TLS mode. The TLS mode uses a control channel and a data channel multiplexed over UDP or TCP. The data channel is used for tunneling packets. The control channel uses TLS and is used for authenticating, negotiating the protocol and parameters, and sending control messages.

A control message is a text string terminated by a NUL byte sent over the control channel in a single TLS record frame.

EXIT control channel message

The EXIT message signals to the peer that the sender is exiting. This is especially useful when using UDP, so the peer doesn't have to do retransmissions and wait for timeouts. It can carry an optional message.

"EXIT\000"
"EXIT,I quit!\000"

Receiving it schedules a five-second timer, after which the connection is torn down.
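A minimal sketch of recognizing the two EXIT forms shown above. The helper name parse_exit is hypothetical and this is not OpenVPN's or MirageVPN's parser; it assumes the message has already been extracted as a NUL-terminated string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from OpenVPN or MirageVPN.
 * msg is a NUL-terminated control message.
 * Returns true if msg is an EXIT message. *reason is then set to the
 * optional text after the comma, or to NULL when there is none. */
static bool parse_exit(const char *msg, const char **reason)
{
    static const char prefix[] = "EXIT";
    const size_t n = sizeof prefix - 1;

    assert(msg != NULL && reason != NULL);

    if (strncmp(msg, prefix, n) != 0)
        return false;
    if (msg[n] == '\0') {          /* "EXIT\000" */
        *reason = NULL;
        return true;
    }
    if (msg[n] == ',') {           /* "EXIT,I quit!\000" */
        *reason = msg + n + 1;
        return true;
    }
    return false;                  /* e.g. "EXITING" is something else */
}
```

This sketch is deliberately strict about what follows the EXIT keyword; whether a real implementation is equally strict is not something the sketch claims.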

Intermezzo

The same (exit) timer is (re)scheduled by EXIT control channel messages and other conditions. So we can postpone exits by sending an EXIT message!

One code path that also uses the same exit timer is the client-kill management command.

The management interface is a text-based protocol over a socket (UNIX, TCP). It is used to control the OpenVPN process. According to the documentation, client-kill immediately kills a client instance.

A rogue employee is doing bad stuff on the company network. The admin notices in time and kicks the client off the VPN. Unfortunately, the rogue employee is well prepared and has his client spam EXIT messages every second. Each one reschedules the exit timer, so he evades the policy enforcement and stays connected.

This bug was assigned CVE-2024-28882. It was fixed (by me 🥳) by checking if the timer was already scheduled before scheduling the exit timer. The bug was ~20 years old!
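The difference between the buggy and the fixed behaviour can be modelled in a few lines. This is a simplified sketch under my own assumptions: struct conn, its fields and both function names are hypothetical and do not mirror OpenVPN's data structures or the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified model of a per-client exit timer. */
struct conn {
    bool   exit_scheduled;   /* has a teardown already been scheduled? */
    time_t exit_deadline;    /* point in time when the client is dropped */
};

enum { EXIT_DELAY_SECONDS = 5 };

/* Buggy variant: every trigger (EXIT message, client-kill, ...) moves
 * the deadline further into the future. */
static void schedule_exit_buggy(struct conn *c, time_t now)
{
    assert(c != NULL);
    c->exit_scheduled = true;
    c->exit_deadline = now + EXIT_DELAY_SECONDS;
}

/* Fixed variant: an exit that is already scheduled is left untouched. */
static void schedule_exit_fixed(struct conn *c, time_t now)
{
    assert(c != NULL);
    if (c->exit_scheduled)
        return;
    c->exit_scheduled = true;
    c->exit_deadline = now + EXIT_DELAY_SECONDS;
}
```

With the buggy variant a trigger at t=100 followed by one at t=104 moves the deadline from 105 to 109; with the fixed variant it stays at 105.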

Next up we have the PUSH_REQUEST and PUSH_REPLY control messages.

  • Request-response protocol
  • Used for sending configuration options from the server to the client
  • Options include the client IP address, routes and even the data channel cipher

The syntax is:

"PUSH_REPLY,%s\000" options_csv

In other words it's PUSH_REPLY followed by a comma, a comma separated list of configuration directives and finally a terminating NUL byte.
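As a sketch, building such a message could look like this. The helper format_push_reply is hypothetical (not an OpenVPN function), and the directives used in the usage note are only illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "PUSH_REPLY," followed by options_csv and a
 * terminating NUL byte into out.
 * Returns the number of bytes of the message including the NUL byte,
 * or 0 if the message does not fit into cap bytes. */
static size_t format_push_reply(char *out, size_t cap, const char *options_csv)
{
    int n;

    assert(out != NULL && options_csv != NULL);

    n = snprintf(out, cap, "PUSH_REPLY,%s", options_csv);
    if (n < 0 || (size_t)n >= cap)
        return 0;               /* error or truncated output */
    return (size_t)n + 1;       /* the NUL byte is part of the message */
}
```

For example, format_push_reply(buf, sizeof buf, "ping 10,ping-restart 60") produces a 35-byte message whose last byte is NUL.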

Configuration language

OpenVPN uses a surprisingly deceptive configuration language. On the surface it looks simple, but if you hit corner cases, or look into the implementation and try to figure out its exact syntax, things quickly get hairy.

# This is a comment
; this one, too
option argument1 "argument two"
another-option 'first "argument"' "second \"argument\""

For the purpose of this talk most of the quirks are not of interest (catch me offline if you want to hear more).

Backslash is used for escaping characters in double-quoted and unquoted parameters. This also includes the line feed at the end of the line for unquoted parameters: the line feed becomes part of the parameter, but interestingly the configuration directive does not continue on the next line! With this we can use a line feed as the last character of the last parameter (there is otherwise no \n or similar escape sequence for line feeds).

As already alluded to, the configuration language is (for the most part) line based. A configuration directive starts and ends on the same line. Except it's a bit more complicated than that! The first step in parsing is to tokenize the configuration into "lines".

  • For configuration files fgets(3) is used. ⭐
  • For PUSH_REPLY it is split on commas. ⭐

The "line" tokenization done for PUSH_REPLY messages is done using OpenVPN's buf_parse().

bool buf_parse(struct buffer *buf, const int delim, char *line, const int size)
{
    bool eol = false; int n = 0, c;
    do
    {
        c = buf_read_u8(buf);
        if (c < 0)
            eol = true;
        if (c <= 0 || c == delim)
            c = 0;
        if (n >= size)
            break;
        line[n++] = c;
    } while (c);
    line[size-1] = '\0';
    return !(eol && !strlen(line));
}

In essence, buf_parse() "reads" from the buffer and copies over to line until it encounters delim, a NUL byte or the end of the buffer. In any case line is always NUL terminated. What is interesting is this effectively means a NUL byte is also considered a delimiter.
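To make the "NUL is also a delimiter" point concrete, here is a simplified stand-in for the splitting behaviour. struct cursor and next_token are hypothetical names, and unlike buf_parse() this sketch silently drops characters of over-long tokens; it only aims to show where tokens end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor over a byte buffer with an explicit length. */
struct cursor {
    const unsigned char *p;
    size_t len;
};

/* Copies the next token into line, which is always NUL terminated.
 * A token ends at delim, at a NUL byte, or at the end of the buffer.
 * Returns 0 once the buffer is exhausted, 1 otherwise. */
static int next_token(struct cursor *c, int delim, char *line, size_t size)
{
    size_t n = 0;

    assert(c != NULL && line != NULL);

    if (c->len == 0 || size == 0)
        return 0;
    while (c->len > 0) {
        unsigned char ch = *c->p++;
        c->len--;
        if (ch == 0 || ch == delim)
            break;                  /* NUL terminates a token, too */
        if (n + 1 < size)
            line[n++] = (char)ch;
    }
    line[n] = '\0';
    return 1;
}
```

Splitting the five bytes 'a' ',' 'b' NUL 'c' on commas yields three tokens: "a", "b" and "c". The byte after the NUL is not discarded.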

/*
 * Handle incoming configuration
 * messages on the control channel.
 */
void check_incoming_control_channel(struct context *c, struct buffer buf)
{
    /* force null termination of message */
    buf_null_terminate(&buf);

    /* enforce character class restrictions */
    string_mod(BSTR(&buf), CC_PRINT, CC_CRLF, 0);

    if (buf_string_match_head_str(&buf, "AUTH_FAILED"))
        receive_auth_failed(c, &buf);
    else if (buf_string_match_head_str(&buf, "PUSH_"))
        incoming_push_message(c, &buf);
    /* SNIP */
}

bool string_mod(char *str, int inclusive, int exclusive, const char replace)
{
    const char *in = str; bool ret = true;

    while (true)
    {
        char c = *in++;
        if (c)
        {
            if (!char_inc_exc(c, inclusive, exclusive))
            {
                c = replace; ret = false;
            }
            if (c)
                *str++ = c;
        }
        else
        {
            *str = '\0'; break;
        }
    }
    return ret;
}

It's not the easiest code to understand. For our purposes it reads all characters up to the first NUL byte and moves all printable characters (except carriage return and line feed) up front in the string.

/* Before */
"PUSH_REPLY,line \nfeeds\n,are\n,removed\n\000"
/* After */
"PUSH_REPLY,line feeds,are,removed\000ed\n\000"

Notice the "\000ed\n\000" at the end! The buffer is not truncated nor are the bytes at the end overwritten with NUL bytes!
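The leftover bytes can be reproduced with a small stand-alone sketch. compact_in_place and allowed are hypothetical names, and "printable" is approximated here as ASCII 0x20..0x7e, which is my assumption standing in for OpenVPN's CC_PRINT and CC_CRLF character classes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: printable ASCII stands in for CC_PRINT minus CC_CRLF. */
static int allowed(unsigned char ch)
{
    return ch >= 0x20 && ch <= 0x7e;
}

/* Moves all allowed characters to the front of the NUL-terminated string
 * and writes a single NUL byte after them. Bytes beyond that NUL byte are
 * left untouched. Returns the length of the compacted string. */
static size_t compact_in_place(char *str)
{
    const char *in;
    char *out;

    assert(str != NULL);

    in = str;
    out = str;
    for (; *in != '\0'; in++) {
        if (allowed((unsigned char)*in))
            *out++ = *in;
    }
    *out = '\0';
    return (size_t)(out - str);
}
```

Run on the 38-byte "before" buffer, the compacted string is 33 characters long, and the bytes 'e', 'd', line feed and NUL remain behind the new terminator. A parser that keeps using the original buffer length will still see them.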

Remember that a NUL byte is (by implementation) also a valid separator. With this we can send push options that may be interpreted in unexpected ways such as a --ed configuration directive in the above example. This could potentially be used for smuggling.

This example is not that interesting as it will just print

Unrecognized option or missing or extra parameter(s): ed

Perhaps more interesting is that we can smuggle otherwise disallowed non-printable control characters at the end! Sending such a message requires a modified implementation.

/* Before (\a is the escape sequence for BEL) */
"PUSH_REPLY,smuggling BEL\n,characters\a\n\000"
/* After */
"PUSH_REPLY,smuggling BEL,characters\000\a\n\000"

This will print in the client's logs

Unrecognized option or missing or extra parameter(s): ^G

The ^G is how my terminal renders the BEL character. An audible bell can be heard. Thus we've proved we can inject arbitrary bytes into the client's terminal! This is a security risk, as terminals from time to time have code injection vulnerabilities triggered by the bytes written to them.

This was assigned CVE-2024-5594.

This was fixed in 2.6.11 and then redone in 2.6.12 (released Thursday).

The fix was to first "split" on NUL bytes, to prepare for future changes that make control channel messages less dependent on TLS record framing. Then messages are simply rejected if they contain forbidden characters! 🥳

But wait! The AUTH_FAILED message can contain messages such as "please enter your 2FA code" generated by scripts that output newlines >:(
So we just broke a bunch of deployments :'(

So OpenVPN 2.6.12 is made more lenient and will chomp off any line feeds and carriage returns at the end of the message.
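The resulting policy (chomp trailing line endings, then reject rather than rewrite) can be sketched as follows. This is not the actual OpenVPN patch: accept_control_message is a hypothetical name, the message is assumed to be already split on NUL bytes, and "forbidden" is approximated as anything outside ASCII 0x20..0x7e.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical validation of one control message of len bytes
 * (len excludes the terminating NUL byte).
 * Returns true if the message is acceptable after chomping. */
static bool accept_control_message(char *msg, size_t len)
{
    size_t i;

    assert(msg != NULL);

    /* Be lenient: chomp trailing line feeds and carriage returns. */
    while (len > 0 && (msg[len - 1] == '\n' || msg[len - 1] == '\r'))
        msg[--len] = '\0';

    /* Be strict: reject the message instead of silently rewriting it. */
    for (i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)msg[i];
        if (ch < 0x20 || ch > 0x7e)
            return false;
    }
    return true;
}
```

An AUTH_FAILED message ending in "\r\n" is accepted with the line ending removed, while a line feed or BEL character in the middle of a message causes the whole message to be rejected.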

There are the more obvious lessons:

  • Be careful about inadvertently rescheduling timers
  • Be very careful about the languages you use in your protocols, and try to make sure they are easy to implement safely

Diversity in implementations is a great way to exercise corner cases, push forward documentation efforts, and get thorough code review by motivated peers.

???

This signals the floor is open for questions (according to the computer).

Reynir Björnsson
https://reyn.ir/
https://robur.coop/
Contact us or me: team@robur.coop / reynir@reynir.dk

Thanks!