TLS 1.3 Status

The final TLS 1.3 specification has now been published as RFC 8446. This is not implemented yet in BearSSL; this page lists the main missing elements, and discusses some of the hurdles implied by this new protocol version. It will be updated as implementation efforts progress.

Cryptographic Algorithms

HKDF: TLS 1.3 switches the internal PRF (for key derivation purposes) to HKDF (RFC 5869). HKDF is now implemented in BearSSL generically.

AES/GCM and ChaCha20/Poly1305: these content encryption algorithms (with integrity check) are already implemented for TLS 1.2. The record format has slightly changed, and will need a new implementation:

The per-record nonce is now entirely implicit (it is computed from the record sequence number and a per-connection IV), which differs from AES/GCM records in TLS 1.2, where part of the nonce is sent explicitly with each record.
The outer (non-encrypted) record type is always 23 (“application data”), and the record now contains an inner (encrypted) type.
While GCM and ChaCha20 don’t need padding, TLS 1.3 allows for extra padding bytes, as a crude mitigation for traffic analysis. Incoming padding must be supported and properly removed. Support for outgoing padding would require an extra API, probably as a modified version of br_ssl_engine_flush().
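
The record changes listed above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not BearSSL code: the names tls13_record_nonce() and tls13_unpad() are hypothetical, the IV length is fixed at 12 bytes (as used by AES/GCM and ChaCha20/Poly1305 in TLS 1.3), and the unpadding function is assumed to run on a record that has already been decrypted and authenticated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TLS13_IV_LEN   12

/*
 * Per-record nonce: the 64-bit sequence number, left-padded with zeros
 * to the IV length, is XORed with the static per-connection write IV.
 * Nothing is sent on the wire.
 */
void
tls13_record_nonce(const uint8_t iv[TLS13_IV_LEN], uint64_t seq,
	uint8_t nonce[TLS13_IV_LEN])
{
	int i;

	memcpy(nonce, iv, TLS13_IV_LEN);
	for (i = 0; i < 8; i ++) {
		nonce[TLS13_IV_LEN - 1 - i] ^= (uint8_t)(seq >> (8 * i));
	}
}

/*
 * Decrypted record layout: content || inner type (1 byte) || zero padding.
 * Scan backwards over the padding; the last non-zero byte is the inner
 * record type. On success, *len is set to the content length and the
 * inner type (1..255) is returned. A record made only of zeros is
 * malformed and yields -1.
 * (This simple loop is not constant-time with regard to padding length.)
 */
int
tls13_unpad(const uint8_t *buf, size_t *len)
{
	size_t n;

	assert(len != NULL);
	n = *len;
	while (n > 0 && buf[n - 1] == 0) {
		n --;
	}
	if (n == 0) {
		return -1;
	}
	*len = n - 1;
	return buf[n - 1];
}
```

Outgoing padding would be the mirror operation (append the type byte, then some zeros, before encryption); how many zeros to add is a policy decision, which is why it would need an API of its own.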

ECDHE: BearSSL already implements ECDHE for TLS 1.2. These algorithms are unchanged for TLS 1.3; the new specification now mandates a single point format, which is already the one implemented by BearSSL (single X coordinate for Curve25519, two-coordinates uncompressed format for the NIST curves).

RSA/PSS: When using RSA signatures, TLS 1.3 now uses PSS (specified in PKCS#1 version 2.1 and later) instead of the old-style “v1.5” format which is used in TLS 1.2. BearSSL does not implement PSS yet. The implementation does not seem to be overly complex, but there is some API design work ahead:

RSA/PSS has parameters, which are the hash function used to hash the signed data, a mask generation function, and a salt length (in octets). Everybody uses a single mask generation function, called MGF1; that function relies on an underlying hash function which may or may not be the same as the hash function used to hash the data. In TLS 1.3, MGF1 is used, the two hash functions are the same, and the salt length is equal to the hash output length. However, use of RSA/PSS for signatures on X.509 certificates by CAs might differ.
PSS parameters will need some specific decoding code (when found in X.509 certificates), and the X.509 validation engine API must have a way to convey back the decoded parameters. Indeed, when PSS is used, public keys in certificates may be tagged with either the “rsaEncryption” OID, or the RSASSA-PSS OID (as specified in RFC 5756); in the latter case, this restricts use of the key with PSS only, and there may be explicit PSS parameters, which must then match the actual use of the key. TLS 1.3 furthermore makes explicit the OID difference: when clients and servers designate signature algorithms, different symbolic identifiers are used for rsaEncryption keys (e.g. 0x0804 for RSA/PSS with SHA-256 and a key tagged “rsaEncryption”) and for RSASSA-PSS keys (0x0809 for such a key with RSA/PSS and SHA-256).
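
To make the identifier split concrete, here is a minimal lookup table for the six RSA/PSS signature schemes defined in RFC 8446. The structure and function names (pss_scheme, pss_scheme_lookup(), pss_salt_len()) are hypothetical and only meant as a sketch; the numeric values are the ones from the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
	uint16_t id;       /* SignatureScheme value (RFC 8446) */
	int key_is_pss;    /* 0: key tagged rsaEncryption, 1: key tagged RSASSA-PSS */
	size_t hash_len;   /* hash output length, in bytes */
} pss_scheme;

static const pss_scheme PSS_SCHEMES[] = {
	{ 0x0804, 0, 32 },   /* rsa_pss_rsae_sha256 */
	{ 0x0805, 0, 48 },   /* rsa_pss_rsae_sha384 */
	{ 0x0806, 0, 64 },   /* rsa_pss_rsae_sha512 */
	{ 0x0809, 1, 32 },   /* rsa_pss_pss_sha256 */
	{ 0x080a, 1, 48 },   /* rsa_pss_pss_sha384 */
	{ 0x080b, 1, 64 }    /* rsa_pss_pss_sha512 */
};

/* Return the table entry for a scheme identifier, or NULL if the
   identifier is not one of the RSA/PSS schemes. */
const pss_scheme *
pss_scheme_lookup(uint16_t id)
{
	size_t i;

	for (i = 0; i < sizeof PSS_SCHEMES / sizeof PSS_SCHEMES[0]; i ++) {
		if (PSS_SCHEMES[i].id == id) {
			return &PSS_SCHEMES[i];
		}
	}
	return NULL;
}

/* In TLS 1.3, the salt length equals the hash output length (and MGF1
   uses the same hash function as the message digest). */
size_t
pss_salt_len(const pss_scheme *s)
{
	assert(s != NULL);
	return s->hash_len;
}
```

Parameters found in X.509 certificates are not covered by this table: there, the hash functions and salt length are encoded explicitly and may take other values.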

EdDSA: TLS 1.3 promotes the use of EdDSA, a Schnorr-like signature algorithm over Edwards curves such as Edwards25519 and Edwards448. BearSSL does not currently implement EdDSA; however, it implements Curve25519, which uses the same base field (indeed, Curve25519 is “birationally equivalent” to Edwards25519, which means that, for security purposes, it is the same curve, with a different representation). While EdDSA should allow for faster implementations than ECDSA (mostly because of the different base curve), it has a few implications, some of which are unfortunate:

Ed25519 (i.e. EdDSA on curve Edwards25519) is defined to use SHA-512. This means that the relatively bulky SHA-512 implementation will be pulled into the code, even if the rest of a given TLS deployment uses SHA-256. SHA-512 is also considerably slower than SHA-256 on small 32-bit architectures.
Ed448 (i.e. EdDSA on curve Edwards448) is defined to use SHAKE-256, i.e. a derivative of the new SHA-3 (Keccak). BearSSL does not currently implement SHAKE; I already have a ready-to-use implementation, but that means an increase in code footprint.
Curve448 is not currently implemented in BearSSL. Since that curve is relatively expensive on small hardware (much more than Curve25519), its implementation might be delayed to a later BearSSL version.
The X.509 validation code does not currently know how to decode public keys for Edwards curves from X.509 certificates. The extra code should not be complex, but the relevant standards are fairly new, and potential interoperability issues are not known yet.
A bigger issue with the use of Ed25519 or Ed448 in X.509 certificates is that these signature algorithms use an extra protection against collision attacks on hash functions: the initial hashing is done not on the signed data alone, but on the concatenation of the encoding of the public key and the signed data. This is called “PureEdDSA” in RFC 8032 terminology, and it makes the signature function more resilient against collision attacks, if the hash function turns out to be flaky in that respect (like MD5 or SHA-1). However, it means that the public key must be known before starting to process the signed data. In an X.509 certificate path provided on a stream in usual SSL/TLS order (end-entity first, then CA), the public key that verifies a certificate is made available only after that whole signed certificate has been received. Verifying a certificate path that involves use of EdDSA keys by CAs thus requires buffering a complete certificate in RAM, something which has so far been carefully avoided by BearSSL.

This buffering issue does not occur with the use of EdDSA in the TLS 1.3 protocol itself, because it is then defined that the “signed data” is already an aggregate structure that contains the hash of the handshake messages exchanged so far.
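
As an illustration, the data covered by the CertificateVerify signature in RFC 8446 is short and of fixed layout: 64 bytes of value 0x20, a context string, a zero byte, and the transcript hash. The sketch below builds that buffer; the function name tls13_certverify_input() and its calling convention are hypothetical, not part of any existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Assemble the CertificateVerify signature input:
 *    0x20 repeated 64 times || context string || 0x00 || transcript hash
 * Returns the number of bytes written into out[], or 0 if out[] is too
 * small to receive the whole structure.
 */
size_t
tls13_certverify_input(int is_server,
	const uint8_t *transcript_hash, size_t hash_len,
	uint8_t *out, size_t out_max)
{
	static const char CTX_SERVER[] = "TLS 1.3, server CertificateVerify";
	static const char CTX_CLIENT[] = "TLS 1.3, client CertificateVerify";
	const char *ctx = is_server ? CTX_SERVER : CTX_CLIENT;
	size_t ctx_len = strlen(ctx);
	size_t head_len = 64 + ctx_len + 1;

	assert(out != NULL && transcript_hash != NULL);
	if (out_max < head_len || hash_len > out_max - head_len) {
		return 0;
	}
	memset(out, 0x20, 64);
	memcpy(out + 64, ctx, ctx_len);
	out[64 + ctx_len] = 0x00;
	memcpy(out + head_len, transcript_hash, hash_len);
	return head_len + hash_len;
}
```

With SHA-256 the result is 130 bytes long, so it fits in a small stack buffer and the signer's public key is known well before these bytes are assembled.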

Handshake Messages

The TLS 1.3 handshake is substantially different from the handshake used in previous versions; this will require some specific code, most of which will be written in T0. An as yet open question is whether TLS 1.3 support will be made separate from TLS 1.2 or not: an engine that supports only TLS 1.3 could potentially be smaller (in terms of code footprint) than an engine that supports both TLS 1.2 and 1.3; however, if both protocol versions are to be supported, there should be some substantial code sharing between the two.

The TLS 1.3 ClientHello and ServerHello include several new extensions, e.g. for protocol version support and ECDHE key exchange.

When the client sends a ClientHello and the server does not find in it a suitable “key share” (e.g. the curve used by the client is not supported by the server), the server sends a HelloRetryRequest that may include a “cookie”, which is an opaque blob of data that the client must send back in its new ClientHello. The cookie size may be up to 64 kilobytes. In the context of BearSSL, where no memory is allocated dynamically and the overall RAM usage is kept as low as possible, large cookies cannot be supported. It is currently unclear what the size of a “normal” cookie is; this will require interoperability tests with other implementations.
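
A client with a fixed RAM budget could handle the cookie along the following lines. This is a sketch only: the cookie_store type, the cookie_save() function, the return codes and the 256-byte limit are all assumptions made for the example, since the size of real-world cookies is not known yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COOKIE_MAX   256   /* hypothetical RAM budget for the cookie */

typedef struct {
	uint8_t data[COOKIE_MAX];
	size_t len;
} cookie_store;

/*
 * Save the cookie from a HelloRetryRequest. ext[] is the extension
 * payload: a 2-byte big-endian length followed by the cookie bytes
 * (the cookie must not be empty).
 * Returns 0 on success, -1 on a malformed extension, and -2 when the
 * cookie exceeds the fixed buffer (the handshake must then be aborted).
 */
int
cookie_save(cookie_store *cs, const uint8_t *ext, size_t ext_len)
{
	size_t clen;

	assert(cs != NULL);
	if (ext_len < 2) {
		return -1;
	}
	clen = ((size_t)ext[0] << 8) | (size_t)ext[1];
	if (clen == 0 || clen != ext_len - 2) {
		return -1;
	}
	if (clen > COOKIE_MAX) {
		return -2;
	}
	memcpy(cs->data, ext + 2, clen);
	cs->len = clen;
	return 0;
}
```

Distinguishing "malformed" from "too large" matters for diagnostics: the second case is a local limitation, not a protocol violation by the server.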

Similarly, session resumption (based on the session ID) is deprecated in TLS 1.3; instead, a session ticket mechanism is to be used. The server may issue a new session ticket at any time, with an opaque “label” that the client may use in a subsequent ClientHello message. The label is an opaque blob up to 64 kilobytes in length; again, this is way too much for the BearSSL usage context. Since tickets may be issued at any time, including after the post-handshake authentication, it is expected that a server that authenticates clients with certificates, and wants to support stateless resumption, may include in the ticket a copy of the client certificate. It is thus expected that ticket size may go up to several kilobytes in practice. How tickets will be handled by BearSSL is an open question. We may note that, depending on server behaviour with regard to ticket size, there may be situations in which TLS 1.3 sessions cannot be resumed because of oversized tickets, but TLS 1.2 resumption with the session ID would work; thus, TLS 1.3 is not always better or even at least as good as TLS 1.2 (as a corollary, TLS 1.2 support will not be deprecated in BearSSL).

0-RTT

0-RTT is a new feature of TLS 1.3, which promises to let a client send data to a server without any prior round-trip, thus minimizing latency. While the (perceived) better performance is alluring, that support is problematic in several ways:

0-RTT is inherently weak against replay attacks. An attacker may copy a 0-RTT request from a client, and send it again and again to a server to repeat the effect. The TLS 1.3 specification sort of waves the problem away by claiming that it should be allowed only for “idempotent” requests, without actually defining that term. In most cases, communication protocols that use TLS for transport are not idempotent. An alternative is to make the server remember 0-RTT requests, and reject requests which have already been seen or are too old; however, such a mechanism requires extra RAM and a clock, both resources being scarce or unavailable on the systems that BearSSL targets.
To make 0-RTT somewhat safer, the API should at least convey to the application the boundary between the “early data”, that may be replayed by an attacker, and subsequent data, which is not. The current BearSSL API has no provision for such an indication.
When 0-RTT fails (e.g. the client tries to reuse a secret value that the server has forgotten), the server will get decryption errors, that it is supposed to ignore: this breaks the abstraction layer between the record API and the handshake engine, in that records may now have to be potentially ignored, and decryption errors no longer imply sending the engine to a “failed” state.
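
To show why the "remember and reject" mitigation costs both RAM and a clock, here is a minimal sketch of such a mechanism with a fixed-size table. Everything in it is an assumption made for the example: the names, the 16-byte request identifier, the table size and the 10-second freshness window. It also assumes a clock that never goes backwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REPLAY_SLOTS    8    /* hypothetical table size */
#define REPLAY_WINDOW   10   /* hypothetical freshness window, in seconds */
#define REPLAY_ID_LEN   16

typedef struct {
	uint8_t id[REPLAY_ID_LEN];
	uint32_t sent_at;
	int used;
} replay_slot;

typedef struct {
	replay_slot slots[REPLAY_SLOTS];
} replay_cache;

/*
 * Decide whether early data may be accepted. id[] identifies the 0-RTT
 * request, sent_at is the time at which the client claims to have sent
 * it, now is the current server time. Returns 1 to accept, 0 to reject
 * (the connection then continues as a normal 1-RTT handshake).
 */
int
replay_check(replay_cache *rc, const uint8_t id[REPLAY_ID_LEN],
	uint32_t sent_at, uint32_t now)
{
	replay_slot *free_slot = NULL;
	size_t i;

	assert(rc != NULL);
	/* Too old (or dated in the future): reject without looking further. */
	if (sent_at > now || now - sent_at > REPLAY_WINDOW) {
		return 0;
	}
	for (i = 0; i < REPLAY_SLOTS; i ++) {
		replay_slot *s = &rc->slots[i];

		/* An entry outside the window can be forgotten: a replay of
		   it would fail the freshness test above anyway. */
		if (s->used && now - s->sent_at > REPLAY_WINDOW) {
			s->used = 0;
		}
		if (!s->used) {
			if (free_slot == NULL) {
				free_slot = s;
			}
			continue;
		}
		if (memcmp(s->id, id, REPLAY_ID_LEN) == 0) {
			return 0;   /* already seen: this is a replay */
		}
	}
	/* Table full: the request cannot be remembered, so it is not
	   safe to accept it. */
	if (free_slot == NULL) {
		return 0;
	}
	memcpy(free_slot->id, id, REPLAY_ID_LEN);
	free_slot->sent_at = sent_at;
	free_slot->used = 1;
	return 1;
}
```

Even this toy version needs about 200 bytes of state for eight requests in flight, and it silently degrades to "no 0-RTT" when the table is full or when no reliable clock is available.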

BearSSL support for TLS 1.3 will likely not include support for 0-RTT, at least initially.

Certificate Validation

TLS 1.3 changes the format of the Certificate message, to include per-certificate extensions such as embedded OCSP responses; some API change is needed to support these extensions (if only to skip them).

More problematic, TLS 1.3 changes the policy on chain ordering:

Note: Prior to TLS 1.3, “certificate_list” ordering required each certificate to certify the one immediately preceding it; however, some implementations allowed some flexibility. Servers sometimes send both a current and deprecated intermediate for transitional purposes, and others are simply configured incorrectly, but these cases can nonetheless be validated properly. For maximum compatibility, all implementations SHOULD be prepared to handle potentially extraneous certificates and arbitrary orderings from any TLS version, with the exception of the end-entity certificate which MUST be first.

The current X.509 validation engine in BearSSL (br_x509_minimal) can process arbitrarily long certificate chains in a fixed amount of RAM precisely because it relies on the chain being sent in the TLS 1.2 order, i.e. end-entity first, and each subsequent certificate being for the CA that issued the previous one. This policy being deprecated in TLS 1.3, it can be expected that RAM-efficient X.509 support in BearSSL may fail to connect to a number of deployed servers.
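
The fixed-RAM property can be illustrated with a deliberately simplified model, which is not the br_x509_minimal code: each certificate is reduced to its subject and issuer names (given as strings), and the only state kept between certificates is the issuer name of the previous one. All names below (chain_state, chain_init(), chain_feed(), DN_MAX_LEN) are hypothetical; a real engine would also carry the pending signature and hash value, and verify them when the issuer's public key arrives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DN_MAX_LEN   64   /* hypothetical bound on an encoded name */

typedef struct {
	char prev_issuer[DN_MAX_LEN];  /* issuer of the previous certificate */
	int have_prev;                 /* 0 until the end-entity cert is seen */
	int ok;                        /* cleared on the first ordering error */
} chain_state;

void
chain_init(chain_state *cs)
{
	assert(cs != NULL);
	cs->prev_issuer[0] = 0;
	cs->have_prev = 0;
	cs->ok = 1;
}

/*
 * Process the next certificate of the chain, in reception order. The
 * certificate must be for the CA that issued the previous one; if it
 * is not, validation fails, since nothing else was kept in RAM.
 */
void
chain_feed(chain_state *cs, const char *subject, const char *issuer)
{
	assert(cs != NULL);
	if (!cs->ok) {
		return;
	}
	if (strlen(subject) >= DN_MAX_LEN || strlen(issuer) >= DN_MAX_LEN) {
		cs->ok = 0;
		return;
	}
	if (cs->have_prev && strcmp(cs->prev_issuer, subject) != 0) {
		cs->ok = 0;
		return;
	}
	strcpy(cs->prev_issuer, issuer);
	cs->have_prev = 1;
}
```

A server that sends an extra, deprecated intermediate between the end-entity certificate and the current intermediate makes this model fail at the second certificate, even though a buffering validator could still find a valid path.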

This does not impact closed deployments, in which both clients and servers are under sufficient control to enforce a strict certificate chain ordering (or, even better, switch to a “known key” model in which clients already know the server’s public key, and skip all this X.509 business).

Full certificate validation support, compatible with the new TLS 1.3 policy, will require a more advanced X.509 validation engine, which will unfortunately have to allocate memory dynamically (possibly in an application-provided area dedicated to that usage, or with the dreaded malloc() call). This would break some of the nice features of BearSSL, in particular the strong guarantee on proper memory management. If I ever write such an X.509 engine, it will probably be as a separate library (BearSSL’s API is pluggable, so external engines are perfectly feasible).
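
For the "application-provided area" option, one possible shape is a bump allocator over a caller-supplied buffer, which keeps the worst-case RAM usage bounded and visible to the application. The sketch below is an assumption about what such an allocator could look like; the arena type and the arena_init(), arena_alloc() and arena_reset() functions do not exist in BearSSL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_ALIGN   8u   /* hypothetical alignment for all allocations */

typedef struct {
	uint8_t *base;   /* start of the application-provided area */
	size_t size;     /* total size of the area, in bytes */
	size_t used;     /* bytes consumed so far (including padding) */
} arena;

void
arena_init(arena *a, void *buf, size_t size)
{
	a->base = buf;
	a->size = size;
	a->used = 0;
}

/*
 * Allocate n bytes from the area, aligned on ARENA_ALIGN. Returns NULL
 * when the area is exhausted; there is no per-object free.
 */
void *
arena_alloc(arena *a, size_t n)
{
	uintptr_t start;
	size_t pad, avail;
	void *p;

	assert(a->used <= a->size);
	start = (uintptr_t)a->base + a->used;
	pad = (size_t)((ARENA_ALIGN - (start % ARENA_ALIGN)) % ARENA_ALIGN);
	avail = a->size - a->used;
	if (pad > avail || n > avail - pad) {
		return NULL;
	}
	p = a->base + a->used + pad;
	a->used += pad + n;
	return p;
}

/* Release everything at once, e.g. when validation of a chain ends. */
void
arena_reset(arena *a)
{
	a->used = 0;
}
```

Releasing the whole area in one step at the end of each validation avoids fragmentation and leaks, at the price of a hard limit on the size of the chains that can be processed.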

As explained above, buffering of certificates may also be needed for EdDSA support in certificates. Dynamic memory allocation could unlock some features, in particular support for name constraints, certificate policies, revocation, hooks for dynamic download of extra certificates, CRL and OCSP responses, arbitrary path building… Writing such code in C would be a thoroughly bad idea, though¹. Moreover, extra care should be taken about denial-of-service attacks (both in RAM usage and in computational cost).

Roadmap

TLS 1.3 support in BearSSL will need to go through the following steps:

Find an existing implementation with TLS 1.3 support, for interoperability tests. There are some candidates, though most only implement some draft versions (draft 28 should be close enough to the final RFC, though).
Write a perfunctory implementation in BoarSSL, both client and server. This will be needed for automatic testing in Twrch. Ideally, an adequate test command would be written for the external implementation used for interoperability tests.
Implement RSA/PSS.
Implement EdDSA, at least on curve Edwards25519.
Extend the API and implementation for X.509 certificate validation, to include support for RSA/PSS and reporting of relevant PSS parameters.
Implement the new TLS 1.3 encrypted record format (with the extra encrypted content type byte, and padding).
Implement the new TLS 1.3 handshake, in normal 1-RTT mode. Depending on how this step goes, a decision will be made about splitting the code into an independent TLS 1.3-only engine, or keeping a shared engine that can handle everything from 1.0 to 1.3.
Tests, extra tests and more tests. This step is expected to take some non-negligible amount of time.
Implement Curve448 and Ed448; this implies including a SHAKE implementation.
Write an external X.509 validation engine, pluggable into BearSSL, with buffering, controlled dynamic memory allocation, and advanced features. This is a considerable endeavour, which may imply drastic decisions, up to (and including) defining a new programming language dedicated to that task.

¹ I know that because I already did it some years ago, and I’ll probably pay for it in the afterlife.