Decentralized Pairwise MLS
This is the second of two technical posts explaining Germ's technology. Previously, I introduced the Autonomous Communicator (AC) Protocol, built around Messaging Layer Security (MLS). In this post I’ll explain in further detail how we use MLS for 1:1 messaging within the AC Protocol.
Today, MLS is predominantly used in a centralized context, where a central entity hosts group state. Germ now leverages this powerful technology for decentralized 1:1 conversations, by letting users cooperatively advance group state.
Design Goals
On Germ, users trade profile cards to set up end-to-end encrypted (E2EE) conversations. Our identity layer, which serves as the Authentication Service for MLS, has a notion of evolving identity: Germ users can edit their cards and propagate those edits within the E2EE channel by updating their own participant in a conversation to reflect the updated card. We want those updates to be bound to changes in the underlying MLS group.
We are deploying MLS for 1:1 conversations over an asynchronous, unreliable transport channel that may deliver messages out of order, in duplicate, or not at all. In addition, for users with sparse network connectivity, we want to ensure that they can encrypt messages and update their representation without first having to catch up to in-flight messages from the other party.
While MLS is tolerant of out-of-order application messages, it does require some consistency in members’ view of handshake messages (proposals and commits to changes to the group membership). Any member can mutate the group with a commit, but the group members must mutate the group in a consistent way. The simplest way to provide this consistency is for a centralized Delivery Service (DS) to order handshake messages.
However, Germ’s message transport service is unsuitable as a centralized DS. We don’t assume that any one service has visibility of all messages in transit between two parties (a user may use two services in parallel for reliability or privacy). Our transport service doesn't see any sender identities, nor can it identify which messages are part of the same conversation. So a centralized DS is unsuitable for the privacy and reliability goals of our transport architecture.
Our usage of MLS in 1:1 conversations fulfills the handshake message consistency function of the DS in a distributed manner among the two participants. In the remainder of this post, we’ll walk through a progression of alternative approaches to arrive at the shipping implementation in Germ, measuring them against these goals:
- (1) Users can encrypt messages at any time without dependency on receiving additional messages from the remote party
- (2) Users can propose identity updates at any time without dependency on receiving additional messages from the remote party
- (3) No single third party can block users from sending new encryption keys for Forward Secrecy (FS) and Post-Compromise Security (PCS)
MLS terminology
Before we get into it, some essential MLS concepts:
MLS groups have members which are clients. The position clients occupy in a group is a LeafNode. The state of a group advances linearly through epochs. Within an epoch the membership is fixed, but members can propose changes to the membership, including changes to their own client. Any member can generate a commit that may include some proposals. A commit ends the current epoch and begins the next one.
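To make the vocabulary concrete, here is a minimal sketch of the epoch, proposal, and commit bookkeeping. It is a toy model with no cryptography; `ToyGroup` and its method names are hypothetical and are not part of any MLS library.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGroup:
    """Toy stand-in for MLS group state; it performs no cryptography."""
    epoch: int = 0                                 # a new MLS group starts at epoch 0
    members: dict = field(default_factory=dict)    # member name -> leaf label
    pending: list = field(default_factory=list)    # proposals made in this epoch

    def propose_update(self, member: str, new_leaf: str) -> tuple:
        """Record a proposal; membership is unchanged until a commit."""
        proposal = (self.epoch, member, new_leaf)
        self.pending.append(proposal)
        return proposal

    def commit(self, proposals: list) -> int:
        """Apply the chosen proposals, end this epoch, and begin the next."""
        for epoch, member, new_leaf in proposals:
            if epoch != self.epoch:
                raise ValueError("proposal belongs to a different epoch")
            self.members[member] = new_leaf
        self.pending.clear()
        self.epoch += 1
        return self.epoch


group = ToyGroup(members={"alice": "leaf-a0", "bob": "leaf-b0"})
proposal = group.propose_update("alice", "leaf-a1")
unchanged_leaf = group.members["alice"]   # still the old leaf within the epoch
group.commit([proposal])
```

The proposal only takes effect once a commit covers it, and the commit is what moves the group from one epoch to the next.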
You can learn more in the MLS architecture and protocol documents.
Progressive approaches towards our goals
Static encryption keys
A simple way to resolve state between two parties communicating asynchronously is to not have mutable state. Alice and Bob could exchange asymmetric encryption keys in their initial key exchange and continue to use those keys to encrypt messages to each other. This satisfies goal (1) above.
But in this case, users don’t have a mechanism to update their identity, nor even to update their encryption key material. So an attacker can store ciphertexts for later decryption if they can obtain a user’s private key material (no forward secrecy), or continue to decrypt messages after a one-time compromise of a private key (no post-compromise security).
Centralized DS
MLS has a built-in mechanism to update members’ representations and introduce new key material for FS/PCS: update proposals and commits. If, instead of static encryption keys, we use an MLS group to back our 1:1 conversations, we can satisfy goal (2):
- Members of the group can propose (or directly propose and commit) changes to their own participant in this group (their LeafNode, in MLS parlance). This lets them advance their AC Identity and introduce new key material
We now need to reconcile conflicts if both members generate commits for the same epoch. A centralized DS can resolve these conflicts, either by choosing the canonical commit for each epoch, or consistently ordering commits within an epoch.
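As an illustration of the first option, a DS could simply treat the first commit it sees for an epoch as canonical. The sketch below assumes that policy; `ToyDeliveryService` is a hypothetical name, and a real DS would handle MLS messages rather than string identifiers.

```python
class ToyDeliveryService:
    """Hypothetical DS that accepts the first commit it sees for each epoch."""

    def __init__(self) -> None:
        self.canonical = {}   # epoch -> id of the accepted commit

    def submit(self, epoch: int, commit_id: str) -> bool:
        """Return True if this commit is the canonical one for its epoch."""
        winner = self.canonical.setdefault(epoch, commit_id)
        return winner == commit_id


ds = ToyDeliveryService()
alice_ok = ds.submit(epoch=4, commit_id="alice-commit")   # first, so accepted
bob_ok = ds.submit(epoch=4, commit_id="bob-commit")       # conflicting, so rejected
```

Whoever loses the race has to process the winning commit and try again in the next epoch, which is why the DS's ordering decisions matter so much in the discussion that follows.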
How closely does this satisfy our requirements?
- We can satisfy (1) if we have a relaxed policy on accepting messages from prior epochs. If Alice has been offline while Bob has updated their identity with a commit, Alice can still encrypt messages with her newest epoch.
- With a well-functioning DS, we can satisfy (2), with a caveat: while Alice can encrypt valid messages at any time, she won’t be able to send a commit the DS will accept until she catches up on the state she missed while offline.
- A malicious DS, however, could misuse its position of privilege of adjudicating commits to block either Alice or Bob from receiving a valid commit. By introducing an external authority, we’ve given a third party that is neither Alice nor Bob the ability to weaken the FS/PCS properties of their conversation.
If Bob and Alice could agree on a consistent strategy of sending MLS proposals and commits, we could remove this external dependency. Asymmetric ratcheting suggests a solution - this is the approach we shipped in Germ DM 2.1.
Ratcheting Pairwise MLS
In the current state of the art for pairwise E2EE messaging, every message from Alice carries a new public encryption key. When Bob acknowledges receipt of this new in-flight key by using it to mix in new derived secrets, Alice generates a new key to attach to her messages, ratcheting forward the group’s encryption state.
We adapt this approach with MLS so that this ratchet also advances users’ identity in our identity layer. The Germ App operates MLS groups in a construction that we’re calling PairMLS, where each member of a 1:1 conversation takes turns proposing and committing updates.
Creation:
- Alice extends an invitation to create a PairMLS group by sharing an MLS client’s keypackage message with Bob. This keypackage belongs to an AC Protocol agent, has the agent’s public key as an MLS basic credential, and is signed by the agent’s private key.
- Bob can generate an agent of his own to correspond with Alice, similarly generate an MLS client for that agent, and create an MLS group composed of both clients.
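- To prevent the epoch parity from revealing who extended the invitation, Bob randomly chooses whether to apply an empty commit before adding Alice's client to the MLS group he created. Suppose for the remainder of this example that he did: the group starts at epoch 0, the empty commit advances it to epoch 1, and the commit adding Alice advances it to epoch 2.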
- Bob now has an MLS group in epoch 2, and a welcome message for Alice for epoch 2. Bob can now start encrypting messages to Alice in epoch 2, which she will be able to decrypt after processing the welcome
- Henceforth, even epochs are send epochs for Bob, and odd epochs are send epochs for Alice
- On receipt (and successful processing) of the welcome, Alice can start sending messages back to Bob in epoch 3 by creating and sending a commit. Here, Alice will always commit a change to her member of this group, replacing it with a new client corresponding to a new agent. An invitation may have been shared with multiple people, so Alice will always want to roll over to a new agent specifically for her conversation with Bob
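The epoch arithmetic of the creation steps can be sketched as follows. This counts epochs only and performs no MLS operations; `create_pair_group` and the dictionary keys are hypothetical names. Counting from epoch 0 for a freshly created group, the walkthrough's epoch 2 corresponds to the case where the creator applies the empty commit.

```python
import secrets


def create_pair_group(apply_empty_commit=None) -> dict:
    """Return the epoch bookkeeping after the creator (Bob) sets up the group."""
    if apply_empty_commit is None:
        # The creator's coin flip, so parity does not reveal who invited whom.
        apply_empty_commit = secrets.randbelow(2) == 1
    epoch = 0                      # a freshly created MLS group is in epoch 0
    if apply_empty_commit:
        epoch += 1                 # the optional empty commit
    epoch += 1                     # the commit that adds the invitee (Alice)
    return {
        "welcome_epoch": epoch,                  # creator's first send epoch
        "creator_parity": epoch % 2,
        "invitee_parity": (epoch + 1) % 2,
        "invitee_first_send_epoch": epoch + 1,   # invitee's first commit
    }


with_empty = create_pair_group(apply_empty_commit=True)
without_empty = create_pair_group(apply_empty_commit=False)
```

With the empty commit, the creator sends in even epochs and the invitee in odd ones; without it, the parities are swapped, so an observer who learns the parity learns nothing about who created the group.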
Now we’re in the steady-state configuration:
- Alice and Bob are each assigned a sending epoch parity. They only process incoming application messages in expected epochs.
- Every message carries an update proposal for the sender. This update may simply update the sender’s MLS LeafNode, or it may also update the sender’s credential in the AC identity layer. Users can send different proposals within an epoch.
- Each member is responsible for generating and sending the commit to start their sending epoch. The commit contains a proposal they received in the previous epoch.
So to continue the previous example:
- Bob can stream messages to Alice in epoch 2, Alice can stream messages to Bob in epoch 3. Alongside each of these messages in epoch 3, Alice proposes an update to her LeafNode.
- As Bob receives proposals from Alice, he selects one in an ordering determined by the Authentication Service (the AC Identity Layer), preferring identity updates over agent updates, and agent updates over plain LeafNode updates.
- When Bob is ready to send a message after receiving some messages in epoch 3, he commits his chosen update from Alice, and sends the commit back to Alice alongside a proposal and application message of his own.
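Bob's choice among Alice's proposals could look something like the sketch below. The three tiers come from the description above; the tie-breaker (prefer the most recently received proposal within a tier) and all type and function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class UpdateKind(IntEnum):
    # Higher value wins: identity updates over agent updates over LeafNode updates.
    LEAF_NODE = 1
    AGENT = 2
    IDENTITY = 3


@dataclass(frozen=True)
class ReceivedProposal:
    kind: UpdateKind
    arrival: int        # hypothetical tie-breaker: local arrival order
    proposal_ref: str   # placeholder for a reference to the MLS proposal


def choose_proposal(received: list):
    """Pick the proposal to commit: highest tier, then most recently received."""
    if not received:
        return None
    return max(received, key=lambda p: (p.kind, p.arrival))


inbox = [
    ReceivedProposal(UpdateKind.LEAF_NODE, arrival=3, proposal_ref="p3"),
    ReceivedProposal(UpdateKind.IDENTITY, arrival=1, proposal_ref="p1"),
    ReceivedProposal(UpdateKind.AGENT, arrival=2, proposal_ref="p2"),
]
chosen = choose_proposal(inbox)
```

Only the committing party makes this choice, and the other party learns the outcome from the commit, so the tie-breaker can depend on local arrival order.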
Message Framing
We’ve built on an assumption that we can’t guarantee in-order or reliable message delivery, so we have to ensure each datagram can be decrypted by the other party without dependency among messages in-flight.
For each message, we staple the application message to the sender’s proposal in this epoch and to the sender’s commit that started this epoch. We use the MLS authenticated data (AD) field to do so: the application message encloses the proposal in its AD, and the proposal itself includes the commit in its AD.
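The nesting can be pictured with a small structural sketch. `ToyMessage`, `staple`, and `unstaple` are hypothetical names; the bodies are opaque bytes standing in for real MLS messages, and nothing here is encrypted or signed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyMessage:
    """Stand-in for an MLS message carrying another message in its AD field."""
    kind: str                          # "commit", "proposal", or "application"
    body: bytes
    authenticated_data: object = None  # the stapled inner message, if any


def staple(commit: bytes, proposal: bytes, application: bytes) -> ToyMessage:
    """Nest commit inside proposal inside application, via the AD fields."""
    commit_msg = ToyMessage("commit", commit)
    proposal_msg = ToyMessage("proposal", proposal, authenticated_data=commit_msg)
    return ToyMessage("application", application, authenticated_data=proposal_msg)


def unstaple(outer: ToyMessage) -> list:
    """Unwrap a datagram, innermost message first."""
    proposal = outer.authenticated_data
    commit = proposal.authenticated_data
    return [commit, proposal, outer]


datagram = staple(b"commit-bytes", b"proposal-bytes", b"hello")
kinds = [message.kind for message in unstaple(datagram)]
```

Because every datagram carries the commit that started its epoch, a receiver that missed earlier messages from that epoch still has what it needs to handle this one.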
Header Encryption
This packaging still leaks some metadata in the MLS private message format - namely the group ID and epoch counter. We hide this metadata by encrypting the entire outer MLS application message with a symmetric key derived from the other party’s last known (send) epoch.
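A rough sketch of the key selection follows. The text only says the key is derived from the other party's last known send epoch; the HMAC-SHA256 derivation, the label, and the function names below are assumptions made for illustration, not Germ's actual derivation. The AEAD step that would use the key is omitted.

```python
import hashlib
import hmac

# Hypothetical label; the real derivation is not specified in this post.
HEADER_LABEL = b"example header key"


def header_key(epoch_secret: bytes) -> bytes:
    """Derive a 32-byte header key from a per-epoch secret (assumed HMAC-SHA256)."""
    return hmac.new(epoch_secret, HEADER_LABEL, hashlib.sha256).digest()


def header_key_for_sending(my_send_epoch: int, epoch_secrets: dict) -> bytes:
    """Use the other party's last known send epoch, the one before mine."""
    return header_key(epoch_secrets[my_send_epoch - 1])


# Placeholder secrets keyed by epoch number; real ones would come from MLS.
epoch_secrets = {3: b"secret-for-epoch-3", 4: b"secret-for-epoch-4"}
bob_key = header_key_for_sending(4, epoch_secrets)
```

Since the key comes from an epoch in which the recipient was the sender, the recipient can already derive it before processing anything inside the datagram.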
In Summary
Every message Bob sends Alice:
- is sent in an epoch k of the correct parity, as assigned at group creation
- conveys the commit message that Bob generated to start epoch k, within the AD field of:
- an update proposal for Bob’s LeafNode in epoch k, itself contained in the AD field of:
- an MLS application message in epoch k. This application message contains any required AC Protocol identity or succession assertions to support the stapled update proposal message.
- This application message is further encrypted with a symmetric key derived from epoch k-1
Alternative Approaches
Self-Commit instead of Propose-then-Commit
An MLS commit can update the sender's identity, so an alternative to Alice sending update proposals for Bob to pick one to commit would be for Alice to commit her own LeafNode update to start each of her epochs.
However, we want to allow users to have multiple in-flight identity proposals. For example, Alice edits her card, sends a message to Bob conveying this edit, then edits her card again and sends another message, before Bob has replied to confirm/commit the first edit.
Under a self-commit scheme, Alice would have already committed the first edit. She would have to wait for a reply (and commit) from Bob to send another edit. Alternatively, she could send another commit, and try to infer from Bob’s response which one he applied.
We choose, instead, to allow users to send multiple identity updates in-flight and to keep this branching state within MLS’s propose-then-commit semantics.
Double Ratchet
Our resulting construction, using unmodified MLS operations, resembles Double Ratchet with Header Encryption, since they share many goals - eagerly sending new asymmetric keys to provide forward secrecy and post-compromise security, and hiding message metadata in transit. We rely on the MLS application key schedule for the symmetric ratchet within an epoch. Our asymmetric ratchet happens at the same cadence, when a new LeafNode proposal (like a ratchet key) I generated has round-tripped from me to the other party, and back, covered by a commit message.
“I go by Mark now, not Ming”
“I heard you, Mark”
Our primary motivation for this construction was to allow this ratchet to drive reliably advancing state in our identity layer, bound to the epochal advance of the underlying MLS group. The user-facing mechanic of proposing changes to my identity, reflected back by a confirmation from the other party, maps well to MLS’s propose and commit mechanism.
As a bonus, our construction allows for the more rapid introduction of new key material to the group state in two ways:
- Within an epoch, users aren’t committed to proposing the same new key material (LeafNode), and can propose newer LeafNodes without waiting for a response from the other party. This is a consequence of the decision to allow people to propose multiple card edits without waiting for a confirmation in between.
- An MLS commit updates the sender LeafNode in addition to covering a proposal for the other party. In contrast with the asymmetric ratchet in Double Ratchet, which only updates the newly received remote ratchet key, our MLS commit step updates both parties’ LeafNodes. We make this possible by stapling our commit to every message.
By building our PairMLS construction from unmodified MLS operations, we build upon the foundation of analysis and research that has already gone into MLS. We look forward to analysis and feedback on our construction of 1:1 conversations using MLS without a DS. Please reach out.