Strobe protocol framework

Scope

This spec describes the operation of the Strobe framework. It only covers the symmetric portion. For applications including elliptic curve crypto, see the examples page.

Table of Contents

Version

Definitions and notation

Formatting

Strobe Instances

Strobe parameters

State of a Strobe object

Initial state

Strobe operations

Low-level operations

AD: Provide associated data

KEY: Provide cipher key

CLR: Send or receive cleartext data

ENC: Send or receive encrypted data

MAC: Send or receive message authentication code

PRF: Extract hash / pseudorandom data

RATCHET: Prevent rollback

Operations and flags

Flags for the defined operations

Composite operations

Implementation of operations

Duplexing and running F

Initialization

Beginning an operation

Main operation code

Version

This specification describes Strobe version 1.0.2. It has been revised for clarity since that version was released, and to add a usage hint for meta_RATCHET. Special thanks to David Wong for his feedback on the specification.

Definitions and notation

Strobe is useful for building both symmetric cryptosystems and protocols. For simplicity, these are both referred to as "protocols". For example, in an encryption protocol, Alice sends an encrypted and authenticated message to Bob, but Bob does not send any messages back.

Strobe protocols exchange data over some sort of transport. For most protocols

To disambiguate the two parties of a network protocol, Strobe assigns them each a role as an initiator or responder.

Parties start out as undecided, and become an initiator or responder if and when they send or receive a message via the transport.

Strobe operates exclusively on bytes, which are elements of the set [0,...,255]. However, sometimes a length field is required which is larger than one byte. Furthermore, in extensions of Strobe, such as the key tree, we will need elements smaller than one byte. In either case, Strobe follows a little-endian convention.

Formatting

Variables are formatted like this.

Inline code is formatted like this.

Blocks of code are formatted like this.

Non-normative comments are formatted like this.

Non-normative warnings are formatted like this.

Strobe Instances

There are several paremeters in the Strobe framework. It can use many different rates, permutations and padding modes. These are detailed in the paper. This specification only describes the following recommended instances:

Strobe-128/1600 and Strobe-256/1600 are based on the cSHAKE specification from NIST.

Strobe parameters

Let b be either 400, 800 or 1600. Let F be the function Keccak-f[b]. Let N = b/8. Strobe treats F as a function which takes as input an array of N bytes and returns another array of N bytes.

Let sec be a target security level, either 128 or 256 bits. Strobe has 2*sec bits of secret state. As a result, it is somewhat stronger than a block cipher with sec-bit key. Depending on the exact protocol, it may achieve security comparable to a 2*sec-bit block cipher (as in this paper). In other scenarios — particularly when used as an unkeyed hash — it only has sec bits of security. A future version might add a way to change the rate after a key has been entered.

Let R = N - (2*sec)/8 - 2. This will be the number of bytes in a Strobe block. It is required that 1 ≤ R < 254. This means that b = 400 is incompatible with sec = 256. Obviously R ≥ 1 is required so that Strobe can get any work done at all. The requirement that R ≤ 254 is so that offsets can be represented in one byte. In the existing protocol, this can result in offsets up to R, which would imply that R ≤ 255. The requirement R ≤ 254 is a hedge to allow future modifications. In any case, it is automatically true for any Keccak-based protocol, because R < 1600/8 = 200.

Let X, Y and Z be single digits, representing a major, minor and patch version of Strobe.

The parameters below define the protocol framework called "Strobe-Keccak-sec/b-v1.0.2", which is abbreviated to "Strobe-sec/b". In general, the name would be "Strobe-F-sec/b-vX.Y.Z".

State of a Strobe object

A Strobe object has the following state variables:

Initial state

The initial state of the object is as follows:

Following this initialization, the first operation must be a meta-AD operation whose data is a protocol-specific string. This separates Strobe instances from each other. This operation is also known as customization, personalization, domain separation or diversification.

Strobe operations

A Strobe protocol execution is composed of a sequence of operations. There may be any number of operations in any order. Which operations are performed may be determined at runtime. Each operation can process an arbitrary amount of data, but always in multiples of one byte. The data is processed in a streaming fashion, one byte at a time. The protocol can determine the amount of data to process on the fly, even after an operation has begun. In python notation, the sequence

ctx.send_ENC("A long");
ctx.send_ENC(" message",more=True);

will do exactly the same thing as

ctx.send_ENC("A long message");

but not the same thing as

ctx.send_ENC("A long");
ctx.send_ENC(" message");

This last piece of code performs two separate ENC operations, whereas the others each only do one operation.

Strobe's operations are low-level, and a protocol will be a combination of many of them. For example, an AEAD system might be comprised of a key (KEY), an associated datum (AD), an encrypted message (ENC) and a message authentication code (MAC). It might use further operations for framing, context and metadata. This specification first describes the operations, and then recommends how to combine them.

Strobe is based on Keccak-f. Keccak-f is somewhat slow in software, and it can have a rather wide block size — up to 168 bytes of rate. Strobe operations may be rather small. In particular, they are used for framing data, which is generally only a few bytes. To reduce overhead, Strobe packs multiple operations into one block when possible. It is not always possible, though. In particular, operations such as ENC and MAC must produce output that depend on all previous inputs. For such operations, Strobe runs F and begins a new block.

The paper specifies a variant, called "Strobe lite", which always begins a new block for every new operation. This is designed for lightweight hardware, where F is probably much smaller and there is little gain from packing multiple operations into one block.

Low-level operations

Strobe supports 7 low-level operations. Of these, the 3 which use the transport exist in send and recv directions.

If Alice sends a message to Bob, Alice will use a send operation (such as send_ENC) and Bob will use the corresponding recv operation (recv_ENC). Strobe takes this into account when hashing the operations, so that Alice's duplex state will match Bob's when they execute corresponding operations.

For any operation, there is a corresponding "meta" variant. The meta operation works exactly the same way as the ordinary operation. The two are distinguished only in an "M" bit that is hashed into the protocol transcript for the meta operations. This is used to prevent ambiguity in protocol transcripts. This specification describes uses for certain meta operations. The others are still legal, but their use is not recommended. The usage of the meta flag is described below in Section 6.3.

The operations are:

This section describes the rough behavior of these operations, not how they are implemented. The exact behavior and implementation are described in Section 7. Each operation also lists the flags which describe it. Their meaning is described in Section 6.2.

AD: Provide associated data

The AD operation adds associated data to the state. This data must be known to both parties, and will not be transmitted. Future outputs from the Strobe object will depend on the supplied data.

KEY: Provide cipher key

The KEY operation sets a symmetric key. If there is already a key, the new key will be cryptographically combined with it. This key will be used to produce all future cryptographic outputs from the Strobe object.

Every crypto operation in Strobe depends on a running hash of all data which has been entered into it. As a result, the KEY operation has very similar semantics to AD. Both of them absorb data without transmitting it, and affect all future operations. That said, KEY differs from AD in three ways:

CLR: Send or receive cleartext data

send_CLR sends a message in clear text. recv_CLR receives a message in clear text.

The recv_CLR and recv_meta_CLR operations don't verify the integrity of the incoming message. For this, follow send_CLR with send_MAC on the sending side, and follow recv_CLR with recv_MAC on the receiving side.

In some scenarios, one cannot check a MAC for protocol or performance reasons. For example, until the two parties share a secret key, a MAC defends only against accidental corruption and not against malicious modification.

Strobe explicitly supports MACs on framing data. However, to save bandwidth, most protocols do not MAC their framing data until they have sent/received the framed data as well. This means that extra care must be used with framing data, especially lengths.

ENC: Send or receive encrypted data

send_ENC encrypts a message and sends it to the transport. recv_ENC receives a message from the transport and decrypts it. The encryption does not include authentication. The length of the encrypted message is the same as the length of the plaintext message.

Strobe's encryption mode keeps the message confidential (except for its length) so long as the session transcript up to this point is secret and, for the sender, unique.

The recv_ENC and recv_meta_ENC operations don't verify the integrity of the incoming message. For this, use send_MAC after send_ENC on the sending side, and recv_MAC after recv_ENC on the receiving side. The receiving side must run recv_MAC before using the decrypted message.

Newcomers to cryptography often assume that an attacker can't modify encrypted messages without turning them into gibberish. This isn't true for most cipher modes, and it isn't true for Strobe. The MAC operation really is necessary.

Strobe's encryption mode requires unique nonces for security. In other words, the operations before the ENC operation must be unique. In a two-party protocol, make sure that a party has contributed a unique value (such as a random nonce or a Diffie-Hellman ephemeral) to the protocol before using send_ENC. It is not necessary to contribute a unique value before using recv_ENC, except to defend against physical side channels such as DPA.

Protocols are allowed to use ENC before any KEY has been entered. This usually isn't secure as an encryption mode. For technical reasons, the recommended Schnorr signature mode does this whether or not a key has been entered.

MAC: Send or receive message authentication code

send_MAC computes and sends a message authentication code (MAC). recv_MAC receives and checks a MAC. If the MAC doesn't match, the receiving party aborts the protocol.

A MAC operation is ineffective if it is too short. That is, an adversary can just guess a B-byte MAC value with probability 2-8B. A 16-byte or longer MACs is suitable for protocols on powerful devices, and 8-byte or longer MAC is suitable for constrained devices. Very constrained environments might use even shorter MACs, but be aware that a short MAC may be practical to guess.

When retrofitting security on top of legacy embedded protocols, designers might not have enough bits to send a secure MAC. It is possible to send a shorter MAC in each packet, with the hope that an attack will be detected within a few packets. The Strobe framework allows this, but of course it is very dangerous. Don't try it without an expert analysis.

Checking a MAC must be done in constant time to prevent side-channel attacks. This is particularly salient when checking a MAC that is streaming in one byte at a time. It is very important not to indicate an error until the MAC operation is actually complete. Otherwise an attacker will learn information about the correct MAC. If the attacker can make several attempts, this side channel will lead to an easy compromise.

Data isn't automatically trustworthy just because it has been MAC'd or signed. At most, the MAC will indicate who sent the data. But that other party might still be evil, or might have been compromised.

Strobe's operations all support byte-by-bytes streaming, and do not require that the length be known in advance. The MAC operation is no exception. That means in particular that truncating a MAC produces a valid, shorter MAC. This problem can and should be avoided by using a composite operation which includes the MAC's length in the metadata, so that the MAC will depend on its length.

However, this note doesn't rise to the level of a warning. Most protocols will use fixed-length MACs, but some will not. Suppose the two parties somehow disagree on how long their MACs are. Suppose that Bob thinks the MACs are 8 bytes, but Alice thinks they're 10 bytes. Then the first message that Alice sends to Bob will verify with a truncated MAC, but any message that either sends later in the protocol will fail, because of the different transcript before that point. This is unlikely to cause a huge security problem.

The reverse direction is also problematic: if Alice thinks the MACs are 8 bytes, but Bob thinks they're 10 bytes, then an attacker can extend Alice's MAC to one that Bob will accept with probability 2^-16. Again, any future MAC will fail with high probability.

Strobe's MAC operations don't require session uniqueness. Repeating a nonce will compromise send_ENC, but (unlike AES-GCM and Poly1305) it will not compromise send_MAC or recv_MAC.

This only applies to attackers trying to find MACs on new data. If an attacker replays the same MAC after the same data in an identical session, then of course the MAC will verify.

PRF: Extract hash / pseudorandom data

PRF extracts pseudorandom data which is a deterministic function of the state. This data can be treated as a hash of all preceeding operations, messages and keys. For example, the following code hashes a single block of data:

ctx = new strobe(proto="example hash")
ctx.AD("message to be hashed")
the_hash = ctx.PRF(output_length)

One can also hash multiple blocks of data, for example:

ctx = new strobe(proto="example hash")
ctx.AD("message to")
ctx.AD(" be")
ctx.AD(" hashed")
the_hash = ctx.PRF(output_length)

which produces a different result from the previous example, because Strobe hashes where the operations begin and end in addition to their data.

Just as with a MAC, the PRF operation supports streaming, and a shorter PRF call will return a prefix of a longer one. Use a composite operation that inputs the length of the PRF first.

RATCHET: Prevent rollback

RATCHET has no input other than a length L, and no output. Instead, it modifies the state in an irreversible way.

Strobe's F is Keccak-f, which is a permutation. If an attacker compromises the state at some point in the protocol (e.g. with a side channel attack or an application exploit), then the attacker can use F-1 to solve for states at earlier points in the protocol.

The RATCHET operation prevents this by zeroizing L bytes of the state. Zeroizing the whole state would destroy the key, so RATCHET only zeroizes up to R bytes at a time, calling F in between. In order to reverse the RATCHET operation, the attacker would have to guess what the bytes were before zeroizing. Setting L = sec/8 bytes is sufficient when Rsec/8. That is, set L to 16 bytes or 32 bytes for Strobe-128/b and Strobe-256/b, respectively.

If L is set to a very large number, then this operation wastes time by calling F many times in a row. This makes it suitable for a password-based key derivation function (PBKDF), with security comparable to PBKDF2.

Modern PBKDFs such as scrypt and Argon2 are designed to be memory-hard, meaning that they require a large memory with high bandwidth to perform efficiently. Memory hardness makes it harder to guess passwords by brute force using GPUs or ASICs. Using RATCHET as a PBKDF isn't memory-hard. But of course, a tiny device with constrained memory can't efficiently compute a memory-hard function, so this may be the best you can do.

For all the recommended Strobe variants, Rsec/8. In non-recommended variants with R < sec/8, this operation is ineffective at preventing rollback because it only erases R bytes at a time. Instead, extract L bytes using PRF, and then stir them back in with KEY. In this scenario, RATCHET is still suitable to waste time, but the PRF/KEY ratcheting operation must be used afterward to prevent rollback.

Note that length=0 will not erase any bytes and will not prevent rollback: it only aligns to the beginning of the block. On the other hand, this usage is faster and doesn't require knowledge of R, which is otherwise abstracted from the designer.

Setting length=R prevents rollback, and allows memory-constrained implementations to cache the state before the call to F, which takes more time but less memory.

Operations and flags

The behavior of each of Strobe's operations is defined completely by 6 features, called flags. The operation is encoded as one byte, where the least significant 6 bits are its flags. The flags and their encodings are as follows:

Flags for the defined operations

Each operation type is uniquely described by a combination of the above flags:

OperationFlags
AD A
KEY AC
PRF IAC
send_CLR AT
recv_CLR IAT
send_ENC ACT
recv_ENC IACT
send_MAC CT
recv_MAC ICT
RATCHET C

The meta variants of the operations are the same as the non-meta variants, but they additionally have the M flag set.

The other 6 combinations of the IACT flags are legal and their behavior is well-defined, but they aren't as useful.

Composite operations

Composite operations enable protocols to comply with the Horton principle: "authenticate what is being meant, not what is being said". They do this by adding metadata to the operation which describes its meaning. How to encode that meaning is up to the protocol, so the metadata may be in any number of meta operations of any lengths. The most straightforward encoding would be a single-byte tag describing the meaning of the data, and a fixed number of bytes describing the length of the data in some particular endianness.

The metadata is processed before the data. This means that PRF outputs, encrypted data and MACs will depend on the metadata as well as on previous operations.

In a complex protocol, it might be possible to send more than one kind of message at a given point. It is safe for the recipient to read the framing data — either all at once or byte-by-byte — to determine which kind of message to receive. However, the recipient must only accept a message if it is appropriate at the current stage of the protocol.

A composite operation is a sequence of two or more operations, exactly one of which is designated as the payload. The sequence consists of:

Once a key has been provided, it is recommended for at least the last operation of every flow to be a MAC operation. In some circumstances, it is also important to provide a MAC before the end of the flow. Consult a cryptographer if you are unsure.

It is usually preferable to use fixed-length MACs, instead of putting their lengths in framing data. There are three principal exceptions:
In any case, protocols must not accept MACs below a fixed minimum length, which should not be less than 8 bytes.

A composite operation consists of one or more meta operations, followed by exactly one non-meta operation. It is recommended not to use meta operations except to implement these composite operations. That way, the protocol transcript can be parsed uniquely into a sequence of (composite and possibly non-composite) operations.

In a complex protocol, it might be possible to send more than one kind of message at a given point. It is safe for the recipient to read the framing data — either all at once or byte-by-byte — to determine which kind of message to receive. However, the recipient must only accept a message if it is appropriate at the current stage of the protocol.

Implementation of operations

This section describes formally what each operation does and how it is implemented. The implementation is described by Python code. [TODO: and math?]

Duplexing and running F

Strobe is a duplex construction, so its core routine _duplex() is somewhat unsurprising. This routine takes input data and three binary arguments: cbefore, cafter and forceF.

If cafter is set, then the data is modified by the duplex construction after absorbing it. This is used for encryption. If cbefore is set, the data is modified by the duplex construction before absorbing it. This is used for decryption. There is no case where both cbefore and cafter are set.

Certain operations need to run F afterward, in order to begin a new block. For brevity, they do this by passing a forceF argument to _duplex().

def _duplex(self, data, cbefore=False, cafter=False, forceF=False):
    # This is an internal function.  It's not part of STROBE's API.
    assert not (cbefore and cafter)
    
    # Copy data, and convert string or int to bytearray
    # This converts an integer n to an array of n zeros
    data = bytearray(data)

    for i in range(len(data)):
        if cbefore: data[i] ^= self.st[self.pos]
        self.st[self.pos]   ^= data[i]
        if cafter:  data[i]  = self.st[self.pos]
        
        self.pos += 1
        if self.pos == self.R: self._runF()
    
    if forceF and self.pos != 0: self._runF()
    return data

When the rate is exceeded, or when beginning an operation that uses the cipher state, Strobe runs the sponge function F on the state. This begins a new block. Because posbegin records the beginning of an operation within the block, Strobe absorbs it and resets it when beginning a new block:

def _runF(self):
    # This is an internal function.  It's not part of STROBE's API.
    if self.initialized:
        self.st[self.pos]   ^= self.posbegin
        self.st[self.pos+1] ^= 0x04
        self.st[self.R+1]   ^= 0x80
    self.st  = self.F(self.st)
    self.pos = self.posbegin = 0

The 0x04 and 0x80 pads are cSHAKE's output padding mode. For simplicity, this is applied on every block whether it has output or not. [posbegin] is Strobe's padding mode on top of cSHAKE's padding.

Strobe uses cSHAKE's domain separation to ensure that it is distinct from all other uses of Keccak. It then uses its own internal separation — a meta_AD operation — to ensure that every Strobe-based protocol is distinct. This first separation uses cSHAKE's specified padding mode instead of Strobe's padding. This is why _runF checks self.initialized.

Initialization

Strobe's initial state is set using cSHAKE's domain separation. The cSHAKE domain separation string is S = "STROBEv1.0.2". The NIST separation string is N = "", because Strobe wasn't designed by NIST.

def __init__(self, proto, F = KeccakF(1600), sec = 128):
    self.pos = self.posbegin = 0
    self.I0 = None
    self.F  = F
    self.R  = F.nbytes - sec//4
    self.cur_flags = None
    
    # Domain separation doesn't use Strobe padding
    self.initialized = False
    self.st = bytearray(F.nbytes)
    domain = bytearray([1,self.R,1,0,1,12*8]) \
           + bytearray("STROBEv1.0.2")
    self._duplex(domain, forceF=True)
    
    # cSHAKE separation is done.
    # Turn on Strobe padding and do per-proto separation
    self.R -= 2
    self.initialized = True
    self.operate(A|M, proto)

Beginning an operation

Beginning an operation absorbs the old posbegin state variable, and then sets it to where this operation began. It then absorbs the operation, adjusted so that the sender and receiver will compute the same value.

def _beginOp(self, flags):
    # This is an internal function.  It's not part of STROBE's API.
    # Adjust direction information so that sender and receiver agree
    if flags & T:
        if self.I0 is None: self.I0 = flags & I
        flags ^= self.I0

    # Update posbegin
    oldbegin, self.posbegin = self.posbegin, self.pos+1
    
    self._duplex([oldbegin,flags], forceF = flags&(C|K))

Forcing F when flags & (C|K) is nonzero ensures that any input — including the operation's flags — will affect the output of C-flagged operations like ENC, MAC and PRF. It also ensures that the rekeying operations KEY and RATCHET begin at the start of a block. This is important to prevent rollback attacks: if a KEY or RATCHET were split across blocks, then an adversary who recovered a later state could split the work of brute-forcing the earlier state into separate attacks on each block.

Main operation code

To perform an operation, Strobe first runs the beginning-of-operation code. It then duplexes the data with cbefore and cafter set according to the operation's flags. Finally, it decides what to do with the output: return it, ignore it, or check it as a MAC value.

The implementation below supports a more flag for streaming purposes. Setting more processes the given data as a continuation of the previous operation, instead of as a new operation.

def operate(self, flags, data, more=False):
    assert not (flags & (K|1<<6|1<<7)) # Not implemented here
    if more:
        assert flags == self.cur_flags
    else:
        self._beginOp(flags)
        self.cur_flags = flags
    
    if (flags & (I|T) != (I|T)) and (flags & (I|A) != A):
        # Operation takes no input, only a length
        assert isinstance(data,int)
        # This is because in Python, bytearray(n) returns an array
        # of n zeros.
        #
        # In other languages, you might make a function which
        # takes a length and a nullable pointer, or which is
        # overloaded to take either an integer or a byte array.

    # The actual processing code is just duplex
    cafter    = (flags & (C|I|T)) == (C|T)
    cbefore   = (flags & C) and not cafter
    processed = self._duplex(data, cbefore, cafter)
    
    # Determine what to do with the output.
    if (flags & (I|A)) == (I|A):
        # Return data for the application
        return processed 
        
    elif (flags & (I|T)) == T:
        # Return data for the transport.
        # A fancier implementation might send it directly.
        return processed 
        
    elif (flags & (I|A|T)) == (I|T):
        # Check MAC: all output bytes must be 0
        assert not more # See the side-channel warning
        failures = 0
        for byte in processed: failures |= byte
        if failures != 0: raise AuthenticationFailed()
        return bytearray()

    else:
        # Operation has no output
        return bytearray()

Each of the defined operations from Section 6 is implemented as a call to operate. Note that some operations take an array of data whereas some just take a length. That these two types are handled the same way is an idiom of Python. Other languages would take length as an integer and data as an array of bytes.

def AD(self, data, more=False):
    self.operate(A, data, more)
    
def KEY(self, data, more=False):
    self.operate(A|C, data, more)
    
def send_CLR(self, data, more=False):
    return self.operate(A|T, data, more)
    
def recv_CLR(self, data, more=False):
    return self.operate(I|A|T, data, more)
    
def send_ENC(self, data, more=False):
    return self.operate(A|C|T, data, more)
    
def recv_ENC(self, data, more=False):
    return self.operate(I|A|C|T, data, more)
    
def send_MAC(self, length, more=False):
    return self.operate(C|T, length, more)
    
def recv_MAC(self, data, more=False):
    self.operate(I|C|T, data, more)
    
def PRF(self, length, more=False):
    return self.operate(I|A|C, length, more)
    
def RATCHET(self, length, more=False):
    self.operate(C, length, more)