Ulf Moeller
Lance Cottrell
Anonymizer Inc.

Mixmaster Protocol Version 3

Abstract

Most e-mail security protocols only protect the message body, leaving useful information such as the identities of the conversing parties, the sizes of messages and the frequency of message exchange open to adversaries. This document describes Mixmaster, a mail transfer protocol designed to protect electronic mail against traffic analysis.

Mixmaster is based on D. Chaum's mix-net protocol. A mix (remailer) is a service that forwards messages, using public key cryptography to hide the correlation between its inputs and outputs. Sending messages through sequences of remailers achieves anonymity and unobservability of communications against a powerful adversary.

Table of Contents

   1. Introduction
   2. The Mix-Net Protocol
      2.1 Message Creation
      2.2 Remailing
      2.3 Message Reassembly
   3. Message Format
      3.1 Payload Format
      3.2 Cryptographic Algorithms
      3.3 Packet Format
      3.4 Mail Transport Encoding
   4. Key Format
   5. Delivery of Anonymous Messages
   6. Security Considerations
   7. Acknowledgements
   8. References
   9. Authors' Addresses

1. Introduction

This document describes a mail transfer protocol designed to protect electronic mail against traffic analysis. Most e-mail security protocols only protect the message body, leaving useful information such as the identities of the conversing parties, the sizes of messages and the frequency of message exchange open to adversaries.

Message transmission can be protected against traffic analysis by the mix-net protocol. A mix (remailer) is a service that forwards messages, using public key cryptography to hide the correlation between its inputs and outputs. If a message is sent through a sequence of mixes, a single trusted mix is sufficient to provide anonymity and unobservability of communications against a powerful adversary.

Mixmaster is a mix-net implementation for electronic mail. This memo specifies version 3 of the Mixmaster message format.
Version 2, which has been used on the Internet since 1995, is described in a separate document.

2. The Mix-Net Protocol

The mix-net protocol [Chaum 1981] allows messages to be sent while hiding the relation between sender and recipient from observers (unobservability). It also allows the sender of a message to remain anonymous to the recipient (sender anonymity), and allows the recipient to receive messages while remaining anonymous to the sender (recipient anonymity). If anonymity is not desired, authenticity and unobservability can be achieved at the same time by transmitting digitally signed messages.

This section gives an overview of the protocol used in Mixmaster. The message format is specified in section 3.

2.1 Message Creation

To send a message, the user agent splits it into parts of fixed size, which form the bodies of Mixmaster packets. If sender anonymity is desired, care should be taken not to include identifying information in the message. The message may be compressed.

The sender chooses a sequence of remailers for each packet. The final remailer must be identical for all packets.

The packet header consists of one header section for each remailer, and is padded with random data to a fixed size. For each remailer i, from i = n down to 1, the sender generates a symmetric encryption key, which is used to encrypt the body and all following header sections. This key, together with other control information for the remailer, is included in the i-th header section, which is then encrypted with the remailer's public key.

The resulting message is sent to the first remailer in an appropriate transport encoding.

To increase reliability, redundant copies of the message may be sent through different paths. The final remailer must be identical for all paths, so that duplicates can be detected and the message is delivered only once.

2.2 Remailing

When a remailer receives a message, it decrypts the first header section with its private key.
By keeping track of packet IDs, the remailer verifies that the packet has not been processed before. The integrity of the message is verified by checking the packet length and verifying message digests included in the packet.

Then the first header section is removed, the others are shifted up, and the header is filled with pseudo-random padding according to a random seed contained in the message. All header sections and the message body are decrypted with the symmetric key found in the header. This reveals a public key-encrypted header section for the next remailer. Transport encoding is applied to the resulting message.

The remailer collects several encrypted messages before sending the resulting messages in random order. Thus the relation between the incoming and outgoing messages is obscured from outside adversaries, even if the adversary can observe all messages sent.

The message is effectively anonymized by sending it through a chain of independently operated remailers.

2.3 Message Reassembly

When a packet is sent to the final remailer, it contains an indication that the chain ends at that remailer, and whether the packet contains a complete message or part of a multi-part message. If the packet contains the entire message, the packet body is decrypted and, after reordering messages, the plain text is delivered to the recipient.

For partial messages, a message ID is used to identify the other parts as they arrive. When all parts have arrived, the message is reassembled, decompressed if necessary, and delivered. If the parts do not arrive within a time limit, the message is discarded.

Only the last remailer in the chain can determine whether packets are part of a certain message. To all the others, they are completely independent.

3. Message Format

3.1 Payload Format

The Mixmaster message payload can be an e-mail message, a Usenet message or a dummy message. The first line of the payload section contains an ASCII string that specifies the message type.
The following type strings are defined:

   mail      e-mail message
   usenet    Usenet message
   zlib      zlib-compressed message
   null      dummy message

The remainder of e-mail and Usenet messages uses the format specified in [RFC 822] and [RFC 1036] respectively. Remailer operators can choose to remove header fields supplied by the sender and insert additional header fields, according to local policy (see section 5).

A compressed message contains another Mixmaster payload section. For the "zlib" message type, the data are compressed using ZLIB [RFC 1950].

A dummy message consists of the type string only.

3.2 Cryptographic Algorithms

The asymmetric encryption operation in Mixmaster version 3 is ElGamal [ElGamal 19xx] with OAEP using MGF1 [RFC 2437], with a key size of at least 1024 bits.

The symmetric encryption uses EDE 3DES with cipher block chaining (24 byte key, 8 byte initialization vector) [Schneier 1996].

SHA-1 [FIPS 1995] is used as the message digest algorithm.

3.3 Packet Format

Mixmaster packets have a fixed size of 20480 bytes in order to ensure that packets are indistinguishable to outside observers. They start with an asymmetrically encrypted session key and a random initialization vector. The remainder of the message is symmetrically encrypted:

   Length of asymmetrically encrypted data   [       1 byte  ]
   Asymmetrically encrypted session key      [ key-dependent ]
   Initialization vector                     [       8 bytes ]
   Symmetrically encrypted data              [fill to 20 kiB ]

The symmetrically encrypted part of the packet has the following format:

   Message digest                            [      16 bytes ]
   Packet ID                                 [      16 bytes ]
   Timestamp                                 [       2 bytes ]
   Packet type identifier                    [       1 byte  ]

The possible packet type identifiers are:

   Intermediate hop               0
   Final hop                      1
   Final hop, partial message     2

The remainder of the packet depends on the packet type identifier, as follows:

Packet type 0 (intermediate hop):

   Padding seed value                        [      16 bytes ]
   Remailer address size                     [       2 bytes ]
   Remailer address                          [      as spec. ]
   Payload data

Packet type 1 (final hop):

   Payload size                              [       2 bytes ]
   Payload data

Packet type 2 (final hop, partial message):

   Message ID                                [      16 bytes ]
   Sequence number                           [       1 byte  ]
   In the 1st part: Payload size             [       4 bytes ]
   Payload data

The fields have the following meaning:

Message digest: SHA-1 digest computed over all following bytes of the Mixmaster packet.

Packet ID: randomly generated packet identifier.

Timestamp: the number of days since Jan 1, 1970, given in little-endian byte order. A random number of up to 3 may be subtracted from the number of days in order to obscure the origin of the message.

Padding seed value: XXX

Remailer address: e-mail address of the next hop.

Payload size: size of the user data in bytes, given in little-endian byte order. For multi-part messages, the size is given in the first part.

Message ID: randomly generated identifier common to all parts of this message.

Sequence number: enumerates the parts of multi-part messages, starting with 1.

Payload data: the message payload (section 3.1) is split into parts of 10240 bytes. A message may consist of up to 255 parts.

3.4 Mail Transport Encoding

Mixmaster packets are sent as text messages [RFC 822]. The RFC 822 message body has the following format:

   ::
   Remailer-Type: Mixmaster [version number]
   -----BEGIN REMAILER MESSAGE-----
   [packet length ]
   [message digest]
   [encoded packet]
   -----END REMAILER MESSAGE-----

The length field always contains the decimal number "20480", since the size of Mixmaster packets is constant. An MD5 message digest [RFC 1321] of the (un-encoded) packet is encoded as a hexadecimal string.

The packet itself is encoded in base 64 encoding [RFC 1421], broken into lines of 40 characters (except for the last line, which is shorter).

4. Key Format

Remailer public key files consist of a list of attributes and a public RSA key:

   [attributes list]
   -----Begin Mix Key-----
   [key ID]
   [length]
   [encoded key]
   -----End Mix Key-----

The attributes are listed in one line, separated by spaces:

   identifier:    a human readable string identifying the remailer
   address:       the remailer's Internet mail address
   key ID:        public key ID
   version:       the Mixmaster software version number
   capabilities:  flags indicating additional remailer capabilities

The identifier consists of alphanumeric characters, beginning with an alphabetic character. It must not contain whitespace.

The encoded key packet consists of two bytes specifying the key length (1024 bits) in little-endian byte order, followed by the RSA modulus and the public exponent in big-endian form using 128 bytes each, with leading null bytes for the exponent if necessary. The packet is encoded in base 64 [RFC 1421], and broken into lines of 40 characters each (except for the last line, which is shorter). Its length (258 bytes) is given as a decimal number.

The key ID is the MD5 message digest of the representation of the RSA public key (not including the length bytes). It is encoded as a hexadecimal string.

The capabilities field is optional. It is a list of flags represented by a string of ASCII characters. Clients should ignore unknown flags. The following flags are used in version 2.0.4:

   C     accepts compressed messages.
   M     will forward messages to another mix when used as final hop.
   Nm    supports posting to Usenet through a mail-to-news gateway.
   Np    supports direct posting to Usenet.

Digital signatures [RFC 2440] should be used to ensure the authenticity of the key files.

5. Delivery of Anonymous Messages

When anonymous messages are forwarded to third parties, remailer operators should be aware that senders might try to supply header fields that indicate a false identity, or to send unauthorized Usenet control messages [RFC 1036]. The latter is a problem because many news servers accept control messages automatically without any authentication.

For these reasons, remailer software should allow the operator to disable certain types of message headers, and to insert headers automatically.

Remailers usually add a "From:" field containing an address controlled by the remailer operator to anonymous messages. Using the word "Anonymous" in the name field allows recipients to apply scoring mechanisms and filters to anonymous messages. Appropriate additional information about the origin of the message can be inserted in the "Comments:" header field of the anonymous messages.

If the recipient does not wish to receive anonymous messages, unobservability of communications and authenticity can be achieved at the same time by the remailer verifying that the message is cryptographically signed [RFC 2440] by a known sender.

Anonymous remailers are sometimes used to send harassing e-mail. To prevent this abuse, remailer software should allow operators to block destination addresses on request.

Real-life abuse and attacks on anonymous remailers are discussed in [Mazieres 1998].

6. Security Considerations

The security of the mix-net relies on the assumption that the underlying cryptographic primitives are secure. In addition, specific attacks on the mix-net need to be considered ([Möller 1998] contains a more detailed analysis of these attacks).

Passive adversaries can observe some or all of the messages sent to mixes. The users' anonymity comes from the fact that a large number of messages are collected and sent in random order. For that reason, remailers should collect as many messages as possible while keeping the delay acceptable.
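
As an illustration of the collect-and-reorder behaviour described above, the following Python sketch holds messages until a threshold is reached and then releases them in random order. The class name ThresholdMix, the method receive and the parameter batch_size are hypothetical and are not part of the Mixmaster specification; it is a minimal sketch of a simple threshold batch, and real remailers may use a different strategy, such as the reordering pool discussed later in this section.

```python
import secrets


class ThresholdMix:
    """Hypothetical batching mix: holds messages until a batch is full."""

    def __init__(self, batch_size):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.batch_size = batch_size
        self._pending = []
        # OS-backed randomness, so the output order is hard to predict.
        self._rng = secrets.SystemRandom()

    def receive(self, message):
        """Queue one message; return a shuffled batch when full, else []."""
        self._pending.append(message)
        if len(self._pending) < self.batch_size:
            return []
        # Hand over the full batch and start collecting a new one.
        batch, self._pending = self._pending, []
        self._rng.shuffle(batch)
        return batch
```

A larger batch_size hides each message among more others, at the cost of a longer delay before the batch fills; this is the trade-off between anonymity and acceptable delay mentioned above.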

Statistical traffic analysis is possible even if single messages are anonymized in a perfectly secure way: an eavesdropper may correlate the times at which Mixmaster packets are sent and anonymized messages are received. This is a powerful attack if several anonymous messages can be linked together (by their contents or because they are sent under a pseudonym). To protect themselves, senders must mail Mixmaster packets in a way that is stochastically independent of the actual messages they want to send. This can be done by sending packets at regular intervals, using a dummy message whenever appropriate. To avoid leaking information, the intervals should not be smaller than the randomness in the delay caused by trusted remailers.

There is no anonymity if all remailers in a given chain collude with the adversary, or if they are compromised during the lifetime of their keys. Using a longer chain increases the assurance that the user's privacy will be preserved, but at the same time causes lower reliability and higher latency. Sending redundant copies of a message increases reliability but may also facilitate attacks. An optimum must be found according to the individual security needs and trust in the remailers.

Active adversaries can also create, suppress or modify messages. Remailers must check the packet IDs to prevent replay attacks. Message integrity must be verified to prevent the adversary from performing chosen ciphertext attacks or replay attacks with modified packet IDs, and from encoding information in an intercepted message in a way not affected by decryption (e.g. by modifying the message length or inducing errors). Chosen ciphertext attacks and replay attacks are detected by verifying the message digest included in the header section.

An adversary can trace a message if they know the decryption of all other messages that pass through the remailer at the same time.
To make it less practical for an attacker to flood a mix with known messages, remailers can store received messages in a reordering pool that grows in size while more messages than average are received, and periodically choose at random a fixed fraction of the messages in the pool for processing. There is no complete protection against flooding attacks in an open system, but if the number of messages required is high, an attack is less likely to go unnoticed.

If the adversary suppresses all Mixmaster messages from one particular sender and observes that anonymous messages of a certain kind are discontinued at the same time, that sender's anonymity is compromised with high probability. There is no practical cryptographic protection against this attack in large-scale networks.

The effect of a more powerful attack that combines suppressing messages and re-injecting them at a later time is reduced by using timestamps.

The lack of accountability that comes with anonymity may have implications for the security of a network. For example, many news servers accept control messages automatically without any cryptographic authentication. Possible countermeasures are discussed in section 5.

7. Acknowledgements

Several people contributed ideas and source code to the Mixmaster v2 software. "Antonomasia", Adam Back and Bodo Möller suggested improvements to the protocol description.

8. References

[Chaum 1981] Chaum, D., "Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms", Communications of the ACM 24 (1981) 2.

[ElGamal 19xx] ElGamal, T., ...

[FIPS 1995] US Department of Commerce / N.I.S.T., "Secure Hash Standard", Federal Information Processing Standards Publication 180-1, 1995.

[Mazieres 1998] Mazières, D., and Kaashoek, F., "The Design, Implementation and Operation of an Email Pseudonym Server", 5th ACM Conference on Computer and Communications Security, 1998.

[Möller 1998] Möller, U., "Anonymisierung von Internet-Diensten", Studienarbeit, University of Hamburg, January 1998.

[RFC 822] Crocker, D., "Standard for the Format of ARPA Internet Text Messages", STD 11, RFC 822, August 1982.

[RFC 1036] Horton, M., and Adams, R., "Standard for Interchange of USENET Messages", RFC 1036, December 1987.

[RFC 1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April 1992.

[RFC 1421] Linn, J., "Privacy Enhancement for Internet Electronic Mail: Part I -- Message Encryption and Authentication Procedures", RFC 1421, February 1993.

[RFC 1950]

[RFC 2437] Kaliski, B., and Staddon, J., "PKCS #1: RSA Cryptography Specifications, Version 2.0", RFC 2437, October 1998.

[RFC 2440] Callas, J., Donnerhacke, L., Finney, H., and Thayer, R., "OpenPGP Message Format", RFC 2440, November 1998.

[Schneier 1996] Schneier, B., "Applied Cryptography", 2nd Edition, Wiley, 1996.

9. Authors' Addresses

Ulf Moeller
E-Mail: ulf@fitug.de

Lance M. Cottrell
President, Anonymizer Inc.
8415 La Mesa Blvd., Suite 3
La Mesa, CA 91941
USA
E-Mail: loki@infonex.com