Piggybacked DTLS Handshakes in SDP

RTFM, Inc.
ekr@rtfm.com

General
RTCWEB WG
Internet-Draft

This document describes a mechanism for embedding DTLS handshake
messages in SDP descriptions. This technique allows implementations
to shave a full round-trip off of DTLS-SRTP session establishment,
while retaining compatibility with ordinary DTLS-SRTP endpoints.

DTLS-SRTP uses a DTLS handshake to establish keys which
are then used to key SRTP. The DTLS
negotiation is tied to the offer/answer
transaction via an “a=fingerprint” attribute
in the SDP. The
common message flow is shown below for DTLS 1.2.

This figure and the rest of this document adopt the following
assumptions about network behavior:

   o  ICE is in use, but both endpoints implement endpoint-independent
      filtering, so that STUN checks succeed immediately.

   o  Signaling messages take the same time to be delivered
      as direct messages [this is generally false.]

Links to detailed diagrams with a more accurate vertical scale can
be found below each diagram.

[Figure omitted: common DTLS-SRTP message flow for DTLS 1.2; link to detailed diagram: “Better picture”]

In this flow, the earliest that Alice can start sending media is
after receiving Bob’s Finished and the earliest Bob can start
sending media is upon receiving Alice’s Finished, and neither
side can send any DTLS messages until they have had a successful
STUN check. The result is that in the best case, Alice receives
media four round trips after sending the offer and Bob receives
media three round trips after receiving Alice’s offer.

This document describes a technique for improving call setup time by
piggybacking the first round of DTLS messages on the signaling
messages. This reduces latency by a full round trip for both DTLS 1.2
and DTLS 1.3 handshakes, and for DTLS 1.3
allows the answerer to start sending media
immediately upon receiving the offer, or, if ICE is used, upon ICE
completion.

The basic concept, as shown in the figure below, is for
Alice to send her ClientHello in her Offer and Bob to send
the server’s first flight (ServerHello…ServerHelloDone for DTLS
1.2) in his Answer.

[Figure omitted: DTLS 1.2 message flow with the first flights piggybacked on the Offer and Answer; link to detailed diagram: “Better picture”]

Note that in this flow, the active/passive (DTLS client/server) roles
are reversed and Alice becomes the client. Because this is basically a
symmetrical transaction, this is not an issue.

It should be immediately apparent that this exchange shaves off a full
round trip from Bob’s perspective (despite actually only shaving
half a round trip from the number of messages). The reason is that Bob
does not need to wait for Alice’s Finished to send but can piggyback
his data on his Finished.

This change also shaves off a round trip from Alice’s perspective
because Alice can now safely perform TLS False Start
and send traffic prior to receiving Bob’s
Finished message. When only fingerprints are carried in the handshake,
then extensions such as ALPN indicators and DTLS-SRTP
negotiation are not protected. However, in this case, because those
indicators are carried in the hello messages, which are now tied to the
signaling channel, they are authenticated via the same mechanisms
that authenticate the fingerprint.

Note: One could argue that under some conditions Bob could do
False Start in the ordinary handshake, but it’s much harder to
analyze and even then it leaves Alice one round trip slower than
she would be with this optimization.

The figure below shows the impact of this optimization
on DTLS 1.3.

[Figure omitted: DTLS 1.3 message flow with the first flights piggybacked on the Offer and Answer; link to detailed diagram: “Better picture”]

Alice cannot send any sooner than with DTLS 1.2
because sending at the point when she receives Bob’s first
message is already optimal. It may be possible
for Bob to shave off yet another
round trip, however, as described later in this document.

This document defines a new media-level SDP attribute, “a=dtls-message”.
This attribute is used to carry DTLS messages. The syntax of this attribute
is:

An offerer which wishes to use the optimization defined in this document
shall send its ClientHello in the “a=dtls-message” attribute of its
initial offer with the role “client” and MUST use “a=setup:actpass”. This allows the peer to
either:

   o  Reject the optimization, in which case it ignores the attribute.

   o  Accept the optimization, in which case it MUST use “a=setup:passive”
      and send its first flight (starting with ServerHello) in its
      response, using the role “server”. These messages are simply serialized
      end-to-end as they would be on the wire. It MAY also choose to
      send its first flight separately in the media channel; DTLS implementations
      already handle retransmits properly.

The offerer MUST be able to detect whether an incoming DTLS message
is a ClientHello or a ServerHello and adapt accordingly.

In subsequent negotiations, implementations MUST maintain these
roles.

This optimization has a number of interactions with existing pieces of
protocol machinery.

When ICE is in use, there is a race condition between the answerer’s
ICE checks (at which point it will be able to send the first flight on
the media channel) and the answerer’s Answer, which contains the first
flight. For this reason, we allow implementations to send the first
flight on both channels. However, as a practical matter it is
reasonably likely that when ICE is in use the Answer will arrive
first, for two reasons:

   o  The answerer consumes a full RTT doing a STUN check to verify
      the path to the offerer (even in the best case where the
      first STUN check succeeds). Thus, even if the path through
      the signaling server is twice as expensive as the direct path,
      there is a reasonable chance that the answer will arrive first.

   o  If the offerer is behind a NAT without endpoint-independent
      filtering, the answerer’s ICE checks will be discarded until the
      offerer sends its own ICE checks, which it can only do upon receiving
      the answer.

      In this case, although a comparison of the ordinary and piggybacked
      flows would show the ClientHello (in ordinary DTLS)
      and the ServerHello (when piggybacked) as arriving at the same time,
      in fact the ServerHello may arrive up to a full RTT first. The
      offerer can then send its second flight immediately upon its STUN check
      succeeding, which happens first, thus increasing the advantage of this technique.

This technique does not interact very well with forking. Because each
ClientHello is only usable for one server, the system must somehow ensure
that only one of the forks takes up the piggybacked offers. The
easiest approach is for any intermediary which does a fork to strip
out the “a=dtls-message” attribute. An alternative would be to add
another attribute which could be stripped out (this might interact
better with RTCWEB Identity). Note that SIP Identity protects against
any SDP modifications, but I think at this point it’s clear that that’s
not practical.

RTCWEB Identity assertions need to cover these DTLS messages.

[we need examples.]

The security implications of this technique are described throughout
this document.

This specification defines the “dtls-message” SDP attribute per the
procedures of Section 8.2.4 of the SDP specification. The required information
for the registration is included here:

References:

   o  Framework for Establishing a Secure Real-time Transport Protocol (SRTP) Security Context Using Datagram Transport Layer Security (DTLS)
   o  Datagram Transport Layer Security Version 1.2
   o  The Secure Real-time Transport Protocol (SRTP)
   o  An Offer/Answer Model with Session Description Protocol (SDP)
   o  Connection-Oriented Media Transport over the Transport Layer Security (TLS) Protocol in the Session Description Protocol (SDP)
   o  SDP: Session Description Protocol
   o  Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols
   o  Session Traversal Utilities for NAT (STUN)
   o  The Transport Layer Security (TLS) Protocol Version 1.3
   o  Transport Layer Security (TLS) False Start
   o  Transport Layer Security (TLS) Application-Layer Protocol Negotiation Extension
   o  Enhancements for Authenticated Identity Management in the Session Initiation Protocol (SIP)

WARNING: THE FOLLOWING SECTION HAS NOT RECEIVED ANY REAL SECURITY
REVIEW AND MAY BE A REALLY BAD IDEA.

It has been observed that if Alice uses a fresh DH ephemeral, then Bob knows (because he can
trust the signaling service) that Alice’s DH ephemeral corresponds
to Alice and can therefore encrypt under the joint DH shared
secret without waiting for Alice’s CertificateVerify, as shown
in the figure below.

[Figure omitted: message flow in which Bob sends before receiving Alice’s CertificateVerify; link to detailed diagram: “Better picture”]

This has demonstrably inferior security properties if Alice is
using a long-term key (for key continuity or fingerprint validation), because Bob has
not yet verified that Alice controls that key and does not even
know if Alice is using a fresh DH ephemeral. If implementations
decide to adopt this optimization, they must do something hacky like
sending data immediately but generating an error if the handshake,
including a signature, does not complete within some reasonable
period (a small number of measured round trips) [Just one reason
why this is a questionable technique.].

Thanks to Cullen Jennings, Martin Thomson, and Justin Uberti for helpful
suggestions.
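
Illustrative sketch (not part of the mechanism itself): the requirement stated earlier, that the offerer be able to detect whether an incoming DTLS message is a ClientHello or a ServerHello, could be met by inspecting the first plaintext record. The Python below is a minimal sketch under the assumption that the input is a raw, unencrypted DTLS record as it would appear on the wire (13-byte record header followed by the handshake header). The names classify_first_message and make_test_record are hypothetical and exist only for this example.

```python
import struct

# DTLS plaintext record header: type(1) version(2) epoch(2) seq(6) length(2)
DTLS_RECORD_HEADER_LEN = 13
CONTENT_TYPE_HANDSHAKE = 22
# Handshake msg_type values for the two hello messages
HELLO_NAMES = {1: "ClientHello", 2: "ServerHello"}


def classify_first_message(data: bytes) -> str:
    """Return "ClientHello", "ServerHello", or "other" for the first
    handshake message in a raw DTLS record (hypothetical helper)."""
    if len(data) < DTLS_RECORD_HEADER_LEN + 1:
        raise ValueError("too short to be a DTLS handshake record")
    if data[0] != CONTENT_TYPE_HANDSHAKE:
        raise ValueError("not a handshake record")
    # The record length field occupies bytes 11..12 of the record header.
    (record_length,) = struct.unpack_from("!H", data, 11)
    if record_length < 1 or len(data) < DTLS_RECORD_HEADER_LEN + record_length:
        raise ValueError("truncated record")
    # The first byte after the record header is the handshake msg_type.
    return HELLO_NAMES.get(data[DTLS_RECORD_HEADER_LEN], "other")


def make_test_record(msg_type: int) -> bytes:
    """Build an empty-bodied handshake record for demonstration only."""
    # Handshake header: msg_type(1) length(3) message_seq(2)
    #                   fragment_offset(3) fragment_length(3)
    handshake = bytes([msg_type]) + bytes(3) + bytes(2) + bytes(3) + bytes(3)
    header = (
        bytes([CONTENT_TYPE_HANDSHAKE])
        + b"\xfe\xfd"            # DTLS 1.2 version bytes
        + bytes(2)               # epoch 0
        + bytes(6)               # sequence number 0
        + len(handshake).to_bytes(2, "big")
    )
    return header + handshake
```

An offerer could apply such a check to the first DTLS message it receives, whether it arrives in the Answer or on the media channel, and then act as server (on ClientHello) or continue as client (on ServerHello). How the message bytes are extracted from the “a=dtls-message” attribute is left out here, because the attribute encoding is not given in this text.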