# SRT Handshake

Published: 2018-06-28  
Last updated: 2018-06-28

**Contents**

- [Overview](#overview)
- [Short Introduction to SRT Packet Structure](#short-introduction-to-srt-packet-structure)
- [Handshake Structure](#handshake-structure)
- [The "UDT Legacy" and "SRT Extended" Handshakes](#the-udt-legacy-and-srt-extended-handshakes)
  - [UDT Legacy Handshake](#udt-legacy-handshake)
  - [Initiator and Responder](#initiator-and-responder)
  - [The Request Type Field](#the-request-type-field)
  - [The Type Field](#the-type-field)
- [The Caller-Listener Handshake](#the-caller-listener-handshake)
  - [The Induction Phase](#the-induction-phase)
  - [The Conclusion Phase](#the-conclusion-phase)
- [The Rendezvous Handshake](#the-rendezvous-handshake)
  - [HSv4 Rendezvous Process](#hsv4-rendezvous-process)
  - [HSv5 Rendezvous Process](#hsv5-rendezvous-process)
    - [Serial Handshake Flow](#serial-handshake-flow)
    - [Parallel Handshake Flow](#parallel-handshake-flow)
  - [Rendezvous Between Different Versions](#rendezvous-between-different-versions)
- [The SRT Extended Handshake](#the-srt-extended-handshake)
  - [HSv4 Extended Handshake Process](#hsv4-extended-handshake-process)
  - [HSv5 Extended Handshake Process](#hsv5-extended-handshake-process)
  - [SRT Extension Commands](#srt-extension-commands)
    - [HSREQ and HSRSP](#hsreq-and-hsrsp)
    - [KMREQ and KMRSP](#kmreq-and-kmrsp)
    - [Congestion controller](#congestion-controller)
    - [Stream ID (SID)](#stream-id-sid)


## Overview

SRT is a connection protocol, and as such it embraces the concepts of "connection"
and "session". The UDP system protocol is used by SRT for sending data as well as
special control packets, also referred to as "commands".

An SRT connection is characterized by the fact that it is:

- first engaged by a *handshake* process
- maintained as long as any packets are being exchanged in a timely manner
- considered closed when a party receives the appropriate close command from
its peer (connection closed by the foreign host), or when it receives no
packets at all for some predefined time (connection broken on timeout).

Just like its predecessor UDT, SRT supports two connection configurations:

1. **Caller-Listener**, where one side waits for the other to initiate a connection
2. **Rendezvous**, where both sides attempt to initiate a connection

As SRT development has evolved, two handshaking mechanisms have emerged:

1. the **legacy UDT handshake**, with the "SRT" part of the handshake implemented 
as extended control messages; this is the only mechanism in SRT versions 1.2 and 
lower, and is known as **HSv4** (where the number 4 refers to the last UDT 
version)
2. the new **integrated handshake**, known as **HSv5**, where all the required
information concerning the connection is interchanged completely in the
handshake process

The version compatibility requirements are such that if one side of the
connection only understands *HSv4*, the connection is made according to *HSv4*
rules. Otherwise, if both sides are at SRT version 1.3.0 or greater, *HSv5* is
used. As the new handshake supports several features that might be mandatory
for a particular application, it is also possible to reject an HSv4-to-HSv5
connection by setting the `SRTO_MINVERSION` socket option. The value for this
option is an integer with the version encoded in hex. For example:

    int req_version = 0x00010300; // 1.3.0
	srt_setsockflag(s, SRTO_MINVERSION, &req_version, sizeof(int));

**IMPORTANT:** Your SRT application must do either of these two things:

- Be *HSv4* compatible. In this case it must:
   - **NOT** use any new features in 1.3.0 or higher (such as bidirectional 
   transmission or Stream ID)
   - **ALWAYS** set `SRTO_SENDER` to true on the sender side
- Require *HSv5*. If so, it must prevent connections to any older
versions of SRT by setting the minimum version 1.3.0 as shown above.

## Short Introduction to SRT Packet Structure

Every UDP packet carrying SRT traffic contains an SRT header (immediately after 
the UDP header). In all versions, the SRT header contains four major 32-bit fields:

 - `PH_SEQNO`
 - `PH_MSGNO`
 - `PH_TIMESTAMP`
 - `PH_ID`

Their interpretation depends on the type of packet, of which there are two: 
*control packets* and *data packets*, defined by the first bit in the `PH_SEQNO` 
field. 

Here, for example, is a representation of an SRT 1.3.0 **data packet header** 
(where the "packet type" bit = 0):


```
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|                     Packet Sequence Number                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |FF |O|KK |R|                  Message Number                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Time Stamp                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Socket ID                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
**NOTE:** Packet diagrams in this document are in network bit order.

While a complete description of a data packet is out of scope for this document, 
here is a description of some other header fields unique to SRT:

- **FF** = (2 bits) Position of packet in message, where:
  - 10b = 1st
  - 00b = middle
  - 01b = last
  - 11b = single

- **O** = (1 bit) Indicates whether the message should be delivered in order (1) 
or not (0). In File/Message mode (original UDT with UDT_DGRAM) when this bit is 
clear then a message that is sent later (but reassembled before an earlier message 
which may be incomplete due to packet loss) is allowed to be delivered immediately, 
without waiting for the earlier message to be completed. This is not used in Live 
mode because there's a completely different function used for data extraction 
when TSBPD mode is on.

- **KK** = (2 bits) Indicates whether or not data is encrypted:
  - 00b: not encrypted
  - 01b: encrypted with even key 
  - 10b: encrypted with odd key

- **R** = (1 bit) Retransmitted packet. This flag is clear (0) when a packet is 
transmitted the very first time, and is set (1) if the packet is retransmitted.

In **Data** packets, the third and fourth fields are interpreted as follows:

- `PH_TIMESTAMP`: Usually the time when a packet was sent, although the real
interpretation may vary depending on the type, and it's not important for the
handshake
- `PH_ID`: The **Destination Socket ID** to which a packet should be dispatched, 
 although it may have the special value 0 when the packet is a connection request

Additional details for Data packets will be discussed in the sections below 
covering **extension flags**. 

An SRT control packet header ("packet type" bit = 1) has the following structure:

```
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|          Message Type        |    Message Extended Type     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Additional Data                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Time Stamp                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Socket ID                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

For **Control** packets the first two fields are interpreted respectively (using 
network bit order) as:

 - `PH_SEQNO`:
   - Bit 0: packet type (set to 1 for control packet)
   - Bits 1-15: Message Type (see enum `UDTMessageType`)
   - Bits 16-31: Message Extended type
- `PH_MSGNO`: Additional data


The type subfields (in the `PH_SEQNO` field) are used in two ways:

1. The **Message Type** (`SEQNO_MSGTYPE`) is one of the values enumerated as 
`UDTMessageType`, except `UMSG_EXT`. In this case, the type is determined by 
this value only, and the **Message Extended Type** (`SEQNO_EXTTYPE`) value should 
always be 0.
2. The **Message Type** is `UMSG_EXT`. In this case the actual message type is 
contained in the **Message Extended Type**.

The **Extended Message** mechanism is theoretically open for further extensions. 
SRT uses some of them for its own purposes. This will be referred to later in the 
section on the **[SRT Extended Handshake](#the-srt-extended-handshake)**.

The `Additional Data` field (`PH_MSGNO`) is used in some control messages as 
extra space for data. Its interpretation depends on the particular message type. 
Handshake messages don't use it.

[Return to top of page](#srt-handshake)


## Handshake Structure

The handshake portion of a control packet, which comes immediately after the UDT 
header and SRT header, consists of the following 32-bit fields in order:

| Field             | Description                                                                                                                                             |
|:-----------------:|:--------------------------------------------------------------------------------------------------------------------------------------------------------|
| `Version`         | Contains number 4 in this version.                                                                                                                      |
| `Type`            | In SRT versions up to 1.2.0 (HSv4) must be the value of `UDT_DGRAM`, which is 2. For usage in later versions of SRT see the "Type field" section below. |
| `ISN`             | Initial Sequence Number; the sequence number for the first data packet                                                                                  |
| `MSS`             | Maximum Segment Size, which is typically 1500, but can be less                                                                                          |
| `FlightFlagSize`  | Maximum number of buffers allowed to be "in flight" (sent and not ACK-ed)                                                                               |
| `ReqType`         | Request type (see below)                                                                                                                                |
| `ID`              | The SOURCE socket ID from which the message is issued (target is in SRT header)                                                                         |
| `Cookie`          | Cookie used for various processing (see below)                                                                                                          |
| `PeerIP`          | Placeholder for the sender's IPv4 or IPv6 IP address, consisting of four 32-bit fields                                                                  |

Here is a representation of the HSv4 handshake structure (which follows immediately 
after the SRT control packet header):

```
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        UDT Version {4}                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Socket Type                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Initial Packet Sequence Number                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Maximum Packet Size                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Maximum Flow Window Size                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Connection Type                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Socket ID                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SYN Cookie                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Peer IP Address                        |
   |                                                               |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```
And here is the equivalent portion of the HSv5 handshake structure (to simplify 
the comparison here, the extended portion of the HSv5 handshake structure is not 
shown. See the [**"UDT Legacy" and "SRT Extended" Handshakes**](#the-udt-legacy-and-srt-extended-handshakes) 
section for details):

```
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        UDT Version {5}                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Encryption Flags        |        Extension Flags        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Initial Packet Sequence Number                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Maximum Packet Size                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Maximum Flow Window Size                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Connection Type                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Socket ID                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SYN Cookie                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Peer IP Address                        |
   |                                                               |
   |                                                               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```


The HSv4 (UDT-legacy based) handshake is based on two rules:

1. The complete handshake process, which establishes the connection, is the same
as the UDT handshake.

2. The required SRT data interchange is done **after the connection is established**
using **SRT Extended Message** with the following Extended Types:

    - `SRT_CMD_HSREQ`/`SRT_CMD_HSRSP`, which exchange special SRT flags as well 
    as a latency value
	- `SRT_CMD_KMREQ`/`SRT_CMD_KMRSP` (optional), which exchange the wrapped 
	stream encryption key used with encryption (`KMRSP` is used only for 
	confirmation or error reporting)

**IMPORTANT:** There are two rules in the UDT code that continue to apply to SRT
version 1.2.0 and earlier, and therefore affect the prerequisites for any future
versions of the protocol:

1. The initial handshake response message coming from the Listener side **DOES
NOT REWRITE** the `Version` field (it's simply blindly copied from the
handshake request message received).

2. The size of the handshake message must be **exactly** equal to the legacy UDT
handshake structure, otherwise the message is silently rejected.

As of SRT version 1.3.0 with HSv5 the handshake must only satisfy the minimum
size. However, the code cannot rely on this until each peer is certain about
the SRT version of the other.

Even in HSv5, the **Caller** must first set two fields in the initial handshake
message:
- `Version` = 4
- `Type` = `UDT_DGRAM`

The version recognition relies on the fact that the **Listener** returns a
version of 5 (or potentially higher) if it is capable, but the **Caller** must 
set the `Version` to 4 to make sure that the Listener copies this value, which 
is how an HSv4 client is recognized. This allows SRT to handle the following 
combinations:

1. **HSv5 Caller vs. HSv4 Listener:** The Listener returns version 4 to the Caller,
so the Caller knows it should use HSv4, and then continues the handshake the old way.

2. **HSv4 Caller vs. HSv5 Listener:** The Caller sends version 4 and the Listener
returns version 5. The Caller ignores this value, however, and sends the
second phase of the handshake still using version 4. This is how the Listener
recognizes the HSv4 client.

3. **Both HSv5:** The Listener responds with version 5 (or potentially higher in
future) and the HSv5 Caller recognizes this value as HSv5 (or higher). The Caller
then initiates the second phase of the handshake according to HSv5
rules.

With **Rendezvous** there's no problem because both sides try to
connect to one another, so there's no copying of the handshake data. Each
side crafts its own handshake individually. If the value of the `Version`
field is 5 from the very beginning, and if there are any extension flags set in 
the `Type` field (see note below), the rules of HSv5 apply. But if one party is 
using version 4, the handshake continues as HSv4.

**NOTE**: Previously, the `Type` field contained only the extension flags, but 
now it also contains the encryption flag. So for HSv5 rules to apply the 
extension flag needs to be expressly set.

[Return to top of page](#srt-handshake)


## The "UDT Legacy" and "SRT Extended" Handshakes

### UDT Legacy Handshake

The first versions of SRT did not change anything in the UDT handshake mechanisms, 
which are identified as *HSv4*. Here the connection process is the same as it was 
in UDT, and any extended SRT handshake operations are done after the HSv4 handshake 
is established.

The HSv5 handshake was first introduced in SRT version 1.3.0. It includes all
the extended SRT handshake operations in the overall handshake process (known as 
"integrated handshake"), which means that these data are considered exchanged and 
agreed upon at the moment when the connection is established.

### Initiator and Responder

The addition of a new handshake mechanism necessitates the introduction of two 
new roles: "Initiator" and "Responder":


- **Initiator:** Starts the extended SRT handshake process and sends appropriate
SRT extended handshake requests

- **Responder:** Expects the SRT extended handshake requests to be sent by the
Initiator and sends SRT extended handshake responses back

There are two basic types of SRT handshake extensions that are exchanged in both
handshake versions (HSv5 introduces some more extensions):

- `SRT_CMD_HSREQ`: Exchanges the basic SRT information
- `SRT_CMD_KMREQ`: Exchanges the wrapped stream encryption key (used only if
encryption is requested)

The **Initiator** and **Responder** roles are assigned differently in *HSv4*
and *HSv5*.

For an *HSv4* handshake the assignments are simple:

- **Initiator** is the sender, which is the party that has set the `SRTO_SENDER`
socket option to *true*.
- **Responder** is the receiver, which is the party that has set `SRTO_SENDER` 
to *false* (default).

Note that these roles are independent of the connection mode 
(Caller/Listener/Rendezvous), and that the behavior is undefined if `SRTO_SENDER` 
has the same value on both parties.

For an **HSv5** handshake, the roles are dependent of the connection mode:

- For Caller-Listener connections:
   - the Caller is the **Initiator**
   - the Listener is the **Responder**
   
- For Rendezvous connections:
   - The **Initiator** and **Responder** roles are assigned based on the initial
data interchange during the handshake 
(see [**The Rendezvous Handshake**](#the-rendezvous-handshake) below)

Note that if the handshake can be done as HSv5, the connection is always
considered bidirectional and the `SRTO_SENDER` flag is unused.

[Return to top of page](#srt-handshake)


### The Request Type Field

The `ReqType` field in the **Handshake Structure** (see [above](#handshake-structure)) 
indicates the handshake message type.

**Caller-Listener Request Types:**

1. Caller to Listener: `URQ_INDUCTION`
2. Listener to Caller: `URQ_INDUCTION` (reports cookie)
3. Caller to Listener: `URQ_CONCLUSION` (uses previously returned cookie)
4. Listener to Caller: `URQ_CONCLUSION` (confirms connection established)

**Rendezvous Request Types:**

1. After starting the connection: `URQ_WAVEAHAND`
2. After receiving the above message from the peer: `URQ_CONCLUSION`
3. After receiving the above message from the peer: `URQ_AGREEMENT`.

Note that the **Rendezvous** process is different in HSv4 and HSv5, as the latter 
is based on a state machine.

In case when the connection process has failed when the party was about to
send the `URQ_CONCLUSION` handshake, this field will contain appropriate
error value. This value starts from 1000 (see `UDTRequestType` in `handshake.h`,
since `URQ_FAILURE_TYPES` symbol) added with the value of the rejection
reason (see `SRT_REJECT_REASON` in `srt.h`).

[Return to top of page](#srt-handshake)


### The Type Field

There are two possible interpretations of the `Type` field. The first is the 
legacy UDT "socket type", of which there are two: `UDT_STREAM` and `UDT_DGRAM` 
(in SRT only `UDT_DGRAM` is allowed). This legacy interpretation is applied in
the following circumstances:
 - in an `URQ_INDUCTION` message sent initially by the Caller
 - in an `URQ_INDUCTION` message sent back by the HSv4 Listener 
 - in an `URQ_CONCLUSION` message, if the other party was detected as HSv4
 
For more information on Induction and Conclusion see the 
[Caller-Listener Handshake](#the-caller-listener-handshake) section below.

UDT interpreted the `Type` field as either a **Stream** or **Message** type, 
and rejected the connection if the parties each used a different type. Since SRT 
only uses the **Message** type, HSv5 uses only the `UDT_DGRAM` value for this field 
in cases where the message is going to be sent to an HSv4 party (which follows 
the UDT interpretation).

In all other cases `Type` follows the HSv5 interpretation and consists of the
following:
- an upper 16-bit field (0 - 15) reserved for **encryption flags**
- a lower 16-bit field (16 - 31) reserved for **extension flags**

The **extension flags** field should have the following value:
 - in a `URQ_CONCLUSION` message, it should contain a combination
of extension flags (with the `HS_EXT_` prefix)
 - in a `URQ_INDUCTION` message sent back by the Listener it should contain
`SrtHSRequest::SRT_MAGIC_CODE` (0x4A17)
 - in all other cases it should be 0.

The **encryption flags** currently occupy only 3 out of 16 bits, which are used 
to advertise a value for `PBKEYLEN` (packet based key length). This value is taken
from the `SRTO_PBKEYLEN` option, divided by 8, giving possible values of:
 - 2 (AES-128)
 - 3 (AES-192)
 - 4 (AES-256)
 - 0 (PBKEYLEN not advertised)

The `PBKEYLEN` advertisement is required due to the fact that while the Sender
should decide the `PBKEYLEN`, in HSv5 the Sender might be the Responder. Therefore
`PBKEYLEN` is advertised to the Initiator so that it gets this value before it 
starts creating the SEK on its side, to be then sent to the Responder.

**REMINDER:** Initiator and Responder roles are assigned differently in HSv4 
and HSv5. See the **[Initiator and Responder](#initiator-and-responder)** 
section above.

The specification of `PBKEYLEN` is decided by the Sender. When the transmission 
is bidirectional, this value must be agreed upon at the outset because when both 
are set, the Responder wins. For Caller-Listener connections it is reasonable to 
set this value on the Listener only. In the case of Rendezvous the only reasonable 
approach is to decide upon the correct value from the different sources and to 
set it on both parties (note that **AES-128** is the default).

[Return to top of page](#srt-handshake)


## The Caller-Listener Handshake

This section describes the handshaking process where a Listener is
waiting for an incoming packet on a bound UDP port, which should be an SRT
handshake command (`UMSG_HANDSHAKE`) from a Caller. The process has two phases: 
*induction* and *conclusion*.


### The Induction Phase

The Caller begins by sending an "induction" message, which contains the following 
(significant) fields:

- **Version:** must always be 4
- **Type:** `UDT_DGRAM` (2)
- **ReqType:** `URQ_INDUCTION`
- **ID:** Socket ID of the Caller
- **Cookie:** 0

The **Destination Socket ID** (in the SRT header) in this message is 0, which is
interpreted as a connection request.

**NOTE:** This phase serves only to set a cookie on the Listener so that it 
doesn't allocate resources, thus mitigating a potential DOS attack that might be 
perpetrated by flooding the Listener with handshake commands.

An **HSv4** Listener responds with **exactly the same values**, except:

- **ID:** Socket ID of the HSv4 Listener
- **SYN Cookie:** a cookie that is crafted based on host, port and current time 
with 1 minute accuracy

An **HSv5** Listener responds with the following:

- **Version:** 5
- **Type:** 
  - Extension Field (lower 16 bits): `SrtHSRequest::SRT_MAGIC_CODE` 
  - Encryption Field (upper 16 bits): Advertised `PBKEYLEN`
- **ReqType:** (UDT Connection Type) `URQ_INDUCTION`
- **ID:** Socket ID of the HSv5 Listener
- **SYN Cookie:** a cookie that is crafted based on host, port and current time 
with 1 minute accuracy

**NOTE:** The HSv5 Listener still doesn't know the version of the Caller, and it
responds with the same set of values regardless of whether the Caller is 
version 4 or 5.

The important differences between HSv4 and HSv5 in this respect are:

1. The **HSv4** party completely ignores the values reported in `Version` and
`Type`.  It is, however, interested in the `Cookie` value, as this must be
passed to the next phase. It does interpret these fields, but only in the
"conclusion" message.

2. The **HSv5** party does interpret the values in `Version` and `Type`. If it
receives the value 5 in `Version`, it understands that it comes from an HSv5
party, so it knows that it should prepare the proper HSv5 messages in the next
phase.  It also checks the following in the `Type` field: 

	- whether the lower 16-bit field (extension flags) contains the magic 
value (see the **[Type Field](#the-type-field)** section above); otherwise the 
connection is rejected. This is a contingency for the case where someone who, 
in attempting to extend UDT independently, increases the `Version` value to 5 
and tries to test it against SRT. 

    - whether the upper 16-bit field (encryption flags) contain a non-zero
value, which is interpreted as an advertised `PBKEYLEN` (in which case it is
written into the value of the `SRTO_PBKEYLEN` option).

[Return to top of page](#srt-handshake)


### The Conclusion Phase

Once the Caller gets its cookie, it sends a `URQ_CONCLUSION` handshake
message to the Listener.

The following values are set by an HSv4 Caller. Note that the same values must
be used by an HSv5 Caller when the Listener has returned Version 4 in
its `URQ_INDUCTION` response:

- **Version:** 4
- **Type:** `UDT_DGRAM` (SRT must have this legacy UDT socket type only)
- **ReqType:** `URQ_CONCLUSION`
- **ID:** Socket ID of the Caller
- **Cookie:** the cookie previously received in the induction phase

If an HSv5 Caller receives a confirmation from a Listener that it can use the 
version 5 handshake, it fills in the following values:

- **Version:** 5
- **Type:** appropriate Extension Flags and Encryption Flags (see below)
- **ReqType:** `URQ_CONCLUSION`
- **ID:** Socket ID of the Caller 
- **Cookie:** the cookie previously received in the induction phase

The Destination Socket ID (in the SRT header, `PH_ID` field) in this message is the
socket ID that was previously received in the induction phase in the `ID` field
in the handshake structure.

The **Type** field contains:

- **Encryption Flags:** advertised `PBKEYLEN` (see above)
- **Extension Flags:** The `HS_EXT_` prefixed flags defined in `CHandShake` - see the
  **[SRT Extended Handshake](#the-srt-extended-handshake)** section below.

The Listener responds with the same values shown above, without the cookie (which 
isn't needed here), as well as the extensions for HSv5 (which will probably be 
exactly the same).

**IMPORTANT:** There isn't any "negotiation" here. If the values passed in the 
handshake are in any way not acceptable by the other side, the connection will 
be rejected. The only case when the Listener can have precedence over the Caller 
is the advertised `PBKEYLEN` in the `Encryption Flags` field in `Type` field. 
The value for latency is always agreed to be the greater of those reported 
by each party.

[Return to top of page](#srt-handshake)


## The Rendezvous Handshake

When two parties attempt to connect in **Rendezvous** mode, they are considered 
to be  equivalent: Both are connecting, but neither is listening, and they expect 
to be  contacted (over the same port number for both parties) specifically by 
the same party with which they are trying to connect. Therefore, it's perfectly 
safe to assume that, at  some point, each party will have agreed upon the 
connection, and that no induction-conclusion phase split is required. Even so, 
the Rendezvous handshake process is more complicated.

The basics of a Rendezvous handshake are the same in HSv4 and HSv5 - the 
description of the HSv4 process is a good introduction for HSv5. However, HSv5
has more data to exchange and more conditions to be taken into account. 

[Return to top of page](#srt-handshake)


### HSv4 Rendezvous Process

Initially, each party sends an SRT control message of type `UMSG_HANDSHAKE` to 
the other, with the following fields:

- **Version:** 4 (HSv4 only)
- **Type:** `UDT_DGRAM` (HSv4 only)
- **ReqType:** `URQ_WAVEAHAND`
- **ID:** Socket ID of the party sending this message
- **Cookie:** 0

When the `srt_connect()` function is first called by an application, each party
sends this message to its peer, and then tries to read a packet from its
underlying UDP socket to see if the other party is alive. Upon reception of an
`UMSG_HANDSHAKE` message, each party initiates the second (conclusion) phase by
sending this message:

- **Version:** 4
- **Type:** `UDT_DGRAM`
- **ReqType:** `URQ_CONCLUSION`
- **ID:** Socket ID of the party sending this message
- **Cookie:** 0

At this point, they are considered to be connected. When either party receives 
this message from its peer again, it sends another message with the `ReqType`
field set as `URQ_AGREEMENT`. This is a formal conclusion to the handshake
process, required to inform the peer that it can stop sending conclusion
messages (note that this is UDP, so neither party can assume that the message
has reached its peer).

With HSv4 there's no debate about who is the Initiator and who is the Responder 
because this transaction is unidirectional, so the party that has set the 
`SRTO_SENDER` flag is the Initiator and the other is Responder (as is usual 
with HSv4).

[Return to top of page](#srt-handshake)


### HSv5 Rendezvous Process

The HSv5 Rendezvous process introduces a state machine, and therefore is slightly 
different from HSv4, although it is still based on the same message request types. 
Both parties start with `URQ_WAVEAHAND` and use a `Version` value of 5. The 
version recognition is easy - the HSv4 client does not look at the `Version` value, 
whereas HSv5 clients can quickly recognize the version from the `Version` field.
The parties only continue with the HSv5 Rendezvous process when `Version` = 5
for both. Otherwise the process continues exclusively according to *HSv4* rules.

With HSv5 Rendezvous, both parties create a cookie for a process called a 
"cookie contest". This is necessary for the assignment of Initiator and Responder 
roles. Each party generates a cookie value (a 32-bit number) based on the host, 
port, and current time with 1 minute accuracy. This value is scrambled using
an MD5 sum calculation. The cookie values are then compared with one another.

Since you can't have two sockets on the same machine bound to the same device
and port and operating independently, it's virtually impossible that the
parties will generate identical cookies. However, this situation may occur if an 
application tries to "connect to itself" - that is, either connects to a local 
IP address, when the socket is bound to INADDR_ANY, or to the same IP address to 
which the socket was bound. If the cookies are identical (for any reason), the 
connection will not be made until new, unique cookies are generated (after a 
delay of up to one minute). In the case of an application "connecting to itself", 
the cookies will always be identical, and so the connection will never be made.

When one party's cookie value is greater than its peer's, it wins the cookie 
contest and becomes Initiator (the other party becomes the Responder).

At this point there are two "handshake flows" possible (at least theoretically):
*serial*  and *parallel*. 

#### Serial Handshake Flow

In the **serial** handshake flow, one party is always first, and the other follows.
That is, while both parties are repeatedly sending `URQ_WAVEAHAND` messages, at
some point one party - let's say Alice - will find she has received a 
`URQ_WAVEAHAND` message before she can send her next one, so she sends a 
`URQ_CONCLUSION` message in response. Meantime, Bob (Alice's peer) has missed 
her `URQ_WAVEAHAND` messages, and so Alice's `URQ_CONCLUSION` is the first message 
Bob has received from her.

This process can be described easily as a series of exchanges between the first 
and following parties (Alice and Bob, respectively):

1. Initially, both parties are in the *waving* state. Alice sends a handshake
message to Bob:
	- **Version:** 5
	- **Type:** Extension field: 0, Encryption field: advertised `PBKEYLEN`.
	- **ReqType:** `URQ_WAVEAHAND`
	- **ID:** Alice's socket ID
	- **Cookie:** Created based on host/port and current time

Keep in mind that while Alice doesn't yet know if she is sending this message to 
an HSv4 or HSv5 peer, the values from these fields would not be interpreted by 
an HSv4 peer when the **ReqType** is `URQ_WAVEAHAND`.

2. Bob receives Alice's `URQ_WAVEAHAND` message, switches to the *attention* 
state. Since Bob now knows Alice's cookie, he performs a "cookie contest" 
(compares both cookie values). If Bob's cookie is greater than Alice's, he will 
become the **Initiator**. Otherwise, he will become the **Responder**.

**IMPORTANT**: The resolution of the [Handshake Role](#initiator-and-responder) 
(Initiator or Responder) is essential to further processing. 

Then Bob responds:

- **Version:** 5
- **Type:**
   - *Extension field:* appropriate flags if Initiator, otherwise 0
   - *Encryption field:* advertised `PBKEYLEN`
- **ReqType:** `URQ_CONCLUSION`    

**NOTE:** If Bob is the Initiator and encryption is on, he will use either his
own `PBKEYLEN` or the one received from Alice (if she has advertised
`PBKEYLEN`).
	
3. Alice receives Bob's `URQ_CONCLUSION` message. While at this point she also 
performs the "cookie contest", the outcome will be the same. She switches to the 
*fine* state, and sends:

	- **Version:** 5
	- **Type:** Appropriate extension flags and encryption flags
	- **ReqType:** `URQ_CONCLUSION`

**NOTE:** Both parties always send extension flags at this point, which will
contain `SRT_CMD_HSREQ` if the message comes from an Initiator, or
`SRT_CMD_HSRSP` if it comes from a Responder. If the Initiator has received a
previous message from the Responder containing an advertised `PBKEYLEN` in the
encryption flags field (in the `Type` field), it will be used as the key length
for key generation sent next in the `SRT_CMD_KMREQ` block.

4. Bob receives Alice's `URQ_CONCLUSION` message, and then does one of the 
following (depending on Bob's role):

	- If Bob is the Initiator (Alice's message contains `SRT_CMD_HSRSP`), he:
		- switches to the *connected* state
		- sends Alice a message with `ReqType` = `URQ_AGREEMENT`, but containing 
		  no SRT extensions (*Extension flags* in `Type` should be 0)
		  
	- If Bob is the Responder (Alice's message contains `SRT_CMD_HSREQ`), he:
		- switches to *initiated* state
		- sends Alice a message with ReqType = `URQ_CONCLUSION` that also contains
		  extensions with `SRT_CMD_HSRSP`
        - awaits a confirmation from Alice that she is also connected (preferably 
          by `URQ_AGREEMENT` message)

5. Alice receives the above message, enters into the *connected* state, and 
then does one of the following (depending on Alice's role):

    - If Alice is the Initiator (received `URQ_CONCLUSION` with `SRT_CMD_HSRSP`), 
    she sends Bob a message with `ReqType` = `URQ_AGREEMENT`. 

    - If Alice is the Responder, the received message has `ReqType` = `URQ_AGREEMENT` 
    and in response she does nothing. 

6. At this point, if Bob was Initiator, he is connected already. If he was a 
Responder, he should receive the above `URQ_AGREEMENT` message, after which he
switches to the *connected* state. In the case where the UDP packet with the 
agreement message gets lost, Bob will still enter the *connected* state once
he receives anything else from Alice. If Bob is going to send, however, he
has to continue sending the same `URQ_CONCLUSION` until he gets the confirmation
from Alice.

[Return to top of page](#srt-handshake)


#### Parallel Handshake Flow

The serial handshake flow described above happens in almost every case.

There is, however, a very rare (but still possible) **parallel** flow that only 
occurs if the messages with `URQ_WAVEAHAND` are sent and received by both peers 
at precisely the same time. This *might* happen in one of these situations:

- if both Alice and Bob start sending `URQ_WAVEAHAND` messages perfectly 
simultaneously,  
or 
- if Bob starts later but sends his `URQ_WAVEAHAND` message during the 
gap between the moment when Alice had earlier sent her message,
and the moment when that message is received (that is, if each party receives the 
message from its peer immediately after having sent its own),  
or
- if, at the beginning of `srt_connect`, Alice receives the first message from 
Bob exactly during the very short gap between the time Alice is adding a socket 
to the connector list and when she sends her first `URQ_WAVEAHAND` message

The resulting flow is very much like Bob's behaviour in the serial handshake flow, 
but for both parties. Alice and Bob will go through the same state transitions:

    Waving -> Attention -> Initiated -> Connected

In the *Attention* state they know each other's cookies, so they can assign
roles. It is important to understand that, in contrast to serial flows,
which are mostly based on request-response cycles, here everything
happens completely asynchronously: the state switches upon reception
of a particular handshake message with appropriate contents (the
Initiator must attach the `HSREQ` extension, and Responder must attach the 
`HSRSP` extension). 

Here's how the parallel handshake flow works, based on roles:

**Initiator:**

1. `Waving`
   - Receives `URQ_WAVEAHAND` message 
   - Switches to `Attention`
   - Sends `URQ_CONCLUSION` + `HSREQ`
2. `Attention`
   - Receives `URQ_CONCLUSION` message, which:
     - contains no extensions:
       - switches to `Initiated`, still sends `URQ_CONCLUSION` + `HSREQ`
     - contains `HSRSP` extension: 
       - switches to `Connected`, sends `URQ_AGREEMENT`
3. `Initiated`
   - Receives `URQ_CONCLUSION` message, which:
     - Contains no extensions: 
       - REMAINS IN THIS STATE, still sends `URQ_CONCLUSION` + `HSREQ`
     - contains `HSRSP` extension: 
       - switches to `Connected`, sends `URQ_AGREEMENT`
4. `Connected`
   - May receive `URQ_CONCLUSION` and respond with `URQ_AGREEMENT`, but normally 
by now it should already have received payload packets.

**Responder:**

1. `Waving`
   - Receives `URQ_WAVEAHAND` message
   - Switches to `Attention`
   - Sends `URQ_CONCLUSION` message (with no extensions)
2. `Attention`
   - Receives `URQ_CONCLUSION` message with `HSREQ`  
     **NOTE:** This message might contain no extensions, in which case the party 
     shall simply send the empty `URQ_CONCLUSION` message, as before, and remain 
     in this state. 
   - Switches to `Initiated` and sends `URQ_CONCLUSION` message with `HSRSP`
3. `Initiated`
   - Receives:
     - `URQ_CONCLUSION` message with `HSREQ` 
       - responds with `URQ_CONCLUSION` with `HSRSP` and remains in this state
     - `URQ_AGREEMENT` message 
       - responds with `URQ_AGREEMENT` and switches to `Connected`
     - Payload packet 
       - responds with `URQ_AGREEMENT` and switches to `Connected`
4. `Connected`
    - Is not expecting to receive any handshake messages anymore. The
`URQ_AGREEMENT` message is always sent only once or per every final 
`URQ_CONCLUSION`message.

Note that any of these packets may be missing, and the sending party will
never become aware. The missing packet problem is resolved this way:

1. If the Responder misses the `URQ_CONCLUSION` + `HSREQ` message, it simply 
continues sending empty `URQ_CONCLUSION` messages. Only upon reception of 
`URQ_CONCLUSION` + `HSREQ` does it respond with `URQ_CONCLUSION` + `HSRSP`.

2. If the Initiator misses the `URQ_CONCLUSION` + `HSRSP` response from the 
Responder, it continues sending `URQ_CONCLUSION` + `HSREQ`. The Responder must 
always respond with `URQ_CONCLUSION` + `HSRSP` when the Initiator sends 
`URQ_CONCLUSION` + `HSREQ`, even if it has already received and interpreted it.

3. When the Initiator switches to the `Connected` state it responds with a
`URQ_AGREEMENT` message, which may be missed by the Responder. Nonetheless, the 
Initiator may start sending data packets because it considers itself connected 
- it doesn't know that the Responder has not yet switched to the `Connected` state. 
Therefore it is exceptionally allowed that when the Responder is in the `Initiated`
state and receives a data packet (or any control packet that is normally sent only 
between connected parties) over this connection, it may switch to the `Connected`
state just as if it had received a `URQ_AGREEMENT` message.

4. If the the Initiator is already switched to the `Connected` state it will not 
bother the Responder with any more handshake messages. But the Responder may be 
completely unaware of that (having missed the `URQ_AGREEMENT` message from the 
Initiator). Therefore it doesn't exit the connecting state (still blocks on 
`srt_connect` or doesn't signal connection readiness), which means that it 
continues sending `URQ_CONCLUSION` + `HSRSP` messages until it receives any 
packet that will make it switch to the `Connected` state (normally 
`URQ_AGREEMENT`). Only then does it exit the connecting state and the 
application can start transmission.

[Return to top of page](#srt-handshake)


### Rendezvous Between Different Versions

When one of the parties in a handshake supports HSv5 and the other only
HSv4, the handshake is conducted according to the rules described in the 
**[HSv4 Rendezvous Process](#hsv4-rendezvous-process)** section above.

Note, though, that in the first phase the `URQ_WAVEAHAND` request type sent
by the HSv5 party contains the `m_iVersion` and `m_iType` fields filled in as
required for version 5. This happens only for the "waving" phase, and fortunately
HSv4 clients ignore these fields. When switching to the conclusion phase, the
HSv5 client is already aware that the peer is HSv4 and fills the fields of the
conclusion handshake message according to the rules of HSv4.

[Return to top of page](#srt-handshake)


## The SRT Extended Handshake

### HSv4 Extended Handshake Process

The HSv4 extended handshake process starts **after the connection is considered
established**. Whatever problems may occur after this point *will only affect 
data transmission*.

Here is a representation of the HSv4 extended handshake packet structure 
(including the first four 32-bit segments of the SRT header):

```
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|         Type=0x7fff         |    Ext {HSREQ(1),HSRSP(2)}    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Additional Info = undefined                  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Time Stamp (µsec)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Destination Socket ID                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     SRT Version {<10300h}                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SRT Flags                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        TsbPd Resv = 0         |     TsbPdDelay {20..8000}     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Reserved = 0                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

The HSv4 extended handshake is performed with the use of the [aforementioned](#overview) 
"SRT Extended Messages", using control messages with major type `UMSG_EXT`.

Note that these command messages, although sent over an established connection, 
are still simply UDP packets. As such they are subject to all the problematic 
UDP protocol phenomena, such as packet loss (packet recovery applies exclusively 
to the payload packets). Therefore messages are sent "stubbornly" (with a
slight delay between subsequent retries) until the peer responds, with some
maximum number of retries before giving up. It's very important to understand
that the first message from an Initiator is sent at the same moment when the
application requests transmission of the first data packet. This data packet is
**not** held back until the extended SRT handshake is finished. The first
command message is sent, followed by the first data packet, and the rest of the
transmission continues without having the extended SRT handshake yet agreed
upon.

This means that the initial few data packets might be sent without having the 
appropriate SRT settings already working, which may raise two concerns:

- *There is a delay in the application of latency to received packets* - At first, 
packets are being delivered immediately. It is only when the `SRT_CMD_HSREQ` 
message is processed that latency is applied to the received packets. The 
time stamp based packet delivery mechanism (TSBPD) isn't working until then.

- *There is a delay in the application of encryption (if used) to received 
packets* - Packets can't be decrypted until the `SRT_CMD_KMREQ` is processed and 
the keys installed. The data packets are still encrypted, but the receiver can't
decrypt them and will drop them.

The codes for commands used are the same in HSv4 and HSv5 processes. In
HSv4 these are minor message type codes used with the `UMSG_EXT` command,
whereas in HSv5 they are in the "command" part of the extension block. The
messages that are sent as "REQ" parts will be repeatedly sent until they get
a corresponding "RSP" part, up to some timeout, after which they give up and
stay with a pure UDT connection.

[Return to top of page](#srt-handshake)


### HSv5 Extended Handshake Process

Here is a representation of the HSv5 **integrated** handshake packet structure
(without SRT header):

```
   0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  ---
   |                        UDT Version {5}                        |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
   |       Encryption Flags        |        Extension Flags        |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
   |                Initial Packet Sequence Number                 |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
   |                      Maximum Packet Size                      |   H
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   A
   |                    Maximum Flow Window Size                   |   N
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   D
   |                        Connection Type                        |   S
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   H
   |                           Socket ID                           |   A
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   K
   |                           SYN Cookie                          |   E
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
   |                        Peer IP Address                        |   |
   |                                                               |   |
   |                                                               |   |
   |                                                               |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  ---
   |   Ext Type=SRT_CMD_HSREQ(1)   |         Ext Size {3}          |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   H
   |                    SRT Version {>=10300h}                     |   S
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   R
   |                           SRT Flags                           |   E
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   Q
   |     RcvTsbPdDelay {20..8000}  |   SndTsbPdDelay {20..8000}    |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  ---
   |    Ext Type=SRT_CMD_KMREQ(3)  |      Ext Size (bytes/4)       |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
   |0| V{1} PT{2}|          Sign {2029h}           |  Resv {0}  |KK|   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
   |                           KEKI {0}                            |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
   |  Cipher {2} |    Auth {0}     |     SE {2}    |   Resv1 {0}   |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
   |           Recv2 {0}           | Slen(bytes)/4 | klen(bytes)/4 |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   |
   |                           Salt[Slen]                          |   |
   |                                                               |   |
   |                                                               |   K
   |                                                               |   M
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+   R
   |                   Wrap[((KK+1/2)*Klen) + 8]                   |   E
   |                                                               |   Q
   |                                                               |   |
   |                                                               |   |
   |                                                               |   |
   |                                                               |   |
   |                                                               |   |
   |                                                               |   |
   |                                                               |   |
   |                                                               |   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+  ---
```

The **Extension Flags** subfield in the `Type` field in a conclusion handshake
message contains one of these flags:

- `HS_EXT_HSREQ`: defines SRT characteristic data; always present
- `HS_EXT_KMREQ`: if using encryption, defines encryption block
- `HS_EXT_CONFIG`: informs about having extra configuration data attached

The above schema shows the HSv5 packet structure, which can be split into
three parts:

1. The Handshake data part (up to "Peer IP Address" field)
2. The HSREQ extension
3. The KMREQ extension

Note that extensions are added only in certain situations (as described
above), so sometimes there are no extensions at all. When extensions are added,
the HSREQ extension is always present. The KMREQ extension is added only if
encryption is requested (the passphrase is set by the `SRTO_PASSPHRASE` socket
option). There might be also other extensions placed after HSREQ and KMREQ.

Every extension block has the following structure:

(1) a 16-bit command symbol  
(2) 16-bit block size (number of 32-bit words following this field)  
(3) a number of 32-bit fields, as specified in (2) above

What is contained in a block depends on the extension command code.

The data being received in the extension blocks in the conclusion message
undergo further verification. If the values are not acceptable, the
connection will be rejected. This may happen in the following situations:

1. The `Version` field contains 0. This means that the peer rejected the
handshake.

2. The `Version` field was higher than 4, but no extensions were added (no
extension flags set), while the rules state that they should be present. This is
considered an error in the case of a `URQ_CONCLUSION` message sent by the
Initiator to the Responder (there can be an initial conclusion message without
extensions sent by the Responder to the Initiator in Rendezvous connections).

3. Processing of any of the extension data has failed (also due to an internal
error).

4. Each side declares a transmission type that is not compatible with the
other. This will be described further, along with other new HSv5 features;
the HSv4 client supports only and exclusively one transmission type, which
is *Live*. This is indicated in the `Type` field in the HSv4 handshake, which 
must be equal to `UDT_DGRAM` (2), and in the HSv5 by the extra *Smoother* 
block declaration (see below). In any case, when there's no *Smoother*
declared, *Live* is assumed. Otherwise the Smoother type must be exactly
the same on both sides.

**NOTE:** The `TsbPd Resv` and `TsbPdDelay` fields both refer to latency,
but the use is different in HSv4 and HSv5.

In HSv4, only the lower 16 bits (`TsbPdDelay`) are used. The upper 16 bits
(`TsbPd Resv`) are simply unused. There's only one direction, so `HSREQ` is
sent by the Sender, `HSRSP` by the Receiver.  `HSREQ` contains only the Sender
latency, and `HSRSP` contains only the Receiver latency.

This is different from HSv5, in which the latency value for the sending
direction in the lower 16 bits (`SndTsbPdDelay`, 16 - 31 in network order) and
for receiving direction is placed in the upper 16 bits (`RcvTsbpdDelay`, 0 -
15). The communication is bidirectional, so there are two latency values, one
per direction. Therefore both HSREQ and HSREQ messages contain both the Sender
and Receiver latency values.

[Return to top of page](#srt-handshake)


### SRT Extension Commands

#### HSREQ and HSRSP

The `SRT_CMD_HSREQ` message contains three 32-bit fields designated as:

- `SRT_HS_VERSION`: string (0x00XXYYZZ) representing SRT version XX.YY.ZZ
- `SRT_HS_FLAGS`: the SRT flags (see below)
- `SRT_HS_LATENCY`: the latency specification

```
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    SRT Version {>=10300h}                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           SRT Flags                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |(HSv4)  TsbPd Resv = 0         |     TsbPdDelay {20..8000}     |
   |(HSv5) RcvTsbPdDelay {20..8000}|   SndTsbPdDelay {20..8000}    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

The flags (`SRT Flags` field) are the following bits, in order:

(0) `SRT_OPT_TSBPDSND`: The party will be sending in TSBPD (Time Stamp Based 
Packet Delivery) mode.

This is used by the Sender party to specify that it will use TSBPD mode.
The Responder should respond with its setting for TSBPD reception; if it isn't 
using TSBPD for reception, it responds with its reception TSBPD flag not set. 
In HSv4, this is only used by the Initiator.

(1) `SRT_OPT_TSBPDRCV`: The party expects to receive in TSBPD mode.

This is used by a party to specify that it expects to receive in TSBPD mode.
The Responder should respond to this setting with TSBPD sending
mode (HSv5 only) and set the sending TSBPD flag appropriately. In HSv4 this is 
only used by the Responder party.

(2) `SRT_OPT_HAICRYPT`: The party includes `haicrypt` (legacy flag).

This **special legacy compatibility flag** should be always set. See below 
for more details.

(3) `SRT_OPT_TLPKTDROP`: The party will do TLPKTDROP.

Declares the `SRTO_TLPKTDROP` flag of the party. This is important
because both parties must cooperate in this process. In HSv5, if both
directions are TSBPD, both use this setting. While it is not always
necessary to set this flag in live mode, it is the default and most
recommended setting.

(4) `SRT_OPT_NAKREPORT`: The party will do periodic NAK reporting.

Declares the `SRTO_NAKREPORT` flag of the party. This flag means
that periodic NAK reports will be sent (repeated `UMSG_LOSSREPORT`
message when the sender seems to linger with retransmission).

(5) `SRT_OPT_REXMITFLG`: The party uses the REXMIT flag.

This **special legacy compatibility flag** should be always set. See below 
for more details.

(6) `SRT_OPT_STREAM`: The party uses stream type transmission.

This is introduced in HSv5 only. When set, the party is using a stream
type transmission (file transmission with no boundaries). In HSv4 this
flag does not exist, and therefore it's always clear, which corresponds
to the fact that HSv4 supports Live mode only.

**Special Legacy Compatibility Flags**

The `SRT_OPT_HAICRYPT` and `SRT_OPT_REXMITFLG` fields define special cases for 
the interpretation of the contents in the SRT header for payload packets.

The SRT header contains an unusual field designated as `PH_MSGNO`,
which contains first some extra flags that occupy the most significant
bits in this field (the rest are assigned to the Message Number).
Some of these extra flags were already in UDT, but SRT added some
more by stealing bits from the Message Number subfield:

1. **Encryption Key** flags (2 bits). Controlled by `SRT_OPT_HAICRYPT`,
this field contains a value that declares whether the payload is
encrypted and with which key.

2. **Retransmission** flag (1 bit). Controlled by `SRT_OPT_REXMITFLG`, this flag 
is 0 when a packet is sent the first time, and 1 when it is retransmitted (i.e. 
requested in a loss report). When the incoming packet is late (one with a
sequence number older than the newest received so far), this flag allows the
Receiver to distinguish between a retransmitted packet and a reordered packet.
This is used by the "reorder tolerance" feature described in the API
documentation under `SRTO_LOSSMAXTTL` socket option.

As of version 1.2.0 both these fields are in use, and therefore both these
flags must always be set. In theory, there might still exist some SRT versions 
older than 1.2.0 where these flags are not used, and these extra bits remain 
part of the "Message Number" subfield.

In practice there are no versions around that do not use encryption bits, 
although there might be some old SRT versions still in use that do not include 
the Retransmission field, which was introduced in version 1.2.0. In practice 
both these flags must be set in the version that has them defined. They might 
be reused in future for something else, once all versions below 1.2.0 are 
decommissioned, but the default is for them to be set.

The `SRT_HS_LATENCY` field defines Sender/Receiver latency.

It is split into two 16-bit parts. The usage differs in HSv4 and HSv5.

In **HSv4** only the lower part (bits 16 - 31) is used. The upper part (bits 0 - 15) 
is always 0. The interpretation of this field is as follows:
 - Receiver party: Receiver latency
 - Sender party: Sender latency

In **HSv5** both 16-bit parts of the field are used, and interpreted 
as follows::
 - Upper 16 bits (0 - 15): Receiver latency
 - Lower 16 bits (16 - 31): Sender latency

The characteristics of Sender and Receiver latency are the following:

1. **Sender latency** is the minimum latency that the Sender wants the
Receiver to use.

2. **Receiver latency** is the (minimum) value that the Receiver
wishes to apply to the stream that it will be receiving.

Once these values are exchanged via the extended handshake, an **effective 
latency** is established, which is always the maximum of the two. Note that 
latency is defined in a specified direction. In HSv5, a connection is 
bidirectional, and a separate latency is defined for each direction.

The Initiator sends an `HSREQ` message, which declares the values on its side.
The Responder calculates the maximum values between what it receives in the 
`HSREQ`and its own values, then sends an `HSRSP` with the effective latencies.

Here is an example of an **HSv5 bidirectional transmission** between Alice and 
Bob, where Alice is Initiator:

1. Alice and Bob set the following latency values:

   - Alice: `SRTO_PEERLATENCY` = 250 ms, `SRTO_RCVLATENCY` = 550 ms
   - Bob: `SRTO_PEERLATENCY` = 500 ms, `SRTO_RCVLATENCY` = 300 ms

2. Alice defines the latency field in the HSREQ message:
```
    hs[SRT_HS_LATENCY] = { 250, 550 }; // { Lower, Upper }
```
3. Bob receives it, sets his options, and responds with `HSRSP`:
```
    SRTO_RCVLATENCY = max(300, 250);  //<-- 250:Alice's PEERLATENCY
    SRTO_PEERLATENCY = max(500, 550); //<-- 550:Alice's RCVLATENCY
    hs[SRT_HS_LATENCY] = { 550, 300 };
```
4. Alice receives this `HSRSP` and sets:
```
    SRTO_RCVLATENCY = 550;
    SRTO_PEERLATENCY = 300;
```
We now have the **effective latency** values:

   - For transmissions from Alice to Bob: 300ms
   - For transmissions from Bob to Alice: 550ms

Here is an example of an *HSv4* exchange, which is simpler because there's only 
one direction. We'll refer to Alice to Bob again to be consistent with the 
Initiator/Responder roles in the HSv5 example:

1. Alice sets `SRTO_LATENCY` to 250 ms

2. Bob sets `SRTO_LATENCY` to 300 ms

3. Alice sends `hs[SRT_HS_LATENCY] = { 250, 0 };` to Bob

4. Bob does `SRTO_LATENCY = max(300, 250);`

5. Bob sends `hs[SRT_HS_LATENCY] = {300, 0};` to Alice

6. Alice sets `SRTO_LATENCY` to 300

Note that the `SRTO_LATENCY` option in HSv5 sets both `SRTO_RCVLATENCY` and
`SRTO_PEERLATENCY` to the same value, although when reading, `SRTO_LATENCY`
is an alias to `SRTO_RCVLATENCY`.

Why is the Sender latency updated to the effective latency for that direction?
Because the `TLPKTDROP` mechanism, which is used by default in Live mode, may 
cause the Sender to decide to stop retransmitting packets that are known to be 
too late to retransmit. This latency value is one of the factors taken into 
account to calculate the time threshold for `TLPKTDROP`.

[Return to top of page](#srt-handshake)


#### KMREQ and KMRSP

`KMREQ` and `KMRSP` contain the KMX (key material exchange) message used for
encryption. The most important part of this message is the
AES-wrapped key (see the [Encryption documentation](encryption.md) for
details). If the encryption process on the Responder side was successful,
the response contains the same message for confirmation. Otherwise it's
one single 32-bit value that contains the value of `SRT_KMSTATE` type,
as an error status.

Note that when the encryption settings are different at each end, then
the connection is still allowed, but with the following restrictions:

- If the Initiator declares encryption, but the Responder does not, then
the Responder responds with `SRT_KM_S_NOSECRET` status. This means
that the Responder will not be able to decrypt data sent by the Initiator,
but the Responder can still send unencrypted data to the Initiator.

- If the Initiator did not declare encryption, but the Responder did, then
the Responder will attach `SRT_CMD_KMRSP` (despite the fact that the Initiator 
did not send `SRT_CMD_KMREQ`) with `SRT_KM_S_UNSECURED` status. The
Responder won't be able to send data to the Initiator (more precisely,
it will send scrambled data, not able to be decrypted), but the Initiator
will still be able to send unencrypted data to the Responder.

- If both have declared encryption, but have set different passwords,
the Responder will send a `KMRSP` block with an `SRT_KM_S_BADSECRET` value.
The transmission in both directions will be "scrambled" (encrypted and
not decryptable).

The value of the encryption status can be retrieved from the
`SRTO_SNDKMSTATE` and `SRTO_RCVKMSTATE` options. The legacy (or
unidirectional) option `SRTO_KMSTATE` resolves to `SRTO_RCVKMSTATE`
by default, unless the `SRTO_SENDER` option is set to *true*, in which
case it resolves to `SRTO_SNDKMSTATE`.

The values retrieved from these options depend on the result of the KMX
process:

1. If only one party declares encryption, the KM state will be one of 
the following:

   - For the party that declares no encryption:
     - `RCVKMSTATE: NOSECRET`
     - `SNDKMSTATE: UNSECURED`
     - Result: This party can send payloads unencrypted, but it can't
     decrypt packets received from its peer.

   - For the party that declares encryption:
     - `RCVKMSTATE: UNSECURED`
     - `SNDKMSTATE: NOSECRET`
     - Result: This party can receive unencrypted payloads from its peer, and 
     will be able to send encrypted payloads to the peer, but the peer won't 
     decrypt them.

2. If both declare encryption, but they have different passwords, then both 
states are `SRT_KM_S_BADSECRET`. In such a situation both sides may send payloads, 
but the other party won't decrypt them.

3. If both declare encryption and the password is the same on both sides, then 
both states are `SRT_KM_S_SECURED`. The transmission will be correctly performed 
with encryption in both directions.

Note that due to the introduction of the bidirectional feature in HSv5 (and
therefore the Initiator and Responder roles), the old HSv4 method of initializing
the crypto objects used for security is used only in one of the directions.
This is now called **"forward KMX"**:

1. The Initiator initializes its Sender Crypto (TXC) with preconfigured values.
The SEK and SALT values are random-generated.
2. The Initiator sends a KMX message to the Receiver.
3. The Receiver deploys the KMX message into its Receiver Crypto (RXC)

This is the general process of Security Association done for the
"forward direction", that is, when done by the Sender. However, as there's
only one KMX process in the handshake, in HSv5 this must also initialize 
the crypto in the opposite direction. This is accomplished by **"reverse KMX"**:

1. The Initiator initializes its Sender Crypto (TXC), like above, and then
**clones it** to the Receiver Crypto.
2. The Initiator sends a KMX message to the Responder.
3. The Responder deploys the KMX message into its Receiver Crypto (RXC)
4. The Responder initializes its Sender Crypto by **cloning** the Receiver
Crypto, that is, by extracting the SEK and SALT from the Receiver Crypto
and using them to initialize the Sender Crypto (clone the keys).

This way the Sender (being a Responder) has the Sender Crypto initialized in a 
manner very similar to that of the Initiator. The only difference is that the SEK
and SALT parameters in the crypto:
 - are random-generated on the Initiator side
 - are extracted (on the Responder side) from the Receiver Crypto, which was
configured by the incoming KMX message

The extra operations defined as "reverse KMX" happen exclusively in the HSv5 
handshake.

The encryption key (SEK) is normally configured to be refreshed after a 
predefined number of packets has been sent. To ensure the "soft handoff" 
to the new key, this process consists of three activities performed in order:

1. Pre-announcing of the key (SEK is sent by Sender to Receiver)
2. Switching the key (at some point packets are encrypted with the new key)
3. Decommissioning the key (removing the old, unused key)

Pre-announcing is done using an SRT Extended Message with the `SRT_CMD_KMREQ` 
extended type, where only the "forward KMX" part is done. When the transmission 
is bidirectional, the key refreshing process happens completely independently 
for each direction, and it's always initiated by the sending side, independently 
of Initiator and Responder roles (actually, these roles are significant only up 
to the moment when the connection is considered established).

The decision as to when exactly to perform particular activities belonging to 
the key refreshing process is made when the **number of sent packets** exceeds
a certain value (up to the moment of the connection or previous refresh), which 
is controlled by the `SRTO_KMREFRESHRATE` and `SRTO_KMPREANNOUNCE` options:

1. Pre-announce: when # of sent packets > `SRTO_KMREFRESHRATE - SRTO_KMPREANNOUNCE`
2. Key switch: when # of sent packets > `SRTO_KMREFRESHRATE`
3. Decommission: when # of sent packets > `SRTO_KMREFRESHRATE + SRTO_KMPREANNOUNCE`

In other words, `SRTO_KMREFRESHRATE` is the exact number of transmitted packets
for which a key switch happens. The Pre-announce happens `SRTO_KMPREANNOUNCE`
packets earlier, and Decommission happens `SRTO_KMPREANNOUNCE` packets later.
The `SRTO_KMPREANNOUNCE` value serves as an intermediate delay to make sure
that from the moment of switching the keys the new key is deployed on the 
Receiver, and that the old key is not decommissioned until the last
packet encrypted with that key is received.

The following activities occur when keys are refreshed:

1. **Pre-announce:** The new key is generated and sent to the Receiver
using the SRT Extended Message `SRT_CMD_KMREQ`. The received key is deployed
into the Receiver Crypto. The Receiver sends back the same message through
`SRT_CMD_KMRSP` as a confirmation that the refresh was successful (if it wasn't,
the message contains an error code).

2. **Key Switch:** The Encryption Flags in the `PH_MSGNO` field get toggled
between `EK_EVEN` and `EK_ODD`. From this moment on, the opposite (newly
generated) key is used.

3. **Decommission:** The old key (the key that was used with the previous flag
state) is decommissioned on both the Sender and Receiver sides. The place
for the key remains open for future key refreshing.

**NOTE** The handlers for `KMREQ` and `KMRSP` are the same for handling the 
request coming through an SRT Extended Message and through the handshake 
extension blocks, except that in case of the SRT Extended Message only one 
direction (forward KMX) is updated. HSv4 relies only on these messages, so 
there's no difference between initial and refreshed KM exchange. In HSv5 the 
initial KM exchange is done within the handshake in both directions, and then 
the key refresh process is started by the Sender and it updates the key for one
direction only.

[Return to top of page](#srt-handshake)


#### Congestion controller

This is a feature supported by HSv5 only. This adds functionality that has
existed in UDT as "Congestion control class", but implemented with SRT
workflows and requirements in mind. In SRT, the congestion control
mechanism must be set the same on both sides and is identified by
a character string. The extension type is set to `SRT_CMD_CONGESTION`.

The extension block contains the length of the content in 4-byte words.
The content is encoded as a string extended to full 4-byte chunks with
padding NUL characters if needed, and then inverted on each 4-byte mark.
For example, a "STREAM" string would be extended to `STREAM@@` and then
inverted into `ERTS@@MA` (where `@` marks the NUL character).

The value is a string with the name of the SRT Congestion Controller type. The
default one is called "live". The SRT 1.3.0 version contains an additional
optional Congestion Controller type called "file". Within the "file" Congestion
Controller it is possible to designate a stream mode and a message mode (the
"live" one may only use the message mode, with one message per packet).

This extension is optional and when not present the "live" Congestion Controller
is assumed. For an HSv4 party, which doesn't support this feature, it is always
the case.

The "file" type reintroduces the old UDT features for stream transmission 
(together with the `SRT_OPT_STREAM` flag) and messages that can span multiple UDP 
packets. The Congestion Controller controls the way the transmission is
handled, how various transmission settings are applied, and how to handle any
special phenomena that happen during transmission.

The "file" Congestion Controller is based completely on the original `CUDTCC`
class from UDT, and the rules for congestion control are completely copied from
there. However, it contains many changes and allows the selection of the
original UDT code in places that have been modified in SRT to support live
transmission.

[Return to top of page](#srt-handshake)


#### Stream ID (SID)

This feature is supported by HSv5 only. Its value is a string of
the user's choice that can be passed from the Caller to the Listener. The
symbol for this extension is `SRT_CMD_SID`.

The extension block for this extension is encoded the same way as described
for Congestion Controler above.

The Stream ID is a string of up to 512 characters that a Caller can pass to a
Listener (it's actually passed from an Initiator to a Responder in general, but
in Rendezvous mode this feature doesn't make sense). To use this feature, an
application should set it on a Caller socket using the `SRTO_STREAMID` option.
Upon connection, the accepted socket on the Listener side will have exactly the
same value set, and it can be retrieved using the same option. For more details
about the prospective use of this option, please refer to the
[API description document](API.md) and [SRT Access Control guidelines](AccessControl.md).


[Return to top of page](#srt-handshake)