The RTP Control Protocol (RTCP) is a sister protocol of the Real-time Transport Protocol (RTP). Its basic functionality and packet structure are defined in RFC 3550. RTCP provides out-of-band statistics and control information for an RTP session. It partners with RTP in the delivery and packaging of multimedia data, but does not transport any media data itself. The primary function of RTCP is to provide feedback on the quality of service in media distribution by periodically sending statistics such as transmitted octet and packet counts, packet loss, packet delay variation, and round-trip delay time to participants in a streaming multimedia session. An application may use this information to control quality of service parameters, perhaps by limiting flow or by using a different codec.
Protocol functions
Typically RTP will be sent on an even-numbered UDP port, with RTCP messages being sent over the next higher odd-numbered port. RTCP itself does not provide any flow encryption or authentication methods. Such mechanisms may be implemented, for example, with the Secure Real-time Transport Protocol defined in RFC 3711. RTCP provides basic functions expected to be implemented in all RTP sessions:
The primary function of RTCP is to gather statistics on quality aspects of the media distribution during a session and transmit this data to the session media source and other session participants. Such information may be used by the source for adaptive media encoding and detection of transmission faults. If the session is carried over a multicast network, this permits non-intrusive session quality monitoring.
RTCP provides canonical end-point identifiers to all session participants. Although a source identifier of an RTP stream is expected to be unique, the instantaneous binding of source identifiers to end-points may change during a session. The CNAME establishes unique identification of end-points across an application instance and for third-party monitoring.
RTCP provides session control functions. It is a convenient means to reach all session participants, whereas RTP itself is not, because RTP is transmitted only by a media source.
RTCP reports are expected to be sent by all participants, even in a multicast session which may involve thousands of recipients. Such traffic will increase proportionally with the number of participants. Thus, to avoid network congestion, the protocol must include session bandwidth management. This is achieved by dynamically controlling the frequency of report transmissions. RTCP bandwidth usage should generally not exceed 5% of total session bandwidth. Furthermore, 25% of the RTCP bandwidth should be reserved for media sources at all times, so that in large conferences new participants can receive the CNAME identifiers of the senders without excessive delay. The RTCP reporting interval is randomized to prevent unintended synchronization of reporting. The recommended minimum RTCP report interval per station is 5 seconds, so stations should not transmit RTCP reports more often than that.
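The bandwidth rules above can be sketched as a small interval calculation. This is a simplified illustration loosely modeled on the interval computation described in RFC 3550, not a complete implementation: the function and parameter names are assumptions, and details such as timer reconsideration and the running average of RTCP packet size are left out.

```python
import math
import random


def rtcp_interval(members, senders, session_bw_bps, we_sent,
                  avg_rtcp_size_bytes, initial=False, rng=random):
    """Return a randomized RTCP report interval in seconds (simplified)."""
    # RTCP is limited to 5% of the session bandwidth (converted to bytes/s).
    rtcp_bw = 0.05 * session_bw_bps / 8.0
    # Minimum interval: 5 s, halved before the first report is sent.
    t_min = 2.5 if initial else 5.0

    n = members
    # When senders are at most a quarter of the membership, they get a
    # dedicated 25% share of the RTCP bandwidth; receivers share the rest.
    if senders <= members * 0.25:
        if we_sent:
            rtcp_bw *= 0.25
            n = senders
        else:
            rtcp_bw *= 0.75
            n = members - senders

    # Deterministic interval: time for n participants to each send one
    # average-sized report within the available RTCP bandwidth.
    t_det = max(t_min, n * avg_rtcp_size_bytes / rtcp_bw)

    # Randomize to avoid synchronized reports, then apply the
    # compensation factor (e - 1.5) used by RFC 3550.
    t = t_det * rng.uniform(0.5, 1.5)
    return t / (math.e - 1.5)


if __name__ == "__main__":
    # Hypothetical numbers: one sender, 4 Mbit/s session, 100-byte reports.
    for members in (4, 1000, 10000):
        t = rtcp_interval(members, 1, 4_000_000, False, 100)
        print(f"{members:>6} members: next report in about {t:.1f} s")
```

With a fixed bandwidth share, the deterministic part of the interval grows linearly with the number of receivers once it exceeds the 5-second minimum, which is why very large sessions report so rarely.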
Packet header
Version: Identifies the version of RTP, which is the same in RTCP packets as in RTP data packets. The version defined by this specification is two.
P: Indicates whether there are extra padding bytes at the end of the RTCP packet. Padding might be used to fill up a block of a certain size, for example as required by an encryption algorithm. The last byte of the padding contains the number of padding bytes that were added.
RC : The number of reception report blocks contained in this packet. A value of zero is valid.
PT : Contains a constant to identify RTCP packet type.
Length: Indicates the length of this RTCP packet in 32-bit words minus one, including the header and any padding.
SSRC: The synchronization source identifier, which uniquely identifies the source of a stream.
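The header fields listed above can be decoded with a few bit operations. The following is a minimal sketch, assuming the RFC 3550 layout of a 2-bit version, 1-bit padding flag, 5-bit count, 8-bit packet type, and 16-bit length, followed by a 32-bit SSRC as in sender and receiver reports. The function name, the returned dictionary keys, and the `PACKET_TYPES` table are illustrative choices, not part of any library.

```python
import struct

# Packet type values assigned by RFC 3550.
PACKET_TYPES = {200: "SR", 201: "RR", 202: "SDES", 203: "BYE", 204: "APP"}


def parse_rtcp_header(data: bytes) -> dict:
    """Decode the first 8 bytes of an RTCP packet (simplified)."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes")
    # Network byte order: two single bytes, a 16-bit length, a 32-bit SSRC.
    first, pt, length_words, ssrc = struct.unpack("!BBHI", data[:8])
    return {
        "version": first >> 6,            # top two bits
        "padding": bool(first & 0x20),    # P bit
        "count": first & 0x1F,            # RC: low five bits
        "packet_type": PACKET_TYPES.get(pt, pt),
        # The length field counts 32-bit words minus one.
        "length_bytes": (length_words + 1) * 4,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # Hand-built receiver report header with one report block (32 bytes).
    header = bytes([0x81, 201, 0x00, 0x07]) + (0x12345678).to_bytes(4, "big")
    print(parse_rtcp_header(header))
```

A real parser would also check that the version is 2 and that the declared length does not exceed the received datagram before reading further.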
Message types
RTCP distinguishes several types of packets: sender report, receiver report, source description, and goodbye. In addition, the protocol is extensible and allows application-specific RTCP packets. A standards-based extension of RTCP is the extended report packet type introduced by RFC 3611.
Sender report: The sender report is sent periodically by the active senders in a conference to report transmission and reception statistics for all RTP packets sent during the interval. The sender report includes an absolute timestamp in NTP format, which counts the seconds elapsed since midnight on January 1, 1900. The absolute timestamp allows the receiver to synchronize RTP messages. It is particularly important when both audio and video are transmitted simultaneously, because audio and video streams use independent relative timestamps.
Receiver report: The receiver report is for passive participants, those that do not send RTP packets. The report informs the sender and other receivers about the quality of service.
Source description: The source description message is used to send the CNAME item to session participants. It may also be used to provide additional information such as the name, e-mail address, telephone number, and address of the owner or controller of the source.
Goodbye: A source sends a BYE message to shut down a stream. It allows an endpoint to announce that it is leaving the conference. Although other sources can detect the absence of a source, this message is a direct announcement. It is also useful to a media mixer.
Application-specific message: The application-specific message provides a mechanism to design application-specific extensions to the RTCP protocol.
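Because the sender report carries its absolute timestamp in NTP format, whose epoch is 1 January 1900, applications that work with Unix time (epoch 1 January 1970) need to convert between the two. The sketch below assumes the 64-bit NTP representation of 32 bits of seconds and 32 bits of fraction, and non-negative Unix times. The function names are illustrative.

```python
# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2_208_988_800


def ntp_to_unix(ntp_seconds: int, ntp_fraction: int) -> float:
    """Convert an NTP (seconds, fraction) pair to Unix time in seconds."""
    return (ntp_seconds - NTP_UNIX_OFFSET) + ntp_fraction / 2**32


def unix_to_ntp(unix_time: float):
    """Convert non-negative Unix time to an NTP (seconds, fraction) pair."""
    whole = int(unix_time)
    seconds = whole + NTP_UNIX_OFFSET
    # The fraction is expressed in units of 1 / 2**32 second.
    fraction = int((unix_time - whole) * 2**32)
    return seconds, fraction


if __name__ == "__main__":
    sec, frac = unix_to_ntp(10.5)
    print(sec, frac, ntp_to_unix(sec, frac))
```

A receiver can pair this wall-clock value with the RTP timestamp carried in the same sender report to line up audio and video streams that otherwise use unrelated timestamp bases.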
Scalability in large deployments
In large-scale applications, such as Internet Protocol Television (IPTV), very long delays between RTCP reports may occur because of the RTCP bandwidth control mechanism required to control congestion. Acceptable frequencies are usually less than one per minute. As a result, the statistics reported by a receiver may be out of date, and the media sender's evaluation may be inaccurate relative to the current state of the session. Methods have been introduced to alleviate these problems: RTCP filtering, RTCP biasing and hierarchical aggregation.
Hierarchical aggregation
Hierarchical aggregation is an optimization of the RTCP feedback model. Its aim is to raise the limit on the maximum number of users while preserving quality of service (QoS) measurement. The RTCP bandwidth is constant and takes just 5% of the session bandwidth. Therefore, the QoS reporting interval depends on, among other things, the number of session members, and for very large sessions it can become very long. However, the acceptable reporting interval is about 10 seconds. Longer intervals would make the reported status stale and inaccurate relative to the current state of the session, and any optimization made by the sender could even have a negative effect on network or QoS conditions. Hierarchical aggregation is used with Source-Specific Multicast, where only a single source is allowed, e.g. IPTV. Another type of multicast is Any-Source Multicast, but it is less suitable for large-scale applications with a huge number of users. Only the most modern IPTV systems use hierarchical aggregation.
Feedback Target
Feedback Target is a new type of member that was first introduced by the Internet Draft draft-ietf-avt-rtcpssm-13. The hierarchical aggregation method has extended its functionality. The function of this member is to receive Receiver Reports (RR) and retransmit summarized RR packets, so-called Receiver Summary Information, to a sender.
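The summarizing role of a Feedback Target can be illustrated with a toy aggregation. This is only a conceptual sketch: the `ReceiverReport` class and the `summarize` function are hypothetical, the chosen statistics are an assumption, and the actual Receiver Summary Information wire format defined in the draft is not reproduced here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ReceiverReport:
    """Hypothetical, simplified view of one receiver's report."""
    ssrc: int
    fraction_lost: float  # 0.0 .. 1.0
    jitter: float         # interarrival jitter, RTP timestamp units


def summarize(reports):
    """Collapse many receiver reports into one summary record."""
    if not reports:
        raise ValueError("no reports to summarize")
    losses = [r.fraction_lost for r in reports]
    jitters = [r.jitter for r in reports]
    return {
        "receivers": len(reports),
        "mean_fraction_lost": mean(losses),
        "max_fraction_lost": max(losses),
        "mean_jitter": mean(jitters),
    }


if __name__ == "__main__":
    reports = [
        ReceiverReport(ssrc=1, fraction_lost=0.0, jitter=10.0),
        ReceiverReport(ssrc=2, fraction_lost=0.5, jitter=30.0),
    ]
    print(summarize(reports))
```

The sender then receives one compact record per Feedback Target instead of one report per receiver, which keeps feedback traffic toward the source bounded as the audience grows.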
Standards documents
RFC 3550, Standard 64, RTP: A Transport Protocol for Real-Time Applications