RTP

RTP (Real-Time Transport Protocol) is the protocol responsible for delivering real-time audio, video, and other multimedia data over IP networks. While SIP handles the signaling (setting up, managing, and tearing down communication sessions), RTP carries the media itself (such as the voice in a VoIP call or the video in a video call) between the communicating endpoints during a SIP session.

How RTP Works with SIP

  1. Session Establishment:
    • SIP negotiates and sets up a session between two or more endpoints, agreeing on the media types (audio, video, etc.) and codecs to be used.
    • During this negotiation, SIP uses SDP (Session Description Protocol) to exchange media-related information, including the ports and IP addresses where each endpoint should send and receive RTP streams.
  2. Media Transmission:
    • Once the SIP session is established, RTP takes over for the actual transmission of the real-time media streams.
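    • RTP carries the encoded media packets between the endpoints, normally over UDP. The protocol does not fix the ports: each endpoint advertises them in SDP, and many implementations allocate them from a configured range such as 16384–32767.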
    • Each media stream (audio, video) is encapsulated in its own RTP stream. For instance, in a video call, there would be one RTP stream for audio and another for video.
  3. Synchronization and Sequencing:
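    • RTP gives the receiver what it needs to restore the correct ordering and timing of media packets, even when the network reorders them or delivers them with variable delay (jitter).
    • RTP headers include a sequence number, used to detect loss and reorder packets, and a timestamp, used to schedule playout. Synchronizing audio with video additionally relies on RTCP sender reports, which map each stream's RTP timestamps to a common clock.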
  4. Payload Types:
    • RTP supports multiple codecs, and each packet carries a payload type identifier that indicates which codec is in use (e.g., G.711 for audio, H.264 for video). The mapping between payload type numbers and codecs is agreed on in the SDP exchanged during SIP session negotiation.
  5. QoS and Real-Time Characteristics:
    • RTP does not itself guarantee Quality of Service (QoS). Its sequence numbers and timestamps let receivers measure loss and jitter, and RTCP reports those measurements back to the sender. Unlike TCP, RTP does not guarantee delivery; it is designed for real-time transmission, prioritizing low latency over packet recovery.
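
The sequence number, timestamp, and payload type described above all sit in the 12-byte fixed RTP header defined in RFC 3550. The following is a minimal sketch of reading those fields and putting packets back in order, not a full RTP stack. The names `RtpHeader`, `parse_rtp_header`, and `build_rtp_packet` are hypothetical helpers introduced only for illustration, and the reordering step assumes the 16-bit sequence number does not wrap around within the batch.

```python
import struct
from dataclasses import dataclass


@dataclass
class RtpHeader:
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: 2 single bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


def build_rtp_packet(payload_type: int, seq: int, timestamp: int,
                     ssrc: int, payload: bytes = b"") -> bytes:
    """Build a version-2 packet with no padding, extension, or CSRCs."""
    b0 = 2 << 6
    b1 = payload_type & 0x7F
    header = struct.pack("!BBHII", b0, b1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


# Three G.711 mu-law packets (static payload type 0) arriving out of order.
# At an 8000 Hz clock, 160 timestamp units correspond to 20 ms of audio.
arrived = [
    build_rtp_packet(0, seq, seq * 160, 0x1234ABCD, b"\x00" * 160)
    for seq in (3, 1, 2)
]

# Simplification: a plain sort, which ignores sequence-number wraparound.
ordered = sorted((parse_rtp_header(p) for p in arrived),
                 key=lambda h: h.sequence_number)

for h in ordered:
    print(h.sequence_number, h.timestamp, h.payload_type)
```

A real receiver would typically hold packets in a jitter buffer for a short time rather than sorting a finished batch, and would handle the sequence number wrapping from 65535 back to 0.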

RTP Control Protocol (RTCP)

Alongside RTP runs RTCP (RTP Control Protocol), which provides feedback on the quality of the media transmission. RTCP lets participants monitor metrics such as packet loss, jitter, and round-trip time.

RTCP doesn’t carry media but instead sends statistics about the quality of the RTP stream, allowing for real-time adjustments to improve media quality.
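
To make the statistics concrete, here is a minimal sketch of two values a receiver could compute for an RTCP report: the interarrival jitter estimate, following the running formula in RFC 3550 (section 6.4.1), and a cumulative packet-loss count. The function names `update_jitter` and `packets_lost` are hypothetical, arrival times are assumed to be already expressed in RTP timestamp units, and sequence-number wraparound is ignored for brevity.

```python
from typing import Optional, Tuple


def update_jitter(jitter: float, prev_transit: Optional[int],
                  arrival: int, rtp_timestamp: int) -> Tuple[float, int]:
    """Update the RFC 3550 interarrival jitter estimate for one packet.

    Both arrival and rtp_timestamp are in RTP timestamp units.
    Returns the new jitter estimate and this packet's transit time.
    """
    transit = arrival - rtp_timestamp
    if prev_transit is None:
        # First packet: nothing to compare against yet.
        return jitter, transit
    d = abs(transit - prev_transit)
    # Exponential smoothing with gain 1/16, as specified in RFC 3550.
    return jitter + (d - jitter) / 16.0, transit


def packets_lost(first_seq: int, highest_seq: int, received: int) -> int:
    """Expected packets minus received packets (no wraparound handling)."""
    expected = highest_seq - first_seq + 1
    return expected - received


# (arrival time, RTP timestamp) pairs; the third packet arrives 20 units late.
samples = [(1000, 0), (1160, 160), (1340, 320), (1480, 480)]

jitter, transit = 0.0, None
for arrival, rtp_ts in samples:
    jitter, transit = update_jitter(jitter, transit, arrival, rtp_ts)

print(jitter)
print(packets_lost(first_seq=100, highest_seq=109, received=8))
```

The 1/16 smoothing gain means a single late packet moves the estimate only slightly, so the reported jitter reflects sustained variation rather than one-off spikes.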

Key Points

  • SIP establishes the session, while RTP transports the media.
  • RTP normally runs over UDP to keep latency low, which is crucial for real-time applications like VoIP and video calls. SIP signaling, by contrast, can run over UDP, TCP, or SCTP.
  • RTCP provides quality feedback, helping optimize the media delivery experience.
  • RTP headers carry the sequence numbers, timestamps, and payload types that let receivers keep media streams ordered, synchronized, and correctly decoded.