WebRTC for the Streamer

We wrote this site to explain and share the ways we believe WebRTC (WHIP/WHEP) can improve streaming.

These things are available in other protocols (SRT, MoQ, RIST). The protocol itself isn't the important part. We would just like streamers to know what is possible. These improvements can change so many things around streaming for the better.

WebRTC for the Streamer was written by the developers who added WHIP/WebRTC support to OBS and maintain Broadcast Box. A companion piece to this site is WebRTC for the Curious. The source for this site lives on GitHub.

Streaming can be better... Read, Watch or Try it Now!

Watch

Read

Speed

WebRTC has ~200ms of latency. Interact with your viewers and friends like a video call. Everything is more connected and fun.

Privacy

WebRTC has End-to-End Encryption (E2E). When enabled, servers can't watch or tamper with your stream. Only end users can decrypt it.

Everyone Streams

WebRTC can publish from a web browser directly, no download necessary. That makes it easier to convince a friend to stream, or to bring in entirely new kinds of people who couldn't stream before.

Higher Quality

WebRTC supports modern audio and video codecs like AV1, which can deliver about 30% better quality at the same bitrate.

Self Hosted

WebRTC uses one protocol for publishing and playback. Since WebRTC is widely used outside of streaming, lots of open source servers already exist.

Stream Anywhere

Stream on cellular, satellite, or bad Wi-Fi. WebRTC is designed to adapt to changing network conditions.

Speed

With WebRTC you get sub-500ms latency, which gives you the experience of a video call. Latency this low can change the dynamics of streaming.

Streaming together

Streaming to a private group of friends is more connected when the latency is lower. It's a lot of fun to recreate the "sitting on the couch together" experience when you stream gameplay or a movie to your friends.

Co-streaming to an audience

When co-streaming to an audience you want the lowest latency possible. It allows you to have authentic conversations with the other streamer, instead of an awkward back and forth. High latency leads to desync between your gameplay and your partner's. It is confusing as a viewer to see events happening at different times on the two feeds.

Audience interaction

WebRTC allows you to respond to chat like a real conversation. It feels like a more connected, human experience to talk with people directly instead of responding seconds later. The audience interaction doesn't have to be text only. Some games allow the audience to change the game environment itself. Seeing it react instantly when they press the button is kind of magical.

Privacy

WebRTC provides APIs that let broadcasters encrypt media and viewers decrypt it, so the server has no access to the video. The server can still support many different types of clients thanks to simulcast.

            flowchart LR
                Broadcaster[Broadcaster]
                Server[WebRTC Server]
                ViewerA[Viewer A]
                ViewerB[Viewer B]

                Broadcaster <-->|P2P key exchange| ViewerA
                Broadcaster <-->|P2P key exchange| ViewerB
                Broadcaster -->|Encrypted media| Server
                Server -->|Encrypted media| ViewerA
                Server -->|Encrypted media| ViewerB

Self Hosted

WebRTC has quite a few self-hosting options. This has happened for a few reasons.

Wide usage outside of streaming

WebRTC is widely used outside of broadcasting. It is used for robotics, conferencing, "AI voice assistants" and more. So it can benefit from the ecosystem that existed before WebRTC broadcasting.

One protocol for publish+playback

If you are using RTMP you have to use another protocol for playback (usually HLS/DASH). With WHIP and WHEP you can use WebRTC for both, which means fewer moving parts to run.
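
As a rough sketch of what "one protocol" looks like on the wire: a WHIP publish begins with a single HTTP POST that carries an SDP offer, and WHEP playback begins the same way against a playback endpoint. The Python below only builds that request; it does not send it or create a real WebRTC session. The build_whip_request helper, the placeholder SDP and the stream key are illustrative assumptions. The endpoint URL is the public Broadcast Box instance used elsewhere on this page.

```python
import urllib.request

WHIP_ENDPOINT = "https://b.siobud.com/api/whip"  # public Broadcast Box instance


def build_whip_request(endpoint: str, stream_key: str, sdp_offer: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP POST that starts a WHIP session."""
    return urllib.request.Request(
        endpoint,
        data=sdp_offer.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/sdp",
            "Authorization": f"Bearer {stream_key}",
        },
    )


# Placeholder values: a real offer comes from a WebRTC stack such as OBS or a browser.
request = build_whip_request(WHIP_ENDPOINT, "example-stream-key", "v=0\r\n")
print(request.get_method(), request.full_url)
```

A server that accepts the offer replies with 201 Created, an SDP answer in the body and a Location header that identifies the session. Sending an HTTP DELETE to that location ends the stream.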

Cheaper to run/no transcoding

A WebRTC server just forwards media packets instead of transcoding the stream. It's a lot easier to deploy/manage/scale because of this.

Flexible topologies (P2P and Mesh)

WebRTC isn't limited to client-server. You can connect viewers directly (P2P) or in a Mesh. This makes self-hosting easier and cheaper since you don't always need a powerful central server to distribute media.

            graph LR
                A[OBS] --> B[Browser]

            graph LR
                A[OBS] --> B[User A]
                B --> C[User B]
                B --> D[User C]
                D --> E[User E]
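
To make the cost argument concrete, the sketch below counts how many copies of the stream each participant uploads in the relay tree drawn above. The upload_fanout helper and the dictionary layout are assumptions made for illustration, not part of any WebRTC API.

```python
from collections import deque

# Relay tree from the diagram above: each key forwards the stream to its children.
RELAY_TREE = {
    "OBS": ["User A"],
    "User A": ["User B", "User C"],
    "User C": ["User E"],
}


def upload_fanout(tree: dict[str, list[str]], source: str) -> dict[str, int]:
    """Return how many copies of the stream each reachable node uploads."""
    fanout: dict[str, int] = {}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        children = tree.get(node, [])
        fanout[node] = len(children)
        queue.extend(children)
    return fanout


fanout = upload_fanout(RELAY_TREE, "OBS")
print(fanout)
```

In this layout OBS uploads a single copy no matter how many viewers join. The distribution work is spread across the viewers that relay the stream.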

Everyone Streams

Streaming from the browser increases accessibility. The video quality/composition won't be as good, but these voices are important.

Everyone can broadcast

Streaming today requires that you install dedicated software. When configuring that software you have to be aware of things like bitrate and codecs, and watch your resource usage. Broadcasting from the browser significantly lowers the barrier to entry for streaming. So many new voices and types of streams will be available when it is opened to more people.

Browser is everywhere

A web browser is available everywhere: phones, TVs, tablets, smart cars and more. This allows you to broadcast from all these places where it wasn't possible before. Many people also use computers where they aren't able to install additional software. It would be great to enable them to stream even if they don't have root access to the machine.

Stream Anywhere

WebRTC gives a lot of flexibility in how you can stream. You can configure it to have the lowest latency possible (at the expense of video quality) or you can run it over TCP and have perfect video quality but higher latency. These are some of the knobs that WebRTC gives you.

Protocol choice (TCP or UDP)

WebRTC allows you to choose per session whether you want TCP or UDP. If you pick TCP, lost packets are retransmitted by the transport, so the stream sees no packet loss, but you may experience higher delay. If you pick UDP you get more control over the experience. You can use things like FEC and NACK to compensate for a poor network while still keeping the lowest latency possible.

Sender driven bandwidth estimation

With WebRTC a broadcaster can dynamically change bitrate if needed. WebRTC has a mechanism built into the protocol that constantly measures packet loss and delivery time (RFC 8888). This means that instead of setting a static bitrate you can adjust it dynamically to get the best experience possible for your network and hardware.

This is a simplified example. Broadcasting software starts at 1080p and tries to upgrade to 2160p. If that results in a bad experience it drops back to 1080p. WebRTC provides the receiver feedback needed to make these decisions.

            sequenceDiagram
                OBS->>Server: Sending 1080p
                Server-->>OBS: Zero Packet Loss, 50ms trip time
                OBS->>Server: Sending 2160p
                Server-->>OBS: Packet Loss, 150ms trip time
                OBS->>Server: Sending 1080p
                Server-->>OBS: Zero Packet Loss, 50ms trip time
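
The decision loop in the diagram can be sketched in a few lines of Python. This is a toy model under stated assumptions: the Feedback fields, the loss and round-trip thresholds, and the two-step quality ladder are all hypothetical and are not taken from OBS or from RFC 8888.

```python
from dataclasses import dataclass

# Quality ladder from the example above, lowest first.
LADDER = ["1080p", "2160p"]


@dataclass
class Feedback:
    """Receiver report for the last interval (simplified, hypothetical fields)."""
    loss_fraction: float  # 0.0 means no packets were lost
    round_trip_ms: float


def next_level(current: int, feedback: Feedback,
               max_loss: float = 0.02, max_rtt_ms: float = 100.0) -> int:
    """Step down on a bad report, otherwise probe one level up."""
    congested = feedback.loss_fraction > max_loss or feedback.round_trip_ms > max_rtt_ms
    if congested:
        return max(current - 1, 0)
    return min(current + 1, len(LADDER) - 1)


# Replay the reports from the diagram: good, bad (after upgrading), good.
level = LADDER.index("1080p")
history = [LADDER[level]]
for report in [Feedback(0.0, 50), Feedback(0.10, 150), Feedback(0.0, 50)]:
    level = next_level(level, report)
    history.append(LADDER[level])
print(history)
```

A real sender would usually adjust the encoder bitrate in small steps and wait before probing upward again, rather than jumping between resolutions on every report.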

Forward error correction (FEC)

Forward error correction sends redundant or duplicated information ahead of time, so the receiver can repair packet loss without waiting for a resend. It consumes extra bandwidth, but it is a great solution if you are running over satellite or cellular and have bandwidth available but are combating packet loss.

            flowchart TD
                subgraph Sender
                    A[Video Frame A]
                    B[Video Frame B]
                    C[Video Frame C]
                    A2[Video Frame A]
                    B2[Video Frame B]
                    C2[Video Frame C]
                end
                subgraph Receiver
                    ARecv[Video Frame A]
                    BRecv[Video Frame B]
                    CRecv[Video Frame C]

                end
                I((Internet))
                Sender-->I
                I-->Receiver


                style A fill:red
                style B2 fill:red
                style C fill:red
                style ARecv fill:green
                style BRecv fill:green
                style CRecv fill:green
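
The diagram shows the simplest form of redundancy, sending every frame twice. The sketch below uses a different and cheaper technique, XOR parity, where one extra packet protects a whole group and can rebuild any single lost packet in it. The function names and toy payloads are assumptions for illustration, and the payloads are padded to equal length to keep the example short.

```python
from typing import Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: list[bytes]) -> bytes:
    """Build one parity packet that protects a group of equal-length packets."""
    parity = bytes(len(packets[0]))  # all zero bytes
    for packet in packets:
        parity = xor_bytes(parity, packet)
    return parity


def recover_missing(received: list[Optional[bytes]], parity: bytes) -> list[Optional[bytes]]:
    """Rebuild a group in which at most one packet (marked None) was lost."""
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) > 1:
        raise ValueError("one parity packet can only repair a single loss")
    repaired = list(received)
    if missing:
        rebuilt = parity
        for packet in received:
            if packet is not None:
                rebuilt = xor_bytes(rebuilt, packet)
        repaired[missing[0]] = rebuilt
    return repaired


group = [b"frameA__", b"frameB__", b"frameC__"]  # toy payloads of equal length
parity = make_parity(group)
after_loss = [group[0], None, group[2]]  # packet B is lost in transit
print(recover_missing(after_loss, parity))
```

The trade-off is the one described above. Parity costs one extra packet per group instead of doubling the stream, but it can only repair one loss per group.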

Negative-acknowledgement (NACK)

NACK is another error correction technique. Instead of sending duplicated data ahead of time the receiver asks for missing packets again. This is a good fit when packet loss is small and the stream still has time to repair the frame. It uses less bandwidth than FEC, but if the network is already too delayed the resent packet might arrive too late to be useful.

            sequenceDiagram
                participant OBS
                participant Server
                OBS->>Server: Sending packet 101
                Note over OBS,Server: Packet 102 is lost
                OBS->>Server: Sending packet 103
                Server-->>OBS: NACK packet 102
                OBS->>Server: Re-send packet 102
                Server->>Server: Build video frame from 101, 102, 103
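
The exchange in the diagram can be sketched as follows. The receiver spots a gap in the sequence numbers and asks for the missing packet, and the sender answers from a buffer of recently sent packets. The Sender class and helper names are hypothetical. The sketch also ignores sequence number wraparound, which a real implementation has to handle.

```python
def missing_sequence_numbers(received: list[int]) -> list[int]:
    """Return sequence numbers skipped between the lowest and highest seen."""
    seen = set(received)
    return [seq for seq in range(min(received), max(received) + 1) if seq not in seen]


class Sender:
    """Keeps recently sent packets so they can be re-sent on request."""

    def __init__(self) -> None:
        self.history: dict[int, bytes] = {}

    def send(self, seq: int, payload: bytes) -> tuple[int, bytes]:
        self.history[seq] = payload
        return seq, payload

    def handle_nack(self, seqs: list[int]) -> list[tuple[int, bytes]]:
        # Only packets still in the history buffer can be retransmitted.
        return [(seq, self.history[seq]) for seq in seqs if seq in self.history]


sender = Sender()
sent = [sender.send(seq, f"packet {seq}".encode()) for seq in (101, 102, 103)]
arrived = dict(p for p in sent if p[0] != 102)  # packet 102 is lost in transit

nack = missing_sequence_numbers(list(arrived))
arrived.update(sender.handle_nack(nack))
print(nack, sorted(arrived))
```

The history buffer is what limits NACK in practice. Once a packet has aged out of it, or the frame's deadline has passed, a resend no longer helps.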

Mobility (ICE renomination)

Switching between Wi-Fi and cellular used to require a full reconnect of the stream. With WebRTC you can switch networks without disconnecting anything. The same ability makes administration easier: servers can be updated or restarted without requiring users to fully disconnect.

Connection bonding

WebRTC allows accepting video from multiple sources. You can combine multiple 5G/Wi-Fi interfaces and send video over them. This technique is niche, but it opens up lots of interesting options. You can send your most important video feed over your most stable interface. Some users also send their more latency-sensitive media (audio) over one interface while using the others for video.

Higher Quality

WebRTC does not hardcode video codecs into the protocol. Codecs are negotiated at runtime, so other codecs (like HEVC and AV1) can be trivially added. AV1 is designed to deliver about 30% better quality at the same bitrate, so a 6 Mbps stream can look noticeably better without requiring more upload bandwidth. Custom codecs can also be added if both client and server support them, so you can do custom things if you need to.

Simulcast

When streaming video you will need to support different types of users. A phone on 5G works best with 1080p while a desktop computer on fiber can support 2160p. With simulcast the broadcaster generates and uploads all quality levels. The server then forwards the video feeds appropriately.

            flowchart LR
                A[OBS/FFmpeg]
                B[Server]
                C[Viewer]
                D[Viewer]
                E[Viewer]
                A --> |2160p|B
                A --> |1440p|B
                A --> |1080p|B
                B --> |2160p|C
                B --> |1440p|D
                B --> |1080p|E

            linkStyle 0,3 stroke: red
            linkStyle 1,4 stroke: green
            linkStyle 2,5 stroke: blue
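
A minimal sketch of the forwarding decision on the server, assuming it has a bandwidth estimate for each viewer. The layer bitrates, viewer names and the pick_layer helper are hypothetical values chosen for illustration.

```python
# Hypothetical bitrates for the three layers in the diagram, in kilobits per second.
LAYERS_KBPS = {"2160p": 16000, "1440p": 9000, "1080p": 6000}


def pick_layer(viewer_kbps: int, layers: dict[str, int] = LAYERS_KBPS) -> str:
    """Pick the highest-bitrate layer that fits the viewer's estimated bandwidth."""
    fitting = {name: rate for name, rate in layers.items() if rate <= viewer_kbps}
    if not fitting:
        # Nothing fits: fall back to the smallest layer rather than sending nothing.
        return min(layers, key=layers.get)
    return max(fitting, key=fitting.get)


viewers = {"fiber desktop": 50000, "home Wi-Fi": 10000, "phone on 5G": 7000}
for name, bandwidth in viewers.items():
    print(name, "->", pick_layer(bandwidth))
```

The server never touches the video itself here. It only chooses which of the already encoded feeds to forward to each viewer.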

Traditionally with broadcast software you would upload the 2160p feed, and the server would transcode down to the other layers. WebRTC's approach to this problem has a few benefits.

Better quality

A 1080p video stream generated via transcoding will not have the same quality as one generated in OBS directly. Transcoding suffers from generational loss. When video is decoded and re-encoded, additional compression artifacts and loss of detail are introduced. With simulcast each layer is encoded only once, straight from the source.

More control

When doing transcodes you only control the encoding quality of one stream. With simulcast you can ensure all your video streams are high quality.

Lower latency

Transcoding adds additional latency. Running an additional decode + encode for your transcoded feeds means they will have a different latency than your uploaded feed. With simulcast all your feeds run at the same latency.

Simpler servers

Running servers without transcoding is much easier. Transcoding requires a lot of computing power, while simulcast just means extra upload traffic.

No tampering/ad insertion

Simulcast means your video feeds are passed through untampered. If the server is re-encoding your video it can insert watermarks or ads, or modify the video in unexpected ways. With E2E encryption, broadcasters can even ensure that the server can't decode the video at all. The server will see a stream of bytes pass through, but will not be able to watch the video.

Try it now

This test stream uses the public Broadcast Box instance at b.siobud.com. Publish from this browser or OBS, then watch the WebRTC playback directly on this page or on b.siobud.com.

Publish

Service: WHIP

Server: https://b.siobud.com/api/whip

Bearer Token: the stream key generated for you by this page

For lower latency in OBS, see Broadcast Box's OBS broadcasting notes for custom encoder settings.

Watch

Watch on b.siobud.com

How can I use WHIP/WHEP?

Clients

OBS Studio - desktop WHIP publishing.
FFmpeg - command-line WHIP publishing with the WHIP muxer.
GStreamer whipclientsink - pipeline-based WHIP publishing.
GStreamer whepclientsrc - pipeline-based WHEP playback.
Larix Broadcaster - mobile WHIP publishing from iOS and Android.
Eyevinn WHIP web client - browser-based WHIP publishing SDK.
whip-go - Go WHIP client and command-line publisher.
VDO.Ninja WHIP/WHEP tool - browser WHIP publishing and WHEP playback.
Meetecho Simple WHIP Client - native GStreamer-based WHIP test client.
Meetecho Simple WHEP Client - native GStreamer-based WHEP test client.

Servers

Broadcast Box - simple WHIP ingest and WHEP playback for self-hosted broadcasts.
MediaMTX - live media router with WHIP publishing and WHEP playback.
OvenMediaEngine - streaming server with WebRTC/WHIP ingest.
SRS - real-time media server with WHIP and WHEP HTTP APIs.
LiveKit Ingress - WHIP and RTMP ingress service for publishing into LiveKit rooms.
Janus Simple WHIP Server - WHIP server library backed by Janus.
Eyevinn WHIP - WHIP endpoint and browser client modules.
Nimble Streamer - software media server with WHIP ingest and WHEP playback.
