ICE Archives • testRTC

Your best WebRTC debugging buddy? The webrtc-internals API trace

This time, we take you through the webrtc-internals API trace to see what can you learn from it.

To make this article as accurate as possible, I decided to go to my source of truth for the low level stuff related to WebRTC – Philipp Hancke, also known as fippo or hcornflower. This in a way, is a joint article we’ve put together.

Before we do, though, you should probably check out the other articles in this series:

Now back to the API trace.

WebRTC is asynchronous

Here’s something you probably already noticed. WebRTC is asynchronous to the extreme. It almost painstakingly makes sure that whatever you are trying to achieve – you won’t be able to without multiple calls in different contexts of your JavaScript app in the browser.

It isn’t because the authors of WebRTC are mean. It is because the nature of communications is asynchronous. It is made worse by the various network topologies that require the use of curses like STUN, TURN and ICE and by the fact that we require the user to authorize things like accessing his camera.

This brings us to the tricky situation of error handling. With WebRTC, it takes place everywhere. Anything you do can fail twice:

When you call the API and it returns
When the callback/promise/event handler/whatever returns back with the result of your API call

This means that in many cases, you are going to be left with a half baked solution that looks at some of the error cases (did you ever see a sample that takes care of edge cases or failure scenarios?).

It also means that often times you’ll need to be able to debug them. And that’s what the API trace in webrtc-internals can help you with.

The webrtc-internals API trace

If you open chrome://webrtc-internals while in an active WebRTC session, you will immediately see the API trace:

WebRTC API trace sample

This is the list of API calls and events done on the peer connection, informing you of the progress and state of the connection.

You can click on any of these APIs to see its parameters.

Before we look at what kind of analysis we can derive from these traces, let’s look at what some of the connection methods and events do.

addStream: if this method was called, the Javascript code has added a MediaStream to the peerconnection. You can see the id of the stream as well as the audio and video tracks. onAddStream shows a remote stream being added, including the audio and video track ids, it is called between the setRemoteDescription call and the setRemoteDescriptionOnSuccess callback
createOffer shows any calls to this API including the options such as offerToReceiveAudio, offerToReceiveVideo or iceRestart. createOfferOnSuccess shows the results of the createOffer call, including the type (which should be ‘offer’ obviously) and the SDP resulting from it. createOfferOnFailure could also be called indicating an error but that is quite rare
createAnswer and createAnswerOnSuccess and createAnswerOnFailure are similar but with no additional options
setLocalDescription shows you the type and SDP used in the setLocalDescription call. If you do any SDP munging between createOffer and setLocaldescription you will see this here. This results in either a setLocalDescriptionOnSuccess or setLocalDescriptionOnFailure callback which shows any errors. The same applies to the setRemoteDescription and its callbacks, setRemoteDescriptionOnSuccess and setRemoteDescriptionOnFailure
onRenegotiationNeeded is the old chrome-internal name for the onnegotiationneeded event. If your app uses this you might want to look for it
onSignalingStateChange shows the changes in the signaling state as a result of calls to setLocalDescription and setRemoteDescription. See the wonderful diagram in the specification for the gory details. At the end of the day, you will want to be in the stable state most of the time
iceGatheringStateChange is the little brother of the ice connection state. It will show you the state of the ice gatherer. It will change to gathering after setLocalDescription if there are ICE candidates to gather
onnicecandidate events show all candidates gathered, with information for which m-line and MID. Likewise, the addIceCandidate method shows that information from the other side. Typically you should see both event types. See below for a more detailed discussion of these events
oniceconnectionstate is one of the most important event handlers. It tells you whether a peer-to-peer connection succeeded or not. From here, you can start searching for the active candidate as we explained in the previous post

The two basic flows we can see are that of something offering connections and answering. The offerer case will typically consist of these events:

WebRTC API Trace - offer side

(addStream if the local side wants to send media)
createOffer
createOfferOnSuccess
setLocalDescription
setLocalDescriptionOnSuccess
setRemoteDescription
(onaddstream if the remote end signalled media streams in the SDP)
setRemoteDescriptionOnSuccess

While the answerer case will have:

WebRTC API Trace - answer side

setRemoteDescription
(onaddstream if the remote end signalled media streams in the SDP)
createAnswer
createAnswerOnSuccess
setLocalDescription
setLocalDescriptionOnSuccess

In both cases there should be a number of onicecandidate events and addIceCandidate calls along with signaling and ice connection state changes.

Let us look at two specific cases next.

Example #1 – My WebRTC app works locally but not on a different network!

This is actually one of the most frequent questions on the discuss-webrtc list or on stackoverflow. Most of the time the answer is “you need a TURN server” and “no, you can not use some TURN server credentials that you found somewhere on the internet”.

So it works locally. That means that you are creating an offer, sending it to the remote side, calling setLocalDescription() and are getting an answer that you feed into setRemoteDescription(). It also means that you are getting candidates in the onicecandidate() event, sending them to the remote side and getting candidates from there which you call the addIceCandidate() method with.

And locally you get a oniceconnectionstatechange() event to connected or completed:

WebRTC API trace - ICE state changes

Great! You probably just copied and pasted these pieces of code from somewhere on github.

Now… why does it not work when you’re on a different network? On different networks, you need both a STUN and a TURN server. Check if your app is using a STUN and TURN server and that you’re passing them correctly at the top of webrtc-internals:

NAT configuration in webrtc-internals

As you can see (assuming you have good eyes), there are a number of ice servers used here. In the case of our screenshot, it’s Google’s apprtc sample. There is a stun server, stun:stun.l.google.com:19302. There are also four TURN servers:

turn:64.233.165.127:19305?transport=udp
turn:[2A00:1450:4010:C01::7F]:19305?transport=udp
turn:64.233.165.127:443?transport=tcp
turn:[2A00:1450:4010:C01::7F]:443?transport=tcp

As you can see, apprtc uses TURN over both UDP and TCP and is running TURN servers for both IPv4 and IPv6.

Now just because you configured a TURN server does not mean there won’t be any errors. The TURN server might not be reachable. Or your credentials might not work (this will happen if you “found” the credentials on a list of “free public servers”). In order to verify that the STUN and TURN servers you use actually work you need to look at the onicecandidate() events.

If you use a STUN or a TURN server, you should see a onicecandidate() event with a candidate that has a ‘typ srflx’.

Similarly, if you use a TURN server, you need to check if you get an onicecandidate() event where the candidate has a ‘typ relay’.

Note that Chrome stops gathering candidates once it establishes a connection. But if your connection gets established you are probably not going to debug this issue.

If you get both of these you’re fine. But you also need to check what candidates your peer sent you with which addIceCandidate() was called.

Example #2 – The network is blocking my connection

Networks that block UDP traffic are quite common. TURN/TCP and TURN/TLS (as well as ICE-TCP even though we mention this mostly to make Emil Ivov happy) provide a way to enable calls even on those networks. This has some effect on the media quality as we discussed previously but let us see how we can detect whether we are on a network that is blocking UDP traffic to begin with.

If you want to follow along, open webrtc-internals and the webrtc candidate gathering demo page and start gathering. By default, it uses one of Google’s STUN servers. To keep things simple, uncheck the “gather IPv6 candidates” and “gather RTCP candidates” boxes before clicking on the “gather candidates” button:

Uncheck ICE gathering candidates

On webrtc-internals you will see a createOffer call with offerToReceiveAudio set to true (this is to create an m-line and gather candidates for it):

WebRTC ICE gathering receive audio

Followed by a createOfferOnSuccess and a setLocalDescription call. After that there will be a couple of onicecandidate events and an icegatheringstatechange to completed, followed by a stop call.

There should be an onicecandidate with a candidate that has a “typ srflx” in it:

ICE candidate types

It shows your public ip. If you don’t get such a candidate but only host candidates, either the STUN server is not working (which in the case of Google’s STUN server is somewhat unlikely) or your network is blocking UDP.

Now block UDP on your network (but mind you, do not block port 53 for DNS). If you don’t know a quick way to block UDP, lets try to simulate that by changing the stun server to something that will not respond, in this case Google’s well-known DNS server running at 8.8.8.8:

Fudging STUN configuration for WebRTC

Click “gather candidates” again. After around 10 seconds you will see a gathering state change to completed in webrtc-internals. But you will not see a server-reflexive candidate:

ICE negotiation with no STUN connectivity

You can try the same thing with a TURN UDP server. Make sure your credentials are valid (again, the “public TURN server list” is not a thing). This will show both a srflx and a relay candidate.

One of the nice tricks you can do here is to change the password to something invalid. Then you will only get a srflx but no relay candidate. Which is a nice and easy way to detect if your credentials are invalid — the candidates page even suggests this.

You can repeat this with TURN/TCP and TURN/TLS servers. You can even add all kinds of TURN servers and then use the priority trick we have shown in the last blog post to figure out from which servers you gathered candidates.

If you don’t get anything but host candidates you might be on a network which blocks both UDP traffic and is successful at blocking TURN/TCP and TURN/TLS. One scenario where that might happen currently is if there is a proxy that requires authentication which is not yet supported by Chrome.

Now let us take a step back. When is this useful? In a real-world scenario you will want to run with all kinds of STUN and TURN servers, otherwise you will get high failure rates. If you need to debug a failure to establish a connection, you should look for the onicecandidate and addIceCandidate events. They will allow you to figure out if the local or remote client was on a network that blocked it from establishing a connection to any peer outside the network.

What’s next?

So this time around, we’ve focused on the API traces:

We’ve acquainted with the fact that this is something that webrtc-internals does us a great service just by capturing all of these WebRTC API calls
We even went through the typical API calls and flows that are expected to appear in the WebRTC API trace
We’ve looked at two examples where the WebRTC API trace can help us debug the problems we’re seeing (there are more)
- #1 – misconfiguration of NAT traversal servers
- #2 – network blocking and the forgotten TURN/TCP configuration

We’re not done yet with this series. We still have one or more articles in the pipeline to close the basics of what webrtc-internals got up its sleeves.

If you are interested in keeping up with us, you might want to consider subscribing.

Huge thanks for Fippo in assisting with this series!

What happens when WebRTC shifts to TURN over TCP

You wouldn’t believe how TURN over TCP changes the behavior of WebRTC on the network.

I’ve written this on BlogGeek.me about the importance of using TURN and not relying on public IP addresses. What I didn’t cover in that article was how TURN over TCP changes the behavior we end up seeing on the network.

This is why I took the time to sit down with AppRTC (my usual go-to service for such examples), used a 1080p resolution camera input, configure my network around it using testRTC and check what happens in the final reports we get.

What I want to share here are 4 different network conditions:

Checking how TURN over TCP affects the network flow

#1 – A P2P Call with No Packet Loss

Let’s first figure out the baseline for this comparison. This is going to be AppRTC, 1:1 call, with no network impairments and no use of TURN whatsoever.

Oh – and I forced the use of VP8 on all calls while at it. We will focus on the video stats, because there’s a lot more data in them.

P2P; No packet loss; charts

Our outgoing bitrate is around 2.5Mbps while the incoming one is around 2.3Mbps – it has to do with the timing of how we calculate things in testRTC. With longer calls, it would average at 2.5Mbps in both directions.

Here’s how the video graphs look like:

P2P; No packet loss; graphs

They are here for reference. Once we will analyze the other scenarios, we will refer back to this one.

What we will be interested in will mainly be bitrate, packet loss and delay graphs.

#2 – TURN over TCP call with No Packet Loss

At first glance, I was rather put down by the results I’ve seen on this one – until I dug into it a bit deeper. I forced TCP relay by blocking all UDP traffic in our machines.

TURN over TCP; No packet loss; charts

This time, we have slightly lower bitrates – in the vicinity of 2.4Mbps outgoing and 2.2Mbps incoming.

This can be related to the additional TURN leg, its network and configuration – or to the overhead introduced by using TCP for the media instead of UDP.

The average Round trip and Jitter vaues are slightly higher than those we had without the need for TURN over UDP – a price we’re paying for relaying the media (and using TCP).

The graphs show something interesting, but nothing to “write home about”:

TURN over TCP; No packet loss; graphs

Lets look at the video bitrate first:

TURN over TCP; No packet loss; video bitrate

Look at the yellow part. Notice how the outgoing video bitrate ramps up a lot faster than the incoming video bitrate? Two reasons why this might be happening:

WebRTC sends out data fast, but that same data gets clogged by the network driver – TCP waits before it sends it out, trying to be a good citizen. When UDP is used, WebRTC is a lot more agressive (and accurate) about estimating the available bitrate. So on the outgoing, WebRTC estimates that there’s enough bitrate to use, but then on the incoming, TCP slows everything down, ramping up to 2.4Mbps in 30 seconds instead of less than 5 that we’re used to by WebRTC
The TURN server receives that data, but then somehow decides to send it out in a slower fashion for some unknown reason

I am leaning towards the first reason, but would love to understand the real reason if you know it.

The second interesting thing is the area in the green. That interesting “hump” we have for the video, where we have a jump of almost a full 1Mbps that goes back down later? That hump also coincides with packet loss reporting at the beginning of it – something that is weird as well – remember that TCP doesn’t lose packets – it re-transmits them.

This is most probably due to the fact that after bitstream got stabilized on the outgoing side, there’s the extra data we tried pushing into the channel that needs to pass through before we can continue. And if you have to ask – I tried a longer 5 minutes session. That hump didn’t appear again.

Last, but not least, we have the average delay graph. It peaks at 100ms and drops down to around 45ms.

To sum things up:

TURN over TCP causes WebRTC sessions to stabilize later on the available bitrate.

Until now, we’ve seen calls on clean traffic. What happens when we add some spice into the mix?

#3 – A P2P Call with 0.5% packet loss

What we’ll be doing in the next two sessions is simulate DSL connections, adding 0.5% packet loss. First, we go back to our P2P call – we’re not going to force TURN in any way.

P2P; 0.5% packet loss; charts

Our bitrate skyrocketed. We’re now at over 3Mbps for the same type of content because of 0.5% packet loss. WebRTC saw the opportunity to pump more bits to deal with the network and so it did. And since we didn’t really limit it in this test – it took the right approach.

I double checked the screenshots of our media – they seemed just fine:

P2P; 0.5% packet loss; screenshot

Lets dig a bit deeper into the video charts:

P2P; 0.5% packet loss; graphs

There’s packet loss alright, along with higher bitrates and slightly higher delay.

Remember these results for our final test scenario.

#4 – TURN over TCP Call with 0.5% packet loss

We now use the same configuration, but force TURN over TCP over the browsers.

Here’s what we got:

TURN over TCP; 0.5% packet loss; charts

Bitrates are lower than 2Mbps, whereas on without forcing TURN they were at around 3Mbps.

Ugliness ensues when we glance at the video charts…

TURN over TCP; 0.5% packet loss; graphs Things don’t really stabilize… at least not in a 90 seconds period of a session.

I guess it is mainly due to the nature of TCP and how it handles packet losses. Which brings me to the other thing – the packet loss chart seems especially “clean”. There are almost no packet losses. That’s because TCP hides that and re-transmit everything so as not to lose packets. It also means that we have utilization of bitrate that is way higher than the 1.9Mbps – it is just not available for WebRTC – and in most cases, these re-tramsnissions don’t really help WebRTC at all as they come too late to play them back anyway.

What did we see?

I’ll try to sum it in two sentences:

TCP for WebRTC is a necessary evil
You want to use it as little as possible

And if you are interested about the most likely ICE candidate to connect, then checkout Fippo’s latest data nerding post.

Tag Archives for " ICE "