Your application has its own internal logic that is outside the scope and context of WebRTC itself. As such, the watchRTC SDK cannot be aware of these events or track them on its own. To enrich watchRTC with such events, making it more powerful and useful when you need to analyze certain sessions, you can use the custom events mechanism.
By using the watchRTC.addEvent() API available in the watchRTC SDK, you can instruct watchRTC to add such events to its tracking of the session.
A custom event in watchRTC has 3 parameters: a name, a type and optional parameters. The name is an arbitrary string describing the event, while the type indicates how watchRTC should treat the event. There are 3 types of events:
log – this event will appear only on the event log in the Advanced WebRTC Analytics for the peer and nowhere else. You’ll be able to track and correlate it along with all other API calls and callbacks that watchRTC tracks
local – this event will appear in the event log in the Advanced WebRTC Analytics as well as on the peer level charts. When you drill down to look at the peer information, the event will show as a vertical line on the charts
global – this event will appear in the event log in the Advanced WebRTC Analytics, the peer level charts and the room level charts
The optional parameters are additional custom data you wish to store for the event. This data will appear in the events log.
Here is how custom events appear in the events log:
To operate properly, qualityRTC and probeRTC collect information from users’ machines that is considered PII. This mainly includes IP addresses and at times email addresses as well. This information is encrypted and is never shared with third parties.
It is important to understand that the data collected is needed in order for support personnel to assist end users with troubleshooting their connectivity and media quality issues. To safeguard that information further, qualityRTC and probeRTC can be configured to remove, keep or obfuscate PII data automatically after a set number of days.
The collection and storage of such information is done using encryption at transit and at rest: the data is sent encrypted and is also stored encrypted.
Automatically scrubbing PII
You can configure PII to be scrubbed or removed in the Settings section under the relevant tabs for qualityRTC or probeRTC.
After a configurable number of days from the moment the entry was created, qualityRTC or probeRTC will scrub and remove any email addresses and IP addresses that it finds in that entry. This includes whatever is found on the dashboard as well as in the detailed logs.
Example
The image below shows an example of a scrubbed entry in qualityRTC:
Scrub PII Using URL Parameters
For qualityRTC, the option to obfuscate or remove PII can also be decided per test link using URL parameters.
OpenVidu is an open source video conferencing framework with a ready-to-use UI.
If you are using it as your baseline, then the script below is a good starting point.
Preparation
The script below creates random rooms on OpenVidu. If you are using user authentication or need other means to create and join rooms, then you will need to edit this script a bit further. Otherwise, you can use the script “as is” – just configure the probes the way you want them before running it.
Using the test script
In testRTC, create a new test script:
Copy the code from the bottom of this article to your test script (or use the existing sample in your account)
Decide the number of probes you want to use
Two concurrent probes are a good starting point
Set the Session size to the number of concurrent probes for this sample
Replace the Service URL of the script with the URL where your OpenVidu server is located. The sample script was created in front of this URL: https://demos.openvidu.io/openvidu-call/#/
Test execution
Run the script. It does everything for you.
If you want, you can join the same room from your browser once you see the URL in the progress messages of the test.
Test script code
/*
SCENARIO
* Browser joins room URL
* Browser waits in a call for a few minutes
THINGS TO PLAY WITH
* Probe configurations (look after the script):
- Location of probes
- Media files to use
- Browser version
* Length of test
* Modify to run on your own service
*/

// Variables that we will use in this example
var roomUrl = process.env.RTC_SERVICE_URL + 'testRTC-' + process.env.RTC_SESSION_NAME;
var probeType = Number(process.env.RTC_IN_SESSION_ID);
var sec = 1000;
// Join the room
client
.pause(probeType * 200) //Staggering
.rtcInfo(roomUrl)
.rtcProgress('open ' + roomUrl)
.url(roomUrl)
.waitForElementVisible('#videoRoomNavBar #joinButton', 60 * sec)
.pause(300) //wait for page render
.click("#videoRoomNavBar #joinButton");
client
.pause(60 * sec)
.rtcScreenshot("in call")
.pause(60 * sec);
Oftentimes, we find ourselves integrating qualityRTC with the proprietary infrastructure of our clients. This may be credentials for TURN servers, configuration and access of media servers, dynamic lists of DNS addresses, etc.
To make the integration easier, we’ve created a standardized REST API call that we invoke from our backend to the client’s backend to collect that information each time a user runs a qualityRTC network test. To make the integration smooth, it is advisable to implement this REST API according to the following rules:
Request
The request is an HTTPS request sent from our backend to your backend.
The end user’s browser is not part of that API call, but the response returned by this REST API call will be sent to the end user’s browser for the purpose of conducting the network tests.
You can choose an arbitrary URL and you will need to support an optional region URL variable. One of the straightforward ways to implement this REST API is by way of an AWS Lambda or other serverless infrastructure.
The above request asks for the configuration associated with a region called us-east. If no region is provided, then you should return the configuration of a default region.
Response
The response of your REST API call should be in the form of a JSON structure:
Each of the blocks in the JSON structure is optional. You should implement and return only the ones relevant to your specific infrastructure. Anything else will be statically provisioned by qualityRTC through configuration defined during the initial setup of the service.
turnCredentials
If you are using your own TURN servers (or our installable ones), then you will need to pass the TURN servers and their credentials to us using this variable.
The TURN credentials provided will be used wherever they are needed in the tests conducted, unless other specific credentials are given for a test.
The url field is returned as a list of servers. qualityRTC can pick and choose between them using AWS Route 53 or similar logical services if needed.
videoQuality
When using your own media servers for the VIDEO QUALITY test, such as Jitsi, Janus or mediasoup, you can provide the access information necessary to connect to your media server for the purpose of testing.
dnsLookupDomains
If you need to dynamically change and manage the DNS LOOKUP test then you can use this to generate the list of addresses to test against.
The watchRTC SDK works similarly to webrtc-internals, but collects at slightly different time intervals.
webrtc-internals can collect at a frequency of a second. For most of our watchRTC clients, we configure the collection frequency of the SDK to 8 seconds or more. We are trying to keep a delicate balance here: on one hand, we want to get as much data as possible; on the other hand, we don’t want our collection to take up too much CPU or network resources.
Since watchRTC usually collects its metrics at intervals longer than a second, it misses some of the granularity that webrtc-internals can achieve, averaging out metrics a bit more.
By default, profiles are allocated for the probes needed for the test in a round-robin fashion. This means that profiles are allocated sequentially to the probes one after the other, and once no more profiles are available, the allocation algorithm will start from the first profile again for the next probe.
Assume we have 10 probes configured in a test with 4 profiles. Here is how the allocation of profiles to probes will be:
Probe number → Profile allocated
Probe #1 → Profile #1
Probe #2 → Profile #2
Probe #3 → Profile #3
Probe #4 → Profile #4
Probe #5 → Profile #1
Probe #6 → Profile #2
Probe #7 → Profile #3
Probe #8 → Profile #4
Probe #9 → Profile #1
Probe #10 → Profile #2
Notice that testRTC ignores the session size you configure for the test script altogether when matching probes to profiles.
Random probe allocation
If you would like, you can instruct testRTC to allocate the profiles to probes in a test randomly.
This can be done by adding the #random directive to the run options of your script.
For some scenarios, you just want to validate something that isn’t directly related to WebRTC.
testRTC assumes by default that you want WebRTC sessions, so if it can’t find any WebRTC connection with some audio or video data in your test results, it will automatically fail the test with the error “No WebRTC Data collected”.
To avoid that and remove this check from your test script, you can add the following code to it:
// Make sure we don't fail because of no media
client
.rtcSetTestExpectation("video.in == 0")
.rtcSetTestExpectation("audio.in == 0")
.rtcSetTestExpectation("video.out == 0")
.rtcSetTestExpectation("audio.out == 0");
The code snippet above instructs testRTC to ignore cases where no media is sent or received in the test scenario.
Network Jitter or Round Trip Time – which is more important when testing or monitoring a WebRTC application?
You’ve got your WebRTC application. You have users communicating with it. How do you know they are having a good experience? How do you know you’ve placed your servers in the right locations? Got the routes properly configured? Do you need to add a new server in Frankfurt? Or maybe it would be better to beef up your Australian presence?
These answers require looking at the users you have and the quality they are getting. And when the time comes to look at WebRTC quality, you’ll hear the terms network jitter, latency and round trip time thrown around a lot.
So which one is more important to track and focus on with WebRTC? Is it network jitter or maybe it is round trip time?
I’d say both. But not exactly…
Let’s try to break this down to understand it better.
We can look at these metrics, and especially latency and round trip time in different ways, where the first question to ask is what exactly are we measuring?
The illustration above is a simplified version of the network traffic in a WebRTC session. We don’t have servers here and we don’t have a lot of other components. Rest assured that each component along the way can add latency and even affect jitter.
What I did in the illustration is also delineate 3 different areas:
The peripheral, where the media is acquired and played. Screens, microphones, cameras, speakers – they all add inherent delays and some of it can be considerable. Bluetooth devices for example are notorious for adding delays (anyone said iOS 15?)
WebRTC processing, on its own, designed and built to reduce delays and jitter, but a contributor to it as well. This is doubly true in media servers that you own and operate but also true for browsers you don’t control and your users are using to access your service
Network, which is what we’re trying to measure, at least in this article
Here’s the thing: for the most part, in most use cases, you have little control or knowledge of the peripherals being used. Measuring their own effects is also hard and in many real world applications impossible. So we are going to ignore peripherals.
WebRTC processing and the network are usually bunched together and there’s little in the way of splitting them up. Based on what you see and experience, you will need to decide if the issue is the network (=infrastructure and DevOps) or WebRTC processing (=software bugs and optimizations).
Network Jitter vs Round Trip Time (or Latency)
To me, the difference between network latency and round trip time is akin to the difference between weather and climate: Weather reflects short-term conditions of the atmosphere while climate is the average daily weather for an extended period of time at a certain location.
By the same token, jitter reflects short-term conditions, or more accurately inconsistencies, in the flow of packets over a network, while round trip time (or latency) is the average time it takes packets to flow through the network from one location to another over a longer period of time.
Network Jitter answers the question how inconsistent the network is.
Round Trip Time (or Latency) answers the question how much delay is there in the network.
What’s “Network Jitter”?
In a WebRTC session, we send packets continuously. On a voice call, in many cases, a packet is sent every 20 milliseconds. With video, we send packets to reach 30 frames per second, and there is usually more than a single packet per frame, which means hundreds of packets every second.
Assuming the network experiences no packet loss, then we expect to receive the same number of packets in the same frequency.
Let’s look at a span of 200 milliseconds of audio from a sender’s perspective versus a receiver’s one. That’s 10 packets worth of data:
The sender sends an SRTP audio packet every 20 milliseconds in the illustration above, but the receiver doesn’t receive them exactly every 20 milliseconds – they are somewhat jittery… and that’s what we’re measuring with network jitter.
What contributes to network jitter?
Mainly the network.
When you send packets over the internet, who guarantees that what gets sent is actually received and in a timely manner?
Think about the post office. Not all letters delivered get to their destination, and not all letters delivered get to their destination with the same latency (=on time). The same is true for a computer network, and the more complex the network, the harder it gets to do this properly.
Here are some things that can affect network jitter badly:
The user’s network and his location
Poor location. A user connecting from inside an elevator over cellular or sitting far away from his WiFi access point will result in bursty connections that will introduce high jitter and packet loss
Congested network. Either the local one (your daughter on TikTok and your son on Fortnite while you’re trying to have a conversation over WebRTC; an office with too many people on the Internet on a slow connection; 50,000 people in a stadium trying to do Facebook Live at the same time) or the path to the WebRTC infrastructure being clogged by network traffic
Faulty hardware. A bad ethernet cable… a true story: we had a client some time ago stress testing his service, only to find that packet loss (and jitter) originated from a faulty cable in his data center
CPU. Local resources on a user’s device or your TURN and media servers in itself can add jitter. If the CPU of a machine starts throttling, the end result is going to be jitter (and packet loss)
On top of plain jitter, the same causes can lead to packet loss (we never receive what was sent), duplication of packets (yes, that can happen) and reordering of packets (if they arrive out of order, there’s definitely jitter, just with an added headache).
Why is network jitter a bad thing?
Why is this bad? Because if we want to smoothly playback the audio and video being sent, we need to align it yet again towards what the sender intended. Or more accurately, towards what the microphone and camera captured on the sender side.
If we don’t align the incoming media, the audio will not sound natural and the video will look choppy. If you want to experience this firsthand, just make sure the CPU of the device you are using is busy doing other things while being on a video call.
How does WebRTC compensate for jitter?
Like all VoIP services, WebRTC uses a jitter buffer. A jitter buffer is a software component that collects the received packets and decides when to play them out. It is used to handle lip synchronization (playing out audio and video together in sync), to reorder packets, and to account for the jitter on the network.
If we know that jitter can be around 30 milliseconds, then the jitter buffer can wait for at least that time before playing back packets, so that whenever we need to play back a packet in a smooth manner, that packet has already been received.
Since network jitter is dynamic in nature, so is WebRTC’s jitter buffer – it is an adaptive jitter buffer that tries to understand how much jitter there is on the network, and increase or decrease the buffer size (length) based on what the network exhibits. Why do we do that? Because too small a buffer means a bad user experience due to dropped packets or improper playback, and too large a buffer adds to the latency of the playout, which we don’t want in real time interactive WebRTC sessions.
Do we look at “Latency” or “Round Trip Time”?
Latency, round trip time and delay are words that get lumped together, as does RTT – the acronym for round trip time. While there are nuances between them in what exactly each one means, the lower they are, the better the experience and the more interactive the session can be.
Here’s how I usually look at these and categorize them:
Latency for me is the time it takes for a packet of data to get from one point in the network to another.
Round trip time is the time it takes for a packet to reach its destination and for a response packet to get back.
You can argue around latency and delay and decide if they should include or shouldn’t include the peripheral’s built in delay or even the delay added by WebRTC processing in end units or servers in the network.
For round trip time, the argument can be around the processing time needed to handle the incoming message and then send out the reply (if done incorrectly, this can add a considerable delay on its own).
And how do you measure latency exactly? If the clocks on the two devices aren’t fully in sync, how can you measure it? The result is that in most cases, and WebRTC is no different, you rely on the round trip time instead – if I send a message and wait for a response, all I need to do is check the time that passed. And that’s exactly what you can glean from the RTCP reports and WebRTC statistics.
What contributes to round trip time?
Besides the things that affect jitter, you’ll also find here the route taken by the packets over the network.
Here’s how I usually explain it – let’s say your TURN server, media server or gateway is located in “East US”. That’s the generic name we all give to our first cloud data center choice.
Why? We want a global service, but we try to target the US first, so it needs to be in the US. And on the maps, the best alternative to also reach Europe is the east coast. So we end up with US East on one of the cloud vendors. At least until we grow and distribute our service.
What happens if the session takes place between 2 people who are both located in Paris and the session is routed through our media servers in the US?
That will most probably take a longer route, both geographically and when measured in time, which ends up adding to the latency of the session. In many cases, it also means higher packet loss, as there are more opportunities along that route to lose packets.
This means that the way we design our infrastructure, deploy it around the globe and configure it has a considerable impact on the round trip time users are going to experience.
Why is high round trip time a bad thing?
More latency means more time passes from what we do until the other side can hear or see it.
For live streaming (somewhat related to WebRTC), the effects of latency are simple to explain. Here’s a good video for that:
If you are dealing with surveillance cameras, then latency is bad. When you’re in an interactive session – a 1:1 conversation or a group meeting – you’ll be expecting latency below 200 milliseconds. Anything above that is noticeable and nagging. You won’t know when someone has finished speaking so you can contribute to the conversation right after them, for example.
So we’d like to have low round trip time as well as low network jitter for a good interactive experience in WebRTC applications.
How does WebRTC compensate for high round trip time?
It doesn’t. Not really. You’re on your own. You’ll need to decide where to place your servers and how to configure the routes between them to reduce latency.
Solutions we’ve seen recently range from:
Placing more media servers and TURN servers in more data centers closer to where your users are
Using third party TURN servers that are highly distributed (think Subspace and Cloudflare)
Going for a service such as AWS Global Accelerator to end up with an optimized route
At the end of the day, you’ll need to invest energy or money or both in order to improve round trip time as you grow your service.
We didn’t talk packet loss
Here’s something you should understand – high round trip time or network jitter can easily cause packet loss.
If there’s congestion on the network, you might end up with packet loss since a network switch or router along the path of your packets decided to drop some of your packets because it is congested.
But if packets arrive too late (because of high round trip time or high jitter), then playing them might no longer be an option – their time has passed. In such a case, WebRTC will simply drop a packet even though it was received. The real time nature of WebRTC doesn’t allow it to buffer data forever.
Network jitter and round trip time – are these an infrastructure problem or an end user problem?
Both.
At times, network jitter and round trip time can occur due to infrastructure issues – anything from faulty cables, bad network configurations or just machines that are too busy to process data fast enough.
Other times, your user is to blame. Either due to his device or the network he is using.
Then there’s the network. If everyone is currently trying to access the network, there are bound to be clogged routes, even if only periodically.
It is going to be your job to try and understand where the problem originates from.
How to fix network jitter and round trip time using testRTC’s tools?
Glad you asked 😀
testRTC offers tools for the full life cycle of WebRTC applications. For the most part, fixing jitter and round trip time is going to be part of the operations work on your end – understanding where traffic is routed through and how to redirect it elsewhere (including the possible need to add new regions and servers). Here’s where you’ll meet network jitter and round trip time in our services:
testingRTC
Our WebRTC testing service enables you to conduct integration, regression, function, non-functional, sizing, load and stress testing.
In all tests we collect network jitter and round trip time for all simulated probes in a session. We treat your service as a black box, launch our machines from different locations around the globe (you define which ones) and collect that as part of the metrics we store. We make it available on the channel level, browser level and test level as an aggregate of everything. Access to it is offered via the dashboard and through APIs. You can even add your expectations of these values and cause tests to fail based on your thresholds. If you want, you can dynamically change these values for each browser in the test and see how this affects your service.
upRTC
upRTC is our WebRTC active monitoring service. Its main purpose is to understand the behavior of your infrastructure. It does that by bringing predictability to the user side and network, so you can be sure that every time the monitor’s browser runs in front of your infrastructure, it behaves the same from the network’s point of view.
Here, looking at network jitter and round trip time and setting thresholds for them to alert you via email and webhook is the way to go.
watchRTC
watchRTC offers WebRTC passive monitoring. It hooks up to your users’ devices and collects their WebRTC metrics. This gets processed, aggregated and analyzed. Part of the metrics we collect and share is network jitter and round trip time. We do that on the individual channel level, the peer level, the room level and in aggregate across complex filters:
The purpose of it all is:
To let you understand what your end users are experiencing
Assist you in tracking down outliers in device types, operating systems, networks, locations, etc
Drill down to a certain user’s complaint when needed
qualityRTC and probeRTC
With qualityRTC and probeRTC we help your support and users answer the question “how can I improve my connectivity to your service?”
This is done by a series of tests, many of them collecting network jitter and round trip time data.
Talk to us
Need to figure out your network jitter? Have a round trip time and latency issue with users?
Come and talk to us. I am sure we will be able to help you figure out the issues.
By default, watchRTC has an “all you can eat” attitude. This means it will try to collect as much telemetry and metric data from as many useful sources as possible. This is great for the most part, but there are times when you wouldn’t want it to collect specific peer connections in your application. This can be when:
The peer connections are used for pre-call tests
These tests are meant to check network connectivity and quality prior to making a call
They are short lived and “noisy” when it comes to WebRTC metrics collection
They offer little value for understanding the user experience of the actual session itself (which is what watchRTC is interested in)
“Unimportant” peer connections
At times, some of the peer connections may be of less value to you but too chatty in nature. This can be for example a peer connection used for signaling on its data channel (though you might want to see it)
Lower priority peer connections, for example small video windows in a cloud gaming platform, where you are most interested in monitoring the peer connection of the game itself
watchRTC SDK offers two different solutions for such cases:
Provide the roomId and peerId only on the RTCPeerConnection call itself, and don’t pass them in Init() or SetConfig(). This will give you total control over which peer connections to collect and which to ignore
Make use of watchRTC.disableDataCollection() for times when you don’t want to collect telemetry and watchRTC.enableDataCollection() when you do. Remember that by default, the watchRTC SDK collection is enabled
To learn more about the APIs available via the SDK, check out our watchRTC SDK guide.
watchRTC collects its metrics via a secure WebSocket connected to the testRTC backend servers.
The only connection you need to configure on your firewall is for devices reaching out to wss://watchrtc.testrtc.com and/or https://watchrtc.testrtc.com – both on port 443.
watchRTC offers 3 alternatives to deal with firewalls:
Configure the firewall by opening connections to wss://watchrtc.testrtc.com and https://watchrtc.testrtc.com on port 443
Set up your own CNAME redirection to watchrtc.testrtc.com (something like watchrtc.<your-domain>.com). This has the added benefit of requiring less configuration for your end users if you are already asking them to whitelist *.<your-domain>.com, for example
Set up a proxy in your own data centers and configure it to forward to our servers. You will also need to configure the SDK to redirect it to your proxy