Tag Archives for " production "

Monitoring WebRTC apps just got a lot more powerful

As we head into 2019, I noticed that we haven’t published much around here. We doubled down on helping our customers (and doing some case studies with them) and on polishing our service.

In the recent round of updates, we added 3 very powerful capabilities to testRTC that can be used in both monitoring and testing, but make a lot of sense for our monitoring customers. How do I know that? Because the requests for these features came from our customers.

Here’s what got added in this round:

1. HAR files support

HAR stands for HTTP Archive. It is a file format that browsers and certain viewer apps support. When your web application gets loaded by a browser, all network activity gets logged by the browser and can be collected into a HAR file that can later be retrieved and viewed.

Our focus has always been WebRTC, so collecting network traffic information that isn’t directly WebRTC wasn’t on our minds. This changed once customers approached us asking for assistance with sporadic failures that were hard to reproduce and hard to debug.

In one case, a customer knew there was a 502 failure thanks to the failure screenshot we generate, but it wasn’t easy to tell which of his servers and services was causing it. Since the failure is sporadic, he couldn’t get to the bottom of it. With the HAR files we can now collect in his monitor, the moment this happens again he will have all the network traces for that 502, making it easier to catch.
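To give a feel for how a downloaded HAR file helps in a case like this, here is a minimal Python sketch that lists the failing requests and counts them per host. It relies only on the standard HAR layout (a top-level `log` object holding `entries`, each with `request.url` and `response.status`). The function names and the sample hosts are made up for illustration and are not part of testRTC.

```python
from collections import Counter
from urllib.parse import urlsplit


def failed_requests(har: dict, min_status: int = 500) -> list[tuple[int, str]]:
    """Return (status, url) for each logged request answered with status >= min_status."""
    entries = har.get("log", {}).get("entries", [])
    return [
        (entry["response"]["status"], entry["request"]["url"])
        for entry in entries
        if entry["response"]["status"] >= min_status
    ]


def failures_by_host(har: dict, min_status: int = 500) -> Counter:
    """Count failing responses per host to hint at which server is misbehaving."""
    return Counter(urlsplit(url).netloc for _, url in failed_requests(har, min_status))


# Tiny hand-written stand-in for a real HAR file (which you would read with json.load).
# The host names are hypothetical.
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://app.example.com/"}, "response": {"status": 200}},
            {"request": {"url": "https://api.example.com/rooms"}, "response": {"status": 502}},
        ]
    }
}

for status, url in failed_requests(sample_har):
    print(status, url)
```

Grouping by host is just one way to slice it. The same entries also carry timings and headers if you need to dig deeper.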

Here’s how to enable it on your tests/monitors:

Go to the test editor, and add to the run options the term #har-file


Once it is there, the next test/monitor run will create a new file that can be found under the Logs tab of the test results for each probe.

We don’t handle visualization for HAR files for the moment, but you can download the file and load it into a visual tool.

I use netlog-viewer.

Here’s what I got for appr.tc:

2. Retry mechanism

There are times when tests just fail for no good reason. This is doubly true for automating web UI, where minor timing differences may cause problems or where a user simply behaves differently than an automated machine. A good example is a person who couldn’t log in – usually, he will simply retry.

When running a monitor, you don’t want these nagging failures to bog you down. What you are most interested in isn’t bug squashing (at least not for everyone). It is uptime and quality of service. Towards that goal, we’ve added another run option – #try

If you add this run option to your monitor, with a number next to it, that monitor will retry the test a few more times before reporting a failure. #try:3, for example, will retry the same script twice before reporting a failure.

What you’ll get in your monitor might be something similar to this:

The test reports a success, and the reason indicates a few times where it got retried.
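The retry semantics described here can be sketched in a few lines of Python. This is my reading of the behavior (a value of 3 means up to three attempts in total, so two retries) and not testRTC's actual implementation. The function name is hypothetical.

```python
from typing import Callable, Tuple


def run_with_retries(test: Callable[[], bool], tries: int) -> Tuple[bool, int]:
    """Run `test` up to `tries` times; return (passed, attempts_used)."""
    if tries < 1:
        raise ValueError("tries must be at least 1")
    for attempt in range(1, tries + 1):
        if test():
            # Success on any attempt is reported as success for the whole run.
            return True, attempt
    # Only after every attempt failed do we report a failure.
    return False, tries


# A flaky check that fails twice and then passes, like a login that works on retry.
outcomes = iter([False, False, True])
passed, attempts = run_with_retries(lambda: next(outcomes), tries=3)
print(passed, attempts)
```

Returning the number of attempts used is what lets a report say "passed, but only after retrying", which is worth tracking on its own.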

3. Scoring of monitor runs

We’ve started to add a scoring system to our tests. This feature is still open only to select customers (want to join in on the fun? Contact us)

This scoring system places a test on a scale of 0-10 based on the media metrics collected. We decided not to go for the traditional MOS scoring of 1-5 for various reasons:

  1. MOS scoring is usually done for voice, and we want to score video
  2. We score the whole test and not only a single channel
  3. MOS is rather subjective, and while we are too, we didn’t want to get into the conversation of “is 3.2 a good result or a bad result?”

The idea behind our scores is not to look at the value as good or bad (we can’t tell either) but rather look at the difference between the value across probes or across runs.

Two examples of where it is useful:

  1. You want to run a large stress test. Baseline it with 1-2 probes. See the score value. Now run with 100 or 1000 probes. Check the score value. Did it drop?
  2. You are running a monitor. Did today’s runs fare better than yesterday’s runs? Worse? The same?

What we did in this release was add the score value to the webhook. This means you can now run your monitors and collect the media quality scores we create and then trendline them in your own monitoring service – Splunk, Elasticsearch, Datadog, whatever.

Here’s what the webhook looks like now:

The rank field in the webhook indicates the media score of this session. In this case, it is an AppRTC test that was forced to run on simulated 3G and poor 4G networks for the users.
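If you want to trendline the score yourself, something along these lines could work. It is only a sketch: the one thing taken from the webhook is the `rank` field name, and I am assuming it sits at the top level of a JSON body as a number. The function names are made up, so adjust them to the payload you actually receive.

```python
import json
from statistics import mean


def extract_rank(webhook_body: str) -> float:
    """Pull the media score out of a webhook body (assumes a top-level numeric `rank`)."""
    return float(json.loads(webhook_body)["rank"])


def score_change(baseline: list[float], current: list[float]) -> float:
    """Mean of the current scores minus mean of the baseline scores; negative means a drop."""
    return mean(current) - mean(baseline)


# Made-up scores: yesterday's monitor runs versus today's.
yesterday = [8.0, 8.2]
today = [6.0, 6.2]
print(round(score_change(yesterday, today), 2))
```

This follows the idea above: the absolute value matters less than how it moves between runs or between probe counts.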

As with any release, a lot more got squeezed into the release. These are just the ones I wanted to share here this time.

If you are interested in a monitoring service that provides predictable synthetic WebRTC clients to run against your service, checking for uptime and quality – check us out.

How Clique Migrated Smoothly to the Newest AWS EC2 C5 Instance

In a need to focus resources on core activities, Clique Communications turned to testRTC for stress testing and sizing.

Clique API provides web-based voice and text application programming interfaces. In their eight years of existence, they have grown to support over 20 million users across 150 countries. This amounts to over 500 million minutes per month. Clique’s cloud services deliver multi-party voice that can be embedded by enterprises into their own business processes.

What are their current goals?

  • Grow the business
  • Add features to improve customer service and experience
  • Offer value-added services

Adding WebRTC

Clique started working with WebRTC some 18 months ago with customers starting to use it at the end of 2017.

Today, Clique supports all major browsers – Chrome, Firefox, Safari, and Edge; enabling its customers to offer uninterrupted interactions with their users. When a user joins a conference, he/she can do so over PSTN, directly from the browser or from within a native application that utilizes Clique SDKs.

Making Use of testRTC

As with any other software product, Clique had to test and validate its solution. To that end, Clique had already been using tools for handling call volumes and regression, testing the application and the SDKs. The challenge was the issue of scalability and quality of service, which is essential when it comes to WebRTC support. Clique had a decision to make – either invest in building their own set of testing tools on top of open source frameworks such as Selenium, or opt for a commercial alternative. They decided to go with the latter and use testRTC. They also preferred using a third party tool for testing as they didn’t want to burden their engineering team.

Switching from AWS EC2 C4 to C5

Clique had previously used a standard instance on Amazon from the AWS EC2 C4 series but when the AWS EC2 C5 series came out, they wanted to take advantage of it – not only was it more economical but it also had better performance. Furthermore, knowing Amazon would release newer sets of servers that would need to be tested again, Clique required this process to be repeatable.

The Action Plan

Since Clique is an embeddable service, they decided it was most strategic to have a third party develop an application using the Clique client SDK and APIs, and use that application as a test framework for scaling and growing the performance of the platform. It was a wonderful opportunity to optimize their own resources and save on the instances that they deploy on Amazon. An added bonus was ending up with a third-party application that Clique’s customers and partners who are building applications can use as part of their own development process.

Clique wrote their own test scripts in testRTC. The main test scenario for Clique was having a moderator who creates the conference and then generates a URL for other participants in the conference to join. Once they figured out how to do that with testRTC, the rest was a piece of cake.

Using testRTC to assist in sizing the instances on AWS had ancillary benefits beyond Clique’s core objectives. Clique tested the full life-cycle of its solution, from developing yet another application with its SDKs and integrating its APIs to continuous integration and devops. Along the way, Clique discovered bugs that were then fixed, optimized performance and gained the confidence to run services at scale in next generation architectures.

“testRTC provided Clique with a reliable and repeatable mechanism to measure our CPaaS performance… allowing Clique to save money, remain confident in our architectural choices and more importantly showcase our platform to customers with the integrity of an independent test system.”

Moving Forward – Continued use of testRTC

There are a lot of moving pieces in Clique’s solution: infrastructure in the backend, media servers, the WebRTC gateways. Features such as recording can fit into various components within the architecture, and Clique is always looking for ways to optimize and simplify.

testRTC helps Clique evaluate if the assumptions made in their architecture are valid by determining bottlenecks and identifying places of consolidation.

In the future, Clique will be looking at testRTC’s monitoring capability as well as using testRTC to instantiate browsers in different locations.


WebRTC Application Monitoring: Do you Wipe or Wash?

UPDATE: Recording of this webinar can be found here.

If you are running an application then you are most probably monitoring it already.

You’ve got New Relic, Datadog or some other cloud service or on premise monitoring setup handling your APM (Application Performance Management).

What does that mean exactly with WebRTC?

If we do the math, you’ve got the following servers to worry about:

  • STUN/TURN servers, deployed in one or more (probably more) data centers
  • Signaling server, at least one. Maybe more when you scale the service up
  • Web server, where you actually host your application and its HTML pages
  • Media servers, optionally, you’ll have media servers to handle recording or group calls (look at our Kurento sizing article for some examples)
  • Database, while you might not have this, most services do, so that’s another set of headaches
  • Load balancers, distributed memory datagrid (call this redis), etc.

Lots and lots of servers in that backend of yours. I like to think of them as moving parts. Every additional server that you add. Every new type of server you introduce. It adds a moving part. Another system that can fail. Another system that needs to be maintained and monitored.

WebRTC is a very generous technology when it comes to the variety of servers it needs to run in production.

Assuming you’re doing application monitoring on these servers, you are collecting all machine characteristics. CPU use, bandwidth, memory, storage. For the various servers you can go further and collect specific application metrics.

Is that enough? Aren’t you missing something?

Here are 4 quick stories we’ve heard in the last year.

#1 – That Video Chat Feature? It Is Broken

We’re still figuring out this whole embeddable communications trend. The idea of companies taking WebRTC and shoving voice and video calling capabilities into an existing product and workflow. It can be project management tools, doctor visitations, meeting scheduler, etc.

In some cases, the interactions via WebRTC are an experiment of sorts. A decision to attempt embedding communications directly to the existing product instead of having users find how to communicate directly (phone calls and Skype were the most common alternatives).

Treated as an experiment, such integrations sometimes were taken somewhat out of focus, and the development teams rushed to handle other tasks within the core product, as so often happens.

In one such case, the company used a CPaaS vendor to get that capability integrated with their service, so they didn’t think much about monitoring it.

At least not until they found out one day that their video meetings feature was malfunctioning for over two weeks (!). Customers tried using it and failed and just moved on, until someone complained loud enough.

The problem ended up being the use of a deprecated CPaaS SDK that had to be upgraded and wasn’t.

#2 – But Our Service is Working. Just not the Web Calling Part

In many cases, there’s an existing communication product that does most of its “dealings” over PSTN and regular phone numbers. Then one day, someone decides to add browser dialing. Next thing that happens, you’ve got a core product doing communications with a new WebRTC-based feature in there.

Things are great and calls are being made. Until one day a customer calls to complain. He embedded a call button in his website, but people stopped calling him from the site. This had gone on for a couple of days while he tried tweaking his business and figuring out what was wrong. Until finding out that the click to call button on the website just doesn’t work anymore.

Again, all the monitoring and health check metrics were fine, but the integration point of WebRTC to the rest of the system was somewhat lost.

The challenge here was that this got caught by a customer who was paying for the service. What the company wanted to do at that point is to make sure this doesn’t repeat itself. They wanted to know about their integration issues before their customers do.

#3 – Where’s My Database When I Need it?

Here’s another one. A customer of ours has this hosted unified communications service that runs from the browser. You login with your credentials, see a contacts list and can dial anyone or receive calls right inside the browser.

They decided to create a monitor with us that runs at a low frequency doing the exact same thing: two people logging in, one calls and the other answers. Checking that there’s audio and video and all is well.

One time they contacted us complaining that our monitor is failing while they know their system is up and running. So we opened up a failed monitor run, looked at the screenshot we collect automatically upon failure and saw an error on the screen – the browser just couldn’t get the address book of the user after logging in.

This had nothing to do with WebRTC. It was a faulty connection to the database, but it ended up killing the service. They got that pinpointed and resolved after a couple of iterations. For them, it was all about the end-to-end experience and making sure it works properly.

#4 – The Doctor Won’t See You Now

Healthcare is another interesting area for us. We’ve got customers in this space doing both testing and monitoring. The interesting thing about healthcare is that doctor visitations aren’t a 24/7 thing. For that particular customer it was a 3-hour day shift.

The service was operating outside of the normal working hours of the doctor’s office, with the idea of offering patients a way to get a doctor during the evening hours.

With a service running only part of the day, the company wanted to be certain that the service is up and running properly – and know about it as early on as possible to be able to resolve any issues prior to the doctors starting their shift.

End-to-End Monitoring to the Rescue

In all of these cases, the servers were up and running. The machines were humming along, but the service itself was broken. Why? Because application metrics tell a story, but not the whole story. For that, you need end-to-end monitoring. You need a way to run a real session through the system to validate that all of its pieces – all of its moving parts – are working well TOGETHER.

Next week, we will be hosting a webinar. In this webinar, we will show step by step how you can create a killer monitor for your own WebRTC application.

Oh – and we won’t only focus on working/not working type of scenarios. We will show you how to catch quality degradation issues of your service.

I’ll be doing it live, giving some tips and spending time explaining how our customers use our WebRTC monitoring service today – what types of problems are they solving with it.

Join me:

Creating a Kickass WebRTC Monitor Using testRTC
recording can be found here



How to Prepare Your WebRTC Application for a Surge in Traffic

OK, this is the moment you’ve been waiting for: there’s a huge surge in traffic on your WebRTC application. Success! You even had the prescience to place all of your web application’s assets on a CDN and whatever uptime monitoring service you use, be it New Relic, Datadog or a homegrown Nagios solution – says all is fine. But there’s just one nagging problem – users are complaining. More than they used to. Either because the service doesn’t work at all for them or the quality of the media just doesn’t cut it for them. What The–?!

We recently hosted a webinar about preparing for that big WebRTC launch. You might want to check the suggestions we made there as well.

Register now for free: WebRTC – How NOT to Fail in Your Big Launch

Let’s start by focusing on the positives here. Your service is being used by people. Then again, these people aren’t getting the real deal – the quality they are experiencing isn’t top notch. What they are experiencing is an inability to join sessions, low bitrates or inexplicable packet losses. These are different than your run of the mill 500 and 502 errors, and you might not even notice something is wrong until a user complains.

So, what now?

Here’s what I’m going to cover today:

  1. Learn how to predict service hiccups
  2. Prepare your WebRTC application in advance for growth

Learn How to Predict Service Hiccups

While lots of users is probably what you are aiming for in your business, the effect they can have on your WebRTC application can be devastating if you are unprepared. Sure, sometimes they’ll force your service to go offline completely, but many other times the service will keep on running while delivering a bad user experience. This can manifest itself as users waiting a long time to connect, having to refresh the page to connect or just getting poor audio and video quality.

Once you get to that point, it is hard to know what to do:

  • Do you throw more machines at the problem?
  • Do you need to check your network connections?
  • How do you find the affected users and tell them things have been sorted out?

This mess is going to take up a lot of your time and attention to resolve.

Here is something you can do to try and predict when these hiccups are about to hit you:

Establish a Baseline

I’ve said it before and I’ll say it again. You need to understand the performance metrics of your WebRTC service. In order to do that, the best thing is to run it for a bit with the acceptable load that you have and write down the important metrics.

A few that come to mind:

  • Bitrate of the channels
  • Average packet loss
  • Jitter

Now that you have your baseline, take the time to gauge what exactly your WebRTC application is capable of doing in terms of traffic. How much load can it carry as you stack up more users?

One neat trick you can do is place a testRTC monitor and use rtcSetTestExpectation() to indicate the thresholds you’ve selected for your baseline. Things like “I don’t expect more than 0.5% packet loss on average” or “average bitrate must be above 500kbps”. The moment these thresholds are breached – you’ll get notified and be able to see if this is caused by growth in your traffic, changes in usage behavior, etc.
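To make the threshold idea concrete, here is a small Python sketch of the same check done by hand on metrics you collected yourself. It is not the rtcSetTestExpectation() syntax. The class, function and parameter names are hypothetical, and the two limits are simply the examples quoted above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Baseline:
    max_packet_loss_pct: float = 0.5  # "no more than 0.5% packet loss on average"
    min_bitrate_kbps: float = 500.0   # "average bitrate must be above 500kbps"


def breaches(avg_packet_loss_pct: float, avg_bitrate_kbps: float,
             baseline: Baseline = Baseline()) -> List[str]:
    """Return a human-readable line for every baseline threshold that was breached."""
    problems = []
    if avg_packet_loss_pct > baseline.max_packet_loss_pct:
        problems.append(
            f"packet loss {avg_packet_loss_pct}% exceeds {baseline.max_packet_loss_pct}%"
        )
    # "Must be above" means that hitting the limit exactly also counts as a breach.
    if avg_bitrate_kbps <= baseline.min_bitrate_kbps:
        problems.append(
            f"bitrate {avg_bitrate_kbps}kbps is not above {baseline.min_bitrate_kbps}kbps"
        )
    return problems


for problem in breaches(avg_packet_loss_pct=1.2, avg_bitrate_kbps=310.0):
    print("ALERT:", problem)
```

An empty list means the run stayed within the baseline. Anything else is a reason to go look at what changed.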

Prepare Your WebRTC Application in Advance for Growth

There aren’t always warning signs that let you know when a rampaging horde of users may come to your door. And even when there are, you had better have some kind of a solution in place and be prepared to react. This preparation can be just knowing your numbers and having a resolution plan in place that you can roll out, or it can be an automated solution that doesn’t require any additional effort on your end.

To get there, here are some suggestions I have for you.

Find Your System’s Limits

In general, there are 3 main limits to look at:

  1. How big can a single session grow?
  2. How many users can I cram into a single server?
  3. How many users can my service serve concurrently?

You can read more on strategies and planning for stress testing and sizing WebRTC services. I want to briefly touch on these limits though.

1. How big can a single session grow?

Being able to handle 500 1:1 sessions doesn’t always mean you can handle 100 group sessions of 10 users each. The growth isn’t linear in nature. On top of that, the end result might choke your server or just provide bitrates that are too low above a certain number of users.
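A bit of arithmetic shows one reason for this. The sketch below assumes an SFU-style media server where every user sends one stream up and receives one stream from each other participant. That routing model is my assumption for the example, and the function name is made up.

```python
from typing import Tuple


def sfu_stream_counts(sessions: int, users_per_session: int) -> Tuple[int, int]:
    """Return (incoming, outgoing) stream counts at the media server for an SFU-style setup."""
    users = sessions * users_per_session
    incoming = users                            # each user sends one stream to the server
    outgoing = users * (users_per_session - 1)  # each user receives everyone else's stream
    return incoming, outgoing


print(sfu_stream_counts(500, 2))   # 500 sessions of 2 users
print(sfu_stream_counts(100, 10))  # 100 sessions of 10 users
```

Both scenarios have 1,000 users and 1,000 incoming streams, but the outgoing count goes from 1,000 to 9,000. Same number of users, very different load on the server.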

Make sure you know what’s the biggest session size you are willing to run.

Besides doing automated testing and checking the metrics against the baseline you want, you can always run an automated test using testRTC and at the same time join from your own browser to get a real feeling of what’s going on. Doing that will add the human factor into the mix.

2. How many users can I cram into a single server?

Most sizing tests are about understanding how many sessions/users/whatever you can fit in a single server. Once you hit that number, you should probably launch another server and use a load balancer to scale out.

Getting that number figured out based on your specific scenario and infrastructure is important.
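Once you have that per-server number, turning it into a server count is a simple ceiling division. A minimal Python sketch, with a hypothetical function name and made-up capacity figures:

```python
def servers_needed(concurrent_users: int, users_per_server: int) -> int:
    """Smallest number of servers that can hold `concurrent_users`."""
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    # Integer ceiling division avoids floating point rounding surprises.
    return -(-concurrent_users // users_per_server)


# Say sizing tests showed that one server is comfortable with 150 users.
print(servers_needed(1000, 150))
```

In practice you will probably want some headroom on top of this, so treat the result as a floor rather than a target.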

3. How many users can my service serve concurrently?

Now that you know how to scale out from a single server, growth can be thought of as linear (up to a point). So it is probably time to put in place automatic scale out and scale down and test that this thing works nicely.

Doing so will greatly reduce the potential destruction that a service hiccup can cause.

CDN and Caching

Make sure all of the HTML assets of your WebRTC application that are static are served through a CDN.

In some cases, when we stress test services, just putting 200 browsers in front of an HTML page that serves a WebRTC application can cause intermittent failures in loading the pages. That’s because the web serving part of the application is often neglected by WebRTC developers who are focusing their time and energy on the bigger resource hogs.

We’ve had numerous cases where the first roadblock we’ve hit with a customer was him forgetting to place a minor JavaScript file in the CDN.

Don’t be that person.

Geographically Distributed Deployment

The web and WebRTC are global, but traffic is local.

You don’t want to send users to the other side of the globe unnecessarily in your service. You want your media and NAT traversal servers to be as close to the users as possible. This gives you the flexibility of optimizing the backend network when needed.

Make sure your deployment is distributed across multiple datacenters, and that the users are routed to the correct one.

Philipp Hancke wrote how they do it at appear.in for their TURN servers.

Monitor Everything

CPU. Memory. Storage. Network. The works.

Add application metrics you collect from your servers on top of it.

And then add a testRTC monitor to check media quality end-to-end to make sure everything runs consistently.

Check across large time spans if there’s an improvement or degradation of your service quality.

Stress Testing

Check your system for the load you expect to be able to handle.

Do it whenever you upgrade pieces of your backend, as minor changes there may cause large changes in performance.

Don’t Let Things Out of Your Control

WebRTC has a lot of moving parts in it so deploying it isn’t as easy as putting up a WordPress site. You should be prepared for that surge in traffic, and that means:

  1. Understanding the baseline quality of your service
  2. Knowing where you stand with your sizing and scale out strategy
  3. Monitoring your service quality so you can react before customers start complaining

Do it on your own. Use testRTC. Use whatever other tool there is at your disposal.

Just make sure you take this seriously.



The 4 Techniques of Monitoring WebRTC Services

I remember that first time our servers went down after we had a couple of paying customers.

We got a call from a customer once. The only thing he wanted was to use our monitoring service. Since I knew him before, and knew he wasn’t interested in our monitoring – I asked him why.

I got something similar to this answer:

“We have monitoring on everything. We monitor the machine’s CPU, memory, storage. We look at the network. We collect metrics from our apps and monitor these as well. But yesterday we had a downtime of our service and we didn’t know it until a customer complained.”

Which brings me to the point – with WebRTC, it is extremely important to use end-to-end monitoring. It is also extremely important that this monitoring thingy you are putting in place knows a thing or two about WebRTC, otherwise, how will you know if the customer is really getting that video call or just looking at a blank screen?

Great. So now that we know we have a problem what’s the solution?

Luckily (or not?), there’s more than one way to handle monitoring WebRTC services. I like characterizing the solution based on 2 parameters, making for a nice set of quadrants to visualize it:

I’ll be using the terms active and passive here to describe the probing technique in a way that might be somewhat confusing to some, but for me this works.

Active monitoring is a system which actively generates traffic in the monitored product, using the generated traffic and the product’s behavior to determine its health.

Passive monitoring is a system which passively collects metrics off the different product components, determining from that the product’s health.

The exact definition/architecture of what is Cloud / SaaS versus what is on premise ends up depending, for me, on which probing technique you refer to – active or passive monitoring. Let’s see how they compare (and along the way explain what cloud and on premise is in each case).

#1 – Active Monitoring (Cloud / SaaS)

Active monitoring is for us the most popular monitoring service that our customers subscribe to.

The way such a monitor works?

  • It has a specific scenario it executes
  • It runs it at a given frequency
  • It validates a certain set of expectations, deciding if there were any failures requiring raising an alert

The WebRTC monitoring frequency pyramid above shows the various frequencies such a monitor can employ.

A daily monitor is akin to a ping – a healthcheck placed on a demo system for example; while a 1-minute monitor is mission critical – it is there to find issues and alert about them as soon as possible and before your customers notice them.
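The three ingredients listed above (a scenario, a frequency and a set of expectations) map nicely onto a small data structure. The Python sketch below is only an illustration of that shape, with hypothetical names and a fake scenario standing in for a real browser session. It is not how testRTC is built.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ActiveMonitor:
    # Runs one synthetic session and returns the metrics it measured.
    scenario: Callable[[], Dict[str, float]]
    # One check per metric name; a check returns True when the value is acceptable.
    expectations: Dict[str, Callable[[float], bool]]
    # How often the scenario should run (a scheduler would use this; not shown here).
    frequency_seconds: int

    def run_once(self) -> List[str]:
        """Execute the scenario and return the names of the expectations that failed."""
        metrics = self.scenario()
        failed = []
        for name, is_ok in self.expectations.items():
            # A metric that is missing altogether counts as a failure.
            if name not in metrics or not is_ok(metrics[name]):
                failed.append(name)
        return failed


def fake_scenario() -> Dict[str, float]:
    # Stand-in for a real browser session; the numbers are made up.
    return {"bitrate_kbps": 320.0, "packet_loss_pct": 0.1}


monitor = ActiveMonitor(
    scenario=fake_scenario,
    expectations={
        "bitrate_kbps": lambda v: v > 500,
        "packet_loss_pct": lambda v: v <= 0.5,
    },
    frequency_seconds=900,
)

failures = monitor.run_once()
if failures:
    print("ALERT:", ", ".join(failures))
```

The alerting decision is the last step: any failed expectation in a run is what would trigger a notification.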

The cloud part of the active monitor is about the machines used to run your service. You deploy them in the cloud, probably on a managed monitoring service (we’ve got one for you). It means less setup hassle and also the ability to decide the geographical location of these machines.

Why use active monitoring?

  1. When your service runs at specific hours of the day. Contact centers for example, or doctor appointments. They tend to have their own “opening hours”, but what happens when the system breaks outside of opening hours? When do you get notified about it? When the first customer complains at the beginning of the shift? Or 5 hours earlier when you get an alert from an active monitoring system? In order to get alerts ahead of time here, you need a “non-user” to join the session
  2. When the failure occurs before WebRTC altogether. Sure you have a great way to monitor calls that happen to interact with the WebRTC APIs. But what if the service failure occurs earlier? Like a connection error between your web server and the directory service? An active monitor that runs end-to-end can find and pinpoint such issues
  3. Consistency. Passive monitors show the experience of your users. But it can’t reproduce the same settings to show you if and how you improved – and it is devilishly hard to decide if the problem is a user problem or a service problem. An active monitor can be configured to run in very specific network configurations – over and over again. Its results can be compared in certain timeframes to show the objective degradation or improvement of the service
  4. Zero instrumentation. Nothing needs to change in your service to accommodate for active monitoring. The active probes that will interact with your service accommodate themselves to whatever you are doing today

Not all is rosy here though. To set up a good active monitor you need to plan a use case that fits nicely. One in which the UI of your service is predictable and simple enough to automate. I’ve seen a couple of instances where monitors failed due to inconsistencies in the UI which caused service failures – things that humans would be comfortable with but automation would not be.

#2 – Active Monitoring (On premise)

An On premise active monitoring solution is similar to a cloud based active monitoring solution with one minor difference: the probes that are used are deployed “on premise” as opposed to “in the cloud”.

What does it mean exactly?

For an education service, where teachers and students can be anywhere, a cloud based approach works great. It actually mimics how the service is used “live”. So having the probes deployed strategically across the globe in different locations makes a lot of sense.

But for a contact center for example, where the agent sits inside the office, you sometimes want to have a monitor on site – a machine dedicated to monitoring also the network constraints that your agents feel – placing the machine within the same subnet on your local LAN.

So, the difference between Cloud and On premise Active Monitoring in WebRTC?

To sum things up – you deploy the probes on premise or in the cloud, but collecting and analysis can happen in both approaches in the cloud. Oh, and obviously, you can also end up deploying some probes on premise and others in the cloud (especially for a call center scenario).

The advantage of the on premise approach is that you get closer to real life scenarios with it for the use cases where you can place your users at a given location.

The main disadvantage is that this is usually a bit more expensive and time consuming to set up and maintain (there’s less of an option to use economies of scale fairy dust for it).

#3 – Passive Monitoring (Cloud / SaaS)

With passive monitoring, there are no real probes. We treat each and every user who interacts with the WebRTC service as a “probe for hire”, available if and when he decides to interact with the service.

In its Cloud variant, the data pulled off the device gets shipped to the cloud, to a third party service that aggregates and analyzes the metrics available in WebRTC (usually by means of getStats() calls).

The advantage of this approach is that it gives you the data and analysis on your real users’ interactions. You can’t get any closer to reality than that. It is also easy to set up and get started with.
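On the receiving end, the aggregation step can be as simple as grouping what the clients reported by session and averaging it. Here is a minimal Python sketch of that idea. The sample layout (a `session_id` plus a few numeric metrics) and the field names are assumptions for illustration, not the format of any particular service.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def aggregate_by_session(samples: List[dict]) -> Dict[str, Dict[str, float]]:
    """Average every numeric metric per session across the samples clients reported."""
    grouped: Dict[str, Dict[str, List[float]]] = defaultdict(lambda: defaultdict(list))
    for sample in samples:
        session = sample["session_id"]
        for key, value in sample.items():
            # Skip the identifier and anything that is not a number.
            if key != "session_id" and isinstance(value, (int, float)):
                grouped[session][key].append(value)
    return {
        session: {metric: mean(values) for metric, values in metrics.items()}
        for session, metrics in grouped.items()
    }


# Made-up samples, as if two users in session s1 and one in s2 reported their stats.
samples = [
    {"session_id": "s1", "packet_loss_pct": 0.2, "bitrate_kbps": 600},
    {"session_id": "s1", "packet_loss_pct": 0.4, "bitrate_kbps": 400},
    {"session_id": "s2", "packet_loss_pct": 1.0, "bitrate_kbps": 250},
]

for session, metrics in aggregate_by_session(samples).items():
    print(session, metrics)
```

Note what is missing: if nobody used the service, there are no samples and therefore nothing to aggregate.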

There are certain disadvantages though:

  1. Uptime. There is no indication of uptime here. If no users call the doctor before 8am, then you get no data for the time the system is idle – and no visibility towards its health
  2. Predictability. A session may experience failures or issues that relate to the user’s device or network. You will definitely want to optimize your service as much as possible for such cases as well, but it will be hard to check for objective trends of the service’s quality in such a way
  3. Privacy. You send the metrics about your service’s real live traffic to a third party, who can easily discern the size of your operation
  4. Instrumentation. You need to modify your product’s code to integrate with a passive monitoring solution. This will typically be a minor hindrance, but will be there

#4 – Passive Monitoring (On premise)

In many cases, people end up using homegrown passive monitoring systems.

What they do is collect data off the devices and then aggregate and analyze it in their own backend monitoring system. Terms like Elasticsearch and Kibana and Graylog get thrown into the air – or god forbid – Big Data.

The biggest advantage here? You collect, get and analyze exactly what you want to. Oh – and you can also easily enrich that information you collect with your business logic and other metrics unrelated to WebRTC. In many cases, this is the reason I’ve seen vendors foregoing the cloud based passive monitoring approach – the need for enrichment and wider analysis.

The big disadvantage here is probably time and material. Putting such an operation in place can be time consuming and expensive. It requires developers to work on your monitoring infrastructure which no one sees at the end of the day instead of having them focus on your core product’s offering and features.

We’re in the process of running a pilot with an on premise passive monitoring product. If you want to learn more, just contact us.

Which shall it be?

Passive or active. Cloud or on premise.

If you are serious with what you are doing, and want to run it as a business – a viable commercial service – then you will need monitoring.

I urge you not to be happy enough with web based monitoring solutions and also go for an end-to-end type of a monitoring service that understands WebRTC.