Monitoring Vidyo’s WebRTC Infrastructure End-to-End on a Global Scale

Vidyo has been using testRTC for the past two years to monitor its global WebRTC infrastructure end-to-end.

Vidyo offers high-quality cloud video conferencing services to its impressive list of customers. There are three main product lines at Vidyo:

  1. VidyoConnect – a managed enterprise meeting solution for team collaboration
  2. VidyoEngage – a live video chat platform for call center customer engagement
  3. Vidyo.io – cloud APIs for embedded video communications in applications

All of these product lines share the same core video platform with WebRTC capabilities.

Vidyo caters to large enterprises running mission-critical systems, so from the start it put in place a sophisticated system to monitor its infrastructure and service. That system is built on top of Splunk, where logs from across Vidyo’s systems are aggregated and filtered, letting different types of alerts bubble up to the relevant teams within Vidyo via PagerDuty or email, depending on the seriousness of the alert.

End-to-End Monitoring

Early on, Vidyo saw the need for an end-to-end capability within its monitoring system: a way to simulate real customers from all over the globe and raise an alert on any issue. This is why Vidyo chose testRTC.

testRTC enabled Vidyo to create a scenario in which testRTC’s probes connect to any of Vidyo’s cloud products, authenticate with the service, join a meeting room, and send and receive voice and video data in real time.

While Vidyo already monitored its different machines and subsystems, adding testRTC meant it could monitor the service as real users experience it, using a predictable scenario and doing so at scale.

Integrating with an existing monitoring system

Vidyo wanted to collect and push monitor run results from testRTC into its Splunk big data repository of machine data. Run results from testRTC are automatically inserted into Vidyo’s Splunk repository using testRTC’s webhook mechanism.
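As an illustration only, the forwarding step could look like the Python sketch below. It assumes the results are delivered to Splunk's HTTP Event Collector (HEC), which the article does not confirm. The endpoint host, the token, the sourcetype name and the fields of the run result are all hypothetical placeholders, not testRTC's actual webhook payload.

```python
import json
import urllib.request

# Hypothetical values: replace with your own Splunk HEC endpoint and token.
HEC_URL = "https://splunk.example.com:8088/services/collector/event"
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"


def build_hec_request(run_result, url=HEC_URL, token=HEC_TOKEN):
    """Wrap a monitor run result (as received from a webhook) in a Splunk HEC event."""
    envelope = {
        "sourcetype": "testrtc:monitor",  # assumed sourcetype name
        "event": run_result,              # payload fields are not specified in the article
    }
    return urllib.request.Request(
        url,
        data=json.dumps(envelope).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",  # HEC token authentication header
            "Content-Type": "application/json",
        },
        method="POST",
    )


def forward_to_splunk(run_result):
    """Send the event to Splunk and return the HTTP status code."""
    with urllib.request.urlopen(build_hec_request(run_result), timeout=10) as resp:
        return resp.status
```

A webhook handler would call `forward_to_splunk(payload)` for each run result it receives. Building the request is kept separate from sending it so the envelope can be inspected without network access.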

Collecting that data gave Vidyo the power to fine-tune the feedback it received from testRTC, deciding whether a failure is low priority, such as one occurring randomly, or high priority, such as failures occurring across monitors in a short period of time.
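The prioritization rule described above can be sketched in a few lines of Python. This is not Vidyo's actual logic: the 10-minute window, the threshold of two distinct monitors and the record fields (`monitor`, `time`) are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # assumed correlation window
MIN_MONITORS = 2                # assumed number of distinct failing monitors


def classify(failure, recent_failures, window=WINDOW, min_monitors=MIN_MONITORS):
    """Return "high" if enough distinct monitors failed within the window
    ending at this failure, otherwise "low"."""
    start = failure["time"] - window
    # Collect the distinct monitors that failed inside the window.
    monitors = {
        f["monitor"]
        for f in recent_failures
        if start <= f["time"] <= failure["time"]
    }
    monitors.add(failure["monitor"])
    return "high" if len(monitors) >= min_monitors else "low"
```

With these assumptions, a single monitor failing on its own is treated as low priority, while a second monitor failing a few minutes later raises the priority to high.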

A global infrastructure

Every data center that Vidyo operates from gets its own special treatment. Vidyo runs a testRTC monitor for each of the product lines hosted within that data center.

Each monitor makes use of probes running independently from different locations worldwide, which adds another layer of monitoring to the solution. testRTC can check different routes and behaviors, with the intent of catching network issues as early as possible.

Whenever a new data center opens up, or a new geography needs to be served, Vidyo is able to modify an existing monitor or create a new testRTC monitor to cover that location.
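One way to keep that coverage honest is to compare the product lines hosted in each data center against the monitors that exist. The Python sketch below assumes a simple inventory kept by the operations team. The data center names are hypothetical and the function is not part of testRTC.

```python
def missing_monitors(hosted, monitors):
    """Return the (data_center, product_line) pairs that have no monitor.

    hosted:   dict mapping a data center name to the set of product lines it hosts
    monitors: set of (data_center, product_line) pairs that already have a monitor
    """
    return sorted(
        (dc, product)
        for dc, products in hosted.items()
        for product in products
        if (dc, product) not in monitors
    )


# Hypothetical inventory: two data centers, one pair not yet covered.
hosted = {"dc-a": {"VidyoConnect", "Vidyo.io"}, "dc-b": {"VidyoEngage"}}
monitors = {("dc-a", "VidyoConnect"), ("dc-b", "VidyoEngage")}
gaps = missing_monitors(hosted, monitors)
```

Here `gaps` lists the pairs that still need a monitor, which is the kind of check that would flag a newly opened data center.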

It just works

testRTC runs continuously and relentlessly, connecting calls via Vidyo’s platform. It does so in a predictable fashion, collecting all logs along the way. Vidyo has learned to see the value in such an approach: random failures can be debugged post mortem to find their root causes, which helps uncover bugs and points of failure in the system.

“testRTC is a key component in Vidyo’s monitoring system. Digging down to the root cause is part of the work culture at Vidyo, and using testRTC we have eyes on the system 24×7 and can investigate issues thoroughly ensuring operational excellence for the benefit of our customers.”

Nahum Cohen, SVP, Service and Operations @ Vidyo

Using testRTC, Vidyo is able to find issues with data centers, networks and its platform before customers notice them, giving it the time needed to resolve these issues.

Moving Forward with testRTC

Vidyo is in the process of introducing testRTC’s monitors to additional data centers it currently operates, making sure its service is monitored end-to-end in all of its locations.