Tag Archives for " debugging "

WebRTC: To Mechanical Turk or NOT to Mechanical Turk

I’ve seen this a few times already. People look at an automated process – only to replace it with a human one. For some reason, there’s a belief that humans are better. And grinding the same thing over and over and over and over and over again.

They’re not. And there’s a place for both humans and machines in WebRTC product testing.

WebRTC, Mechanical Turk and the lack of consistency

The Amazon Mechanical Turk is a great example. You can easily take a task, split it between many people, and have them do it for you. Say you have a list of a million songs and you wish to categorize them by genre. You can get 10,000 people in Amazon Mechanical Turk to do 100 lines each from that list and you’re done. Heck, you can have each to 300 lines and for each line (now with 3 scores), take the most common Genre defined by the people who classified it.

Which brings us to the problem. Humans are finicky creatures. Two people don’t have the same worldview, and will give different Genre indication to the same song. Even worse, the same person will give a different Genre to the same song if enough time passes (enough time can be a couple of minutes). Which is why we decided to show 3 people the same song to begin with – so we get some conformity in the decision we end up with on the Genre.

Which brings us to testing WebRTC products. And how should we approach it.

Here’s a quick example I gleaned from the great discuss-webrtc mailing list:

discuss-webrtc bug report

There’s nothing wrong with this question. It is a valid one, but I am not sure there’s enough information to work off this one:

  • What “regardless of the amount of bandwidth” is exactly?
  • Was this sent over the network or only done locally?
  • What resolution and frame rate are we talking about?
  • Might there be some packet loss causing it?
  • How easy is it to reproduce?

I used to manage the development of VoIP products. One thing we were always challenged by is the amount of information provided by the testing team in their bug reports. Sometimes, there wasn’t enough information to understand what was done. Other times, we had so many unnecessary logs that you either didn’t find what was needed or felt for the poor tester who spent so much time collecting this stuff together for you with no real need.

The Tester/Developer grind cycle

Then there’s that grind:

Test-Dev grind cycle

We’ve all been there. A tester finds what he believes is a bug. He files it in the bug tracking system. The developer can’t reproduce the bug, or needs more information, so the cycle starts. Once the developer fixes something, the tester needs to check that fix. And then another cycle starts.

The problem with these cycles is that the tester who runs the scenario (and the developer who does the same) are humans. Which makes it hard for repeated runs of the same scenario to end up the same.

When it comes to WebRTC, this is doubly so. There are just too many aspects that are going to affect how the test scenario will be affected:

  • The human tester
  • The machine used during the test
  • Other processes running on said machine
  • Other browser tabs being used
  • How the network behaves during the test

It is not that you don’t want to test in these conditions – it is that you want to be able to repeat them to be able to fix them.

My suggestion? Mix and match

Take a few cases that goes through the fundamental flows of your service. Automate that part of your testing. Don’t use some WebRTC Mechanical Turk in places where it brings you more grief than value.

Augment it with human testers. Ones that will be in charge of giving the final verdict on the automated tests AND run around with their own scenarios on your system.

It will give you the best of both worlds, and with time, you will be able to automate more use cases – covering regression, stress testing, etc.

I like to think of testRTC as the Test Engineer’s best companion – we’re not here to replace him – just to make him smarter and better at his job.

WebRTC Test Automation and where it fits in your roadmap

I see mixed signals about the popularity and acceptance of test automation. It is doubly so when testing WebRTC.

Time to consider some serious WebRTC test automation.

In favor of automation

A tester automated his job for 6 years – most probably a hoax, but one that rings partially true. The moral of the story is simple – if you invest time in automating rudimentary tasks – you get your ROI back tenfold in the future.

That’s… about it.

We have customers who use us to automate areas of their testing, but not many. At least not as many as I’d expect there to be – WebRTC being new and all – and us looking at best practices and changing our bad ways and habits of the past when stating with green field projects.

Against automation

Why is Manual QA Still So Prevalent? – it seems like SauceLabs, who delve into general purpose browser automation, is also experiencing the same thing. Having companies focus on manual testing instead of moving to automation.

Best explanation I heard from someone? They can get a cheap tester to do the work for them by outsourcing it to a developing country and then it costs them less to do the same – just with humans.

For me, that’s taking Amazon’s Mechanical Turk a step too much. For a repetitive task that you’re going to do in each and every release (yours and of browser vendors), to have different nameless faces (or even named ones) do the same tasks over and over again?

Dog-fooding at testRTC

We’ve been around for almost 2 years now. So it is high time we start automating our own testing as well.

The first place where we will be automating our own testing is in making sure our test related feature set works:

  • Our special script commands and variables
  • Running common test scenarios that our customers use in WebRTC

Now, we have test scripts that run these tests, so we can automate them individually. Next step would be to run them sequentially with a “click of a button”. Or more accurately, an execution of a shell script. Which is where we’re taking this in our next release.

The rest will stay manual for now. Mostly because in each version we change our UI based on the feedback we receive. One of our top priorities is to make our product stupidly simple – so that our customers can focus on their own product and need to learn as little as possible (or nothing at all) to use testRTC.

Why our customers end up automating?

There are several huge benefits in automating at least parts of your testing. Here are the ones we see every day from the way our customers make use of WebRTC:

  • Doing the most basic sanity tests – answering the question “is it broken?” and getting an answer fast with no human intervention. This is usually coupled with continuous integration, where every night the latest build is tested against it
  • Scale tests – when a service needs to grow, be it to 10 users in the same session, 40 people across 20 1:1 sessions or 100 viewers of a webinar – it becomes hard to manually test. So they end up writing a simple script in our platform and running it on demand when the time comes to stress test their product
  • Network configurations – taking a script and running it in various network conditions – with and without forcing TURN, packet losses, etc. Some also add different data center locations for the browsers and play with the browser versions used. The idea is to get testing to the edge cases where a user’s configuration is what comes back to bite you
  • Debugging performance – similar to scale tests, but slightly different. Some require the ability to check the capacity of a given machine in their product. Usually the media server. There’s much to be said about that, but being able to run a large scale test, analyze the performance report testRTC produces, and then rinse and repeat means it is easier to find the bottlenecks in the system and fix them prior to deployment

Starting out with WebRTC, we’ve seen other things getting higher priority by customers. They all talk about scenarios and coverage of their test plans. Most don’t go there due to that initial high investment.

What we do see, and what effectively improves our customer’s product, is taking one scenario. Usually a simple one. Writing it in a way that allows for scaling it up. Once a customer runs it for a few days, he sees areas he needs to improve in his product, and how that simple script can expand to encompass more of his testing needs.

This is also why we try to be there with our customers every step of the way. From assisting in defining that test, to writing it and following through with analysis if need be.

Are you serious about your WebRTC product? Don’t waste your time and try us out.