TestToTester

Oracle deficiency?


We were in the middle of a release. It was late and we had just found a bug, a show-stopper for a feature. Ben then said, "I always wonder what separates a good tester from an average tester. Most of the time I feel it's their ability to spot the important bugs at the right time."

Those words have stuck with me ever since. And then this happened, at another site.

I observed "attack-name": "HEADER_COUNT_EXCEEDED" in an API response when I used postman to poke some of the APIs built. The error was not consistent but I just did not feel right about this. Checked with the team (lets call them teamA) testing the APIs and it appears they had never seen it before. So I checked with another team (lets call this teamB) who were consuming the APIs but building something different. It was the same response. I was aware that both teams used J-meter to test APIs and I was poking it via postman. Could that be a problem?

I built a small Postman collection to iterate over the API calls and check whether the error was reproducible. I got 6 failures out of 10 calls. When I shared this observation with Team A again, the recommendation was to use JMeter instead of Postman! Really? Am I the only one seeing a problem here?

I pinged my developer and asked if she was willing to pair up and investigate, and her reply was exactly what I was looking for: "Oh yeah, one of our remote devs spotted this yesterday, but because it was not consistent we thought it was a transient issue." I love words like 'transient' and 'buggered' - to me they always mean there is something that cannot be explained logically, and so could be a potential bug.

I showed her my collection and she was convinced to dig in. Fifteen minutes into tracing the logs, we found that a new rule had been set on the WAF to block requests with more than X headers. Postman with the accept-version header was sending X+1 headers; without accept-version, it fell just under the WAF-configured limit of X. This also explained why JMeter never picked it up: JMeter sent only X-2 headers, including accept-version.

When we took this to the architect, we learnt that a lot of the headers being passed were informational but not essential. So a decision was made to strip out the headers that are not required (except for what is needed by the WAF, audit and logging).

This observation could have been a potential show-stopper!

Looking back at this, I understand why the two teams did not spot it. It could very much be down to both of them using only JMeter to test the APIs. But what really surprised me was the complete absence of the question - could that be a problem? It makes me wonder whether that comes down to a lack of oracles.

Another learning for me was that this bug might never have been picked up in teams or companies that strive for uniformity in their toolsets, frameworks, processes, etc.

Exploring Hypothesis with pytest - Part 1

I found the Hypothesis library very intriguing, so I have started exploring it. Here are my notes from the first exploratory session.

What is Hypothesis?
From their webpage - Hypothesis is a Python library for creating unit tests which are simpler to write and more powerful when run, finding edge cases in your code you wouldn’t have thought to look for. It is stable, powerful and easy to add to any existing test suite.
You can read more about it and its claims here -> https://hypothesis.readthedocs.io/en/latest/index.html

I am definitely not a fan of tall claims like "finding edge cases in your code you wouldn’t have thought to look for"! Anyway, I will come back to that later.

For now, I wanted to explore its claim of being "easy to add to any existing test suite". So I picked pytest.

What is pytest?
Again from their webpage - The pytest framework makes it easy to write small tests, yet scales to support complex functional testing for applications and libraries.
More here -> https://docs.pytest.org/en/latest/

Now I wanted a program to explore this unit-testing library with, so I picked the first Practice Python exercise from https://www.practicepython.org/exercise/2014/01/29/01-character-input.html

Here is my program for the exercise: "Create a program that asks the user to enter their name and their age. Print out a message addressed to them that tells them the year that they will turn 100 years old."

Filename - whenYouAre100.py
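
The original code is not preserved in this copy. Below is a minimal sketch of what whenYouAre100.py could look like, assuming the current year is 2019 (to match the expected values further down) and a small class holding the user's age (self.age is referenced later in the post); the class and method names are my own placeholders.

# whenYouAre100.py - a minimal sketch, not the original code
class WhenYouAre100:
    def __init__(self, name, age):
        self.name = name
        self.age = int(age)

    def year_when_100(self):
        # Year in which the user turns 100, assuming the current year is 2019
        return 2019 + (100 - self.age)

if __name__ == "__main__":
    name = input("Enter your name: ")
    age = input("Enter your age: ")
    person = WhenYouAre100(name, age)
    print(name + ", you will turn 100 in the year " + str(person.year_when_100()))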

Then I started adding some unit tests in pytest for the code, starting with the checks below:
  • What happens if I pass 100? expected = 2019
  • What happens if I pass 0? expected = 2119
  • What happens if I pass 101? expected = 2018
  • What happens if I pass a random number like 50? expected = 2069

Filename - test_whenYouAre100.py
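
Again, the original test file is not shown in this copy. A sketch of test_whenYouAre100.py covering the four checks above, assuming the class and method names from the earlier sketch:

# test_whenYouAre100.py - a sketch of the four example-based checks
from whenYouAre100 import WhenYouAre100

def test_age_100():
    assert WhenYouAre100("Ben", 100).year_when_100() == 2019

def test_age_0():
    assert WhenYouAre100("Ben", 0).year_when_100() == 2119

def test_age_101():
    assert WhenYouAre100("Ben", 101).year_when_100() == 2018

def test_age_50():
    assert WhenYouAre100("Ben", 50).year_when_100() == 2069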

On running the tests (command: pytest -v test_whenYouAre100.py), all tests passed.


Time to bring in Hypothesis

Instead of all the above tests, I defined just this one test:
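
The exact test is not preserved in this copy. Based on the range mentioned below (min_value=0 to max_value=101), it would have looked something like this sketch; the expected-year calculation is my assumption:

# A sketch of the single property-based test
from hypothesis import given
from hypothesis import strategies as st
from whenYouAre100 import WhenYouAre100

@given(st.integers(min_value=0, max_value=101))
def test_year_when_100(i):
    # Hypothesis picks the ages instead of us hard-coding four examples
    assert WhenYouAre100("Ben", i).year_when_100() == 2019 + (100 - i)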


On running the test (command: pytest --hypothesis-show-statistics test_whenYouAre100.py):



It appears that Hypothesis ran 100 examples from the range min_value=0 to max_value=101.
I was not sure which examples were actually run. I tried to get it to log all the examples (it's called the database in Hypothesis), but for now I couldn't figure it out. So instead I started testing it.

I modified the function in the above code to:
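
The exact modification is not shown in this copy. A hypothetical change that behaves the way described below (the four example values still pass, but Hypothesis finds a failure and shrinks it to i = 25) would be something like:

# Hypothetical modification - not the author's exact change, just one that
# reproduces the behaviour described: 0, 50, 100 and 101 still pass,
# but ages in the range 25-49 return a wrong year
def year_when_100(self):
    if 25 <= self.age < 50:
        return 2019 + (100 - self.age) + 1  # off-by-one for this sub-range
    return 2019 + (100 - self.age)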


With this change in the code, running the 4 unit tests (the ones with values 100, 101, 0 and 50) will not catch the bug. Hopefully Hypothesis can catch it?

So I ran the test written using Hypothesis again and, boom, the test failed at the value i = 25.



I then changed self.age to 99 in the program, and again the result was a failure.



That was pretty cool. 

So from this first bit of exploring, Hypothesis does look pretty handy: slick, easy to integrate, I like how it prints the failing examples, and I can let a library pick examples for me.

But I absolutely hate the tall claims - words like "finding edge cases in your code you wouldn’t have thought to look for". Really?

If I were to run with only integers(), without a range defined, I might have had to run this test n times before it caught the bug!
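
That is, a version like this sketch, with an unconstrained strategy:

# The same test without min_value/max_value - a sketch only
@given(st.integers())
def test_year_when_100_unbounded(i):
    # Hypothesis now draws from a much wider space of integers,
    # so the narrow failing region may not be hit on a given run
    assert WhenYouAre100("Ben", i).year_when_100() == 2019 + (100 - i)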

That said, I do like the library so far. Stay tuned for more notes on it.


