Oracle deficiency?
We were in the middle of a release. It was late, and we had just found a bug, a show stopper for a feature. Ben then said, "I always wonder: what separates a good tester from an average tester? Most times I feel it's their ability to spot the important bugs at the right time."
These words have stuck with me ever since. And then this happened at another site.
I observed "attack-name": "HEADER_COUNT_EXCEEDED" in an API response when I used Postman to poke at some of the APIs that had been built. The error was not consistent, but something about it did not feel right. I checked with the team testing the APIs (let's call them TeamA), and it appeared they had never seen it before. So I checked with another team (let's call them TeamB), who were consuming the APIs but building something different. I got the same answer. I was aware that both teams used JMeter to test the APIs, while I was poking at them via Postman. Could that be a problem?
I built a small collection to iterate the API calls and check whether it was reproducible. 6 of the 10 calls failed. When I shared this observation with TeamA again, the recommendation was to use JMeter instead of Postman! Really? Am I the only one seeing a problem here?
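For anyone who wants to try the same kind of check outside Postman, here is a minimal sketch in Python. It is not the collection I used. The names `blocked_attempts` and `make_stub_sender` are made up for illustration, and the stub stands in for a real HTTP call so the example runs without a network. The only thing taken from the real response is the HEADER_COUNT_EXCEEDED marker.

```python
from itertools import cycle
from typing import Callable, List

ATTACK_MARKER = "HEADER_COUNT_EXCEEDED"  # the value seen in the API response


def blocked_attempts(send: Callable[[], str], attempts: int = 10) -> List[int]:
    """Call send() `attempts` times; return the 1-based numbers of blocked calls."""
    blocked = []
    for attempt in range(1, attempts + 1):
        body = send()  # hypothetical: returns the raw response body as text
        if ATTACK_MARKER in body:
            blocked.append(attempt)
    return blocked


def make_stub_sender(pattern: List[bool]) -> Callable[[], str]:
    """Stand-in for a real HTTP call: replays a fixed blocked/ok pattern."""
    outcomes = cycle(pattern)

    def send() -> str:
        if next(outcomes):
            return '{"attack-name": "HEADER_COUNT_EXCEEDED"}'
        return '{"status": "ok"}'

    return send


if __name__ == "__main__":
    # The pattern is invented; it only imitates an intermittent failure.
    sender = make_stub_sender([True, False, True, True, False])
    failures = blocked_attempts(sender, attempts=10)
    print(f"{len(failures)} of 10 calls blocked: {failures}")
```

Swapping the stub for a function that makes the real request is all it would take to turn "I saw it once" into "I see it N times out of 10", which is a much harder observation to wave away.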
I pinged my developer and asked if she was willing to pair and investigate, and her reply was exactly what I was looking for: "Oh yeah, one of our remote devs spotted this yesterday, but because it was not consistent we thought it was a transient issue." I love words like 'transient', 'buggered', etc. To me they always mean there is something that cannot be explained logically, and so there could be a potential bug.
I showed her my collection and she was convinced to dig in. 15 minutes into tracing the logs, we found that a new rule had been set on the WAF to block requests with more than X headers. Postman with the accept-version header was sending X+1 headers; without it, the request fell just within the configured WAF limit (X). This also helped us understand why JMeter never picked it up: from JMeter, only X-2 headers were sent, including accept-version.
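The arithmetic is easier to see written down. Below is a rough sketch of the rule as I understood it, not the actual WAF configuration. The limit of 10 is a hypothetical stand-in for X, and `waf_allows` and `make_headers` are names invented for this example.

```python
from typing import Dict

MAX_HEADERS = 10  # hypothetical stand-in for X; the real limit is not given here


def waf_allows(headers: Dict[str, str], max_headers: int = MAX_HEADERS) -> bool:
    """Allow the request unless it carries more than max_headers headers."""
    return len(headers) <= max_headers


def make_headers(count: int) -> Dict[str, str]:
    """Build `count` placeholder headers named X-Demo-1 .. X-Demo-N."""
    return {f"X-Demo-{i}": "value" for i in range(1, count + 1)}


if __name__ == "__main__":
    clients = {
        "Postman with accept-version (X+1)": make_headers(MAX_HEADERS + 1),
        "Postman without accept-version (X)": make_headers(MAX_HEADERS),
        "JMeter (X-2)": make_headers(MAX_HEADERS - 2),
    }
    for name, headers in clients.items():
        verdict = "allowed" if waf_allows(headers) else "blocked"
        print(f"{name}: {len(headers)} headers, {verdict}")
```

Seen this way, the two tools were never sending the same request, so a pass in one said very little about the other.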
When we took this to the architect, we learnt that a lot of the headers being passed were informational but not essential. So a decision was made to strip out the headers that were not required (keeping those needed by the WAF, audit and logging).
This observation could have been a potential show stopper!
Looking back at this, I understand why the two teams did not spot it. It could very much be down to both using only JMeter to test the APIs. But what really surprised me was the complete lack of "could that be a problem?" This makes me wonder if it's because of a lack of oracles.
Another learning for me was that this bug might never have been picked up at teams or companies that strive for uniformity in their toolset, frameworks, processes, etc.