After the confusion around not publishing Michael Bolton’s comment in response to my earlier post was sorted out, Michael has replied (thanks Michael!) with an elaborate post: All Testing is (not) Confirmatory, providing reasoning in favor of his concept of “Testing versus Checking”. Meanwhile, I had a discussion with Pradeep Soundararajan on the comment episode and he pointed out a very important thing: “We shouldn’t let the software issues come in the way of relationships”. I can’t agree more, because the time that could have been spent discussing and building on the argument was instead spent proving that I didn’t get Michael’s comment. Next time, I would reach out to the person with such objections via email rather than using Twitter as the platform.
Having said that, let’s start with the actual discussion. If you want to get the best out of this discussion, please read the following posts before reading any further.
- Posts on “Testing versus Checking” (Michael)
- All Testing is Confirmatory (Rahul)
- All Testing is (not) confirmatory (Michael)
I am not going to quote complete sections by Michael here, only the gist of his discussion points. Please see his original wonderful post, All Testing is (not) Confirmatory, where he has challenged my statement that all testing is confirmatory.
Michael: Relates the example of a Greek mythological character, Procrustes, and claims that I tried to fit “Confirmation” into Procrustes’ bed.
This was an interesting analogy. Let me start with another one from Hindu Mythology (Indian Epic – Mahabharata):
“Yudhishthira (one of the Pandavas) and Duryodhana (one of the Kauravas) were sent on a world tour to see how the world is doing. When they came back, Duryodhana said the world is very bad, everyone is bad, the world is as good as hell. In contrast, Yudhishthira mentioned that the world is great, people are very good, the world is as good as heaven. (Yudhishthira and Duryodhana represent positive and negative characters in the epic, respectively)”
Moral of the story: We see the world the way we are ourselves.
I have related this story to express the irony I feel: it is Michael’s division of testing that mimics Procrustes.
Whatever I think about the myth Michael related, my argument wouldn’t be complete if I didn’t answer him in the language of his analogy. So, let’s talk about the Greek myth. One essential point that Michael didn’t mention about the myth is that some experts think that Procrustes had two beds, a smaller one and a larger one, and not one bed. So, no guest would ever fit, as he would change the choice of bed.
Going by this analogy, here is why I see myself as different from Procrustes in the context of guests:
- I have a single bed for guests – I treat “testing” as a big adjustable bed – a bed, big enough to accommodate all my guests.
- I get very few guests. I don’t name the same guest and calculate bed space differently just because s/he is wearing different clothes than earlier. Most of my learning about testing happens via a handful of guests. I can easily recollect their names.
- At times people have weird names. Their names may not reflect their personality. Because I have got a few guests, I *know* them. So, their names really don’t make any difference to me. I have had a lot of discussion with them, I know about their personalities and various moods. Other people might know them differently; I’m not worried about that.
- As I always expect multiple guests to be at my home, I don’t think about cutting or stretching a guest to fit into the bed. Stretching is out of the question. I can’t block the full bed for a single guest. Cutting is not required as I would rather adjust the bed if needed so that all my guests can co-exist on the same bed comfortably.
- The “exploration” and “confirmation” guests are permanently at my home. They never leave the bed.
- As I don’t have multiple beds, I don’t need to divide my guests. I don’t need to attach special importance to a particular guest and provide him the longer bed. There aren’t any such choices. I don’t differentiate between guests.
This is why I think Michael mimics Procrustes: (I hope Michael doesn’t take this one personally, the way I didn’t take his analogy personally, rather as a discussion to further the argument)
- He appears to be the Procrustes with two beds – “Exploratory Testing” and “Confirmatory Testing”
- Exploratory testing bed for him is the longer one, comfortable, meant for humans. This is because he thinks that this bed is for those with “sapience”.
- Confirmatory testing bed is the smaller one, no cushions or mattress, meant for machines. This is because Michael thinks that this is meant for guests who do “non-sapient” work and should be treated as machines.
- Every time a guest comes, he would check the guest’s clothes and mood at that time and then decide which bed to offer. The guest might have to be stretched or cut to fit the bed because, for Michael, testing is either exploratory or confirmatory.
- Exploration and confirmation don’t individually represent complete testing, so Michael would have to stretch/cut guests to fit into either of the beds.
- He doesn’t welcome multiple guests and doesn’t acknowledge that they can co-exist.
Michael: Mentions Confirmatory testing as “an approach or a mindset to testing”.
As discussed in my previous post, in my opinion, confirmatory testing is not an approach or mindset. Whatever approach to testing you choose, whatever your mindset is: Confirmation is not optional. Confirmation is an essential step in any form of testing.
Michael: Stresses two points about the word confirmatory: that his usage concerns its role in test design and not test results, and that my definition of confirmation relies on the word “previously”.
His discussion is justified based on his version of “confirmatory”.
I don’t confine confirmation to test results alone as pointed out by him. Confirmation applies to assumptions (which might be a part of a written test case/automated test/exploratory test) as well as results analysis. Things become clearer if you start thinking of “Confirmation” rather than “Confirmatory Testing”.
When I say “All Testing is Confirmatory”, it means all types of testing include confirmation. It’s the same way when I (or even the thought leaders behind exploratory testing) say “All Testing is Exploratory”. Exploration and Confirmation are essential parts of all types of testing. It should not be confused with whether it was *you* who explored and came up with a test case. It shouldn’t be confused with whether someone else (human/machine) executed that test and not you. It shouldn’t be confused with whether the confirmation was automated or based on human analysis (or sapience if you like). Exploration and Confirmation are unavoidable steps in all sorts of testing.
Michael: Points out that confirmation is just about showing that a product can work
I disagree. Confirmation is independent of the type of assumption. An assumption could be that a product works or that it does not work. So, confirmation is very much applicable to fault injection, fuzzing, defect-based testing etc. Such tests would confirm that the product cannot work under certain circumstances.
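To make this concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented purely for illustration: the `Assumption` record, the toy `parse_quantity` function standing in for a product, and the `confirm` helper. It only shows that the same confirmation step serves both an assumption that the product works and an assumption that it fails.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Assumption:
    """Hypothetical record of what the tester believes before the decision point."""
    description: str
    expects_failure: bool  # True for fault injection, fuzzing and similar tests


def parse_quantity(text: str) -> int:
    """Toy stand-in for a product (an assumption, not real software)."""
    value = int(text)  # raises ValueError on malformed input
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value


def confirm(assumption: Assumption, action: Callable[[], object]) -> bool:
    """Return True when the observed behaviour matches the assumption."""
    try:
        action()
        failed = False
    except ValueError:
        failed = True
    return failed == assumption.expects_failure


works = Assumption("a well-formed quantity is accepted", expects_failure=False)
breaks = Assumption("a malformed quantity is rejected", expects_failure=True)

print(confirm(works, lambda: parse_quantity("3")))     # True
print(confirm(breaks, lambda: parse_quantity("abc")))  # True
```

In both calls the decision is the same comparison of what was observed against what was assumed; only the flavor of the assumption differs.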
Michael: While differentiating, “exploratory mindset” from “confirmatory mindset” mentions: “In this mindset, we’re typically seeking to find out how the program deals with whatever we throw at it, rather than on demonstrating that it can hit a pitch in the centre of the strike zone.”
In both these scenarios, there would be a decision point. It might be automated or done by human users. In his language, that decision point is “Is there a problem here?” When the tester attempts to answer this question, s/he is “confirming”.
Irrespective of the nature of the mission, whether the knowledge was previously known or gained via exploration by you or someone else, and whether the test cases are documented or undocumented, when it comes to the decision point (the point at which you are forming an opinion about how good the subject of your exploration is), you are doing confirmation.
That’s the reason I do not agree with Michael’s interpretation of the word “previously”. At the moment of decision, everything you know/do before it is “previous” to it.
Michael: “I don’t think of that as confirmation (“establishing the truth or correctness of something previously believed or suspected to be the case”). I think of that as application of an oracle; as a comparison of the observed behaviour with a principle or mechanism that would allow us to recognize a problem”
As per the BBST course developed by Cem Kaner and James Bach: “An oracle is the principle or mechanism by which you recognize a problem”. I guess it was changed from the original text at this link: “http://www.testingeducation.org/k04/OracleExamples.htm” where it is mentioned: “An oracle is a mechanism for determining whether the program has passed or failed a test.” (I’ll come to the discussion of Pass/Fail versus “Is there a problem here?” later.) I am fine with both definitions; I interpret both in the same manner.
I agree with Michael’s point that confirmation involves the application of an oracle. For that matter, even in exploratory testing you would apply oracles, and hence confirmation is a part of “exploratory testing”.
Michael: “If your hypothesis is that the product works—that is, that the product behaves in a manner consistent with the oracle heuristics—then your approach might be described as confirmatory.”
Oracle is defined in a very generic manner. It doesn’t talk of “pre-defined” oracles the way Michael puts it. Rather the definition of oracles is consistent with my interpretation of “previously” – At the moment of decision, everything you know/do before it is “previous” to it. Oracle forms the basis of that decision.
Every form of testing uses oracle(s) and hence confirmation.
Michael: “Yet the confirmatory mindset has been identified in both general psychological literature and testing literature as highly problematic. Klayman and Ha point out in their 1987 paper Confirmation, Disconfirmation, and Information in Hypothesis Testing that “In rule discovery, the positive test strategy leads to the predominant use of positive hypothesis tests, in other words, a tendency to test cases you think will have the target property.””
To me confirmation and disconfirmation are two sides of the same coin. The moment you make an assumption “negative”, confirmation becomes disconfirmation. The paper that Michael quotes treats confirmation as “positive testing strategy”. With all the criticism that the terms “positive testing” and “negative testing” get in various blogs and forums, I am surprised that Michael has such a limiting view of “confirmation”.
I disagree with the interpretation of “confirmation” as conducting only the tests that show the software works. You can conduct a test that confirms that a problem exists. Splitting tests into confirmation and disconfirmation calls for the same level of criticism as “positive” and “negative” testing. (see one of them here by Pradeep Soundararajan: http://testertested.blogspot.com/2007/04/negatives-of-negative-testing.html). Testing is Testing. Testing includes confirming whether assumptions made about the subject of test are correct. Assumptions could very well be about software failure.
Michael: “As I see it, if we’re testing the product (rather than, say, demonstrating it), we’re not looking for confirmation of the idea that it works; we’re seeking to disconfirm the idea that it works. Or, as James Bach might put it, we’re in the illusion demolition business.”
As mentioned earlier, I have a very generic view of confirmation which covers both “positive” and “negative” test strategies.
By the way, James’s opinion of the “testing business” is incomplete. We are not just in the illusion demolition business. Proving that the software works is also a part of our business.
Going by Michael’s discussion of confirmation and disconfirmation, James’s view of the testing business is just disconfirmation. For me it’s just confirmation with different flavors of assumptions.
Michael: Challenges my opinion that a test has to “Pass/Fail”. Emphasizes that a better outlook would be to ask “Is there a problem here?”
I am perfectly fine with Michael’s usage of “Is there a problem here?” instead of “Pass/Fail” (and obviously a lot of testers today have visibly started using this). I am aware of a recent website with a similar theme. It’s a matter of interpretation. I interpret “Pass/Fail” as the subjective opinion of the tester (even if reflected in an automated test). So, if a tester replies to your question “Is there a problem here?”:
- Yes: Means Fail
- No: Means Pass
- Do not know for sure: Needs further investigation
In my test reports, there’s a “remarks” column where I can elaborate on my findings for each such claim. Whether you name the column “Is there a problem here?” and answer Yes/No/Don’t Know, or you name the column Pass/Fail/Under Investigation and give the appropriate value, doesn’t make any difference to the value of the report.
The problem with Michael’s approach to “Is there a problem here?” is that he is discussing it only partially. At first look, I confess, it looks pretty convincing. But the moment one realizes that the question has to be answered, the difference fades away.
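The equivalence of the two column names can be sketched in a few lines of Python. The enum names, the `report_row` helper and the sample test name are all my own assumptions for illustration; they do not describe any particular reporting tool.

```python
from enum import Enum


class Answer(Enum):
    """Possible replies to "Is there a problem here?"."""
    YES = "Yes"
    NO = "No"
    DONT_KNOW = "Don't know"


class Verdict(Enum):
    """The same information expressed in Pass/Fail vocabulary."""
    PASS = "Pass"
    FAIL = "Fail"
    UNDER_INVESTIGATION = "Under Investigation"


# One-to-one mapping: neither vocabulary carries more information than the other.
ANSWER_TO_VERDICT = {
    Answer.YES: Verdict.FAIL,
    Answer.NO: Verdict.PASS,
    Answer.DONT_KNOW: Verdict.UNDER_INVESTIGATION,
}


def report_row(test_name: str, answer: Answer, remarks: str) -> dict:
    """Build one report row (hypothetical layout) carrying both column styles."""
    return {
        "test": test_name,
        "is_there_a_problem_here": answer.value,
        "result": ANSWER_TO_VERDICT[answer].value,
        "remarks": remarks,
    }


row = report_row("login with expired password", Answer.DONT_KNOW,
                 "Error message shown, but account lock state is unclear")
print(row["is_there_a_problem_here"], "->", row["result"])
```

Because the mapping is one-to-one, the remarks column is where the real findings live in either style.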
Michael: “I think Rahul’s notion that a test must pass or fail is confused with the idea that a test should involve the application of a stopping heuristic. For a check, “pass or fail” is essential, since a check relies on the non-sapient application of a decision rule.”
I was actually not able to understand Michael’s note on my confusion around “stopping heuristic”. How’s Pass/Fail related to “stopping heuristic”? I’ll have to check with Michael on what exactly he meant before commenting.
As an extension to my previous note, I don’t differentiate between Pass/Fail and “Is there a problem here?”. Secondly, I don’t agree that confirmation is confined to “non-sapient” testing. Confirmation is a part of every form of testing. Even the most “sapient” exploratory test would have a “sapient” confirmation step depending on the nature of the oracle.
Michael – “I might test a competitive product to discover the features that it offers; such tests don’t have a pass or fail component to them. A tester might be asked to compare a current product with a past version to look for differences between the two.” Michael further relates examples to illustrate that Pass/Fail is limiting.
Just comparing two products in terms of features is not testing unless there is a decision point. It is “Exploration”, which, as I acknowledged earlier, is a part of all forms of testing. Exploration alone does not complete even a single test case. There has to be a confirmation step, where the tester expresses his/her opinion on the compared features (labeling it as Pass/Fail or answering “Is there a problem here?”). At the decision point, it would always be confirmation. The basis of this confirmation, the assumption, should be available; otherwise, the tester wouldn’t be able to express any opinion. Until an opinion is expressed, it is exploration/learning (part of testing) and not testing (complete).
As mentioned in my earlier post on the subject: “Testing should be considered complete for a given interaction only when the result of confirmation in terms of pass or fail is available”. You could extend/replace the last part as: “when the result, in terms of an answer to the question Is there a problem here?, is made available by the tester as a Yes/No.”
Michael: (Thanks for relating a performance testing scenario) Comments that Load testing (hypotheses: “The system does handle 100,000 transactions per minute”) often gets addressed with a confirmatory mindset and the tester “sets up some automation to pump 100,000 transactions per minute” and if the system stays up and exhibits no other problems, he asserts that the test passes. On stress testing he mentions: “have a different information objective here than for the load test, and we have a mission that can’t be handled by a check.”
**He sets up some automation to pump 100,000 transactions per minute**: I see that Michael has an over-simplified view of load testing.
For the scenario mentioned by Michael, confirmation of the assumption “this site would handle 100,000 transactions per minute” or “this site wouldn’t handle 100,000 transactions per minute” is not as straightforward as depicted. A lot of effort goes into designing a workload distribution pattern, user simulation, data collection, plotting, correlation and analysis. “Handling” in this case is a very ambiguous word: just because the site didn’t go down at this load doesn’t mean that it was able to “handle” the load. “Exhibited no other problems”: what does Michael mean by no other problems, and how can finding that “no other problems exist” be non-sapient? Suppose, as a part of the analysis, you look at the memory plot of one of the servers and observe that even after a reduction in load, memory consumption didn’t go down: you have found a memory leak. This “confirms” whether your assumption about the site was right or wrong. Reaching this stage of confirmation takes lots of “sapience”. Needless to say, there are pieces of work here shouting “I am exploration” as well as “I am confirmation” for a load test. In load tests, resources are never analyzed in isolation, so I fail to understand how load testing is non-sapient (Michael’s version of the confirmatory approach).
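The memory observation above can be sketched as a rough first-pass check in Python. The `Sample` record, the `memory_retained_after_load` function, the 50 MB tolerance and every number in the two runs are made up for illustration; a flag from such a check is only a prompt for the human analysis and correlation described above, not a replacement for it.

```python
from typing import NamedTuple, Sequence


class Sample(NamedTuple):
    """One monitoring reading from a server during a load test (hypothetical)."""
    minute: int
    transactions_per_minute: int
    memory_mb: float


def memory_retained_after_load(samples: Sequence[Sample],
                               tolerance_mb: float = 50.0) -> bool:
    """Flag a possible leak: load has dropped, yet memory stays above the baseline."""
    baseline_mb = samples[0].memory_mb
    final = samples[-1]
    peak_load = max(s.transactions_per_minute for s in samples)
    load_reduced = final.transactions_per_minute < peak_load
    return load_reduced and (final.memory_mb - baseline_mb) > tolerance_mb


# Made-up readings: the site never went down in either run.
leaky_run = [
    Sample(0, 5_000, 512.0),
    Sample(10, 100_000, 900.0),
    Sample(20, 100_000, 940.0),
    Sample(30, 5_000, 910.0),   # load is back down, memory is not
]
healthy_run = [
    Sample(0, 5_000, 512.0),
    Sample(10, 100_000, 900.0),
    Sample(20, 100_000, 905.0),
    Sample(30, 5_000, 530.0),   # memory returns close to the baseline
]

print(memory_retained_after_load(leaky_run))    # True
print(memory_retained_after_load(healthy_run))  # False
```

Both runs would “pass” a check that only asks whether the site stayed up at 100,000 transactions per minute, which is exactly why “handling” needs analysis rather than a single yes/no.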
Load testing does not fall into Michael’s definition of “checking” or confirmatory testing; rather, just like any other form of testing, exploration and confirmation are essential parts of it.
If, say, 10 testers are involved in this load testing project and they deal with various stages of it, some of which require less sapience and others more, that does not make load testing non-sapient or confirmatory testing.
Ironically, many a time, stress testing of a website is “non-sapient”. You could blindly keep pumping load and call it a stress test. It wouldn’t need the “sapient” intricate design of a workload distribution pattern, effective state management, analysis of the results on servers that didn’t crash and so on.
Michael: “In the latter test, there is a confirmatory dimension if you’re willing to look hard enough for it. We “confirm” our hypothesis that, given heavy enough stress, the system will exhibit some problem. When we apply an oracle that exposes a failure like a crash, maybe one could say that we “confirm” that the crash is a problem, or that behaviour we consider to be bad is bad. Even in the former test, we could flip the hypothesis, and suggest that we’re seeking to confirm the hypothesis that the program doesn’t support a load of 100,000 transactions per minute.”
I am happy to see how easy it was for Michael to put in a single paragraph what I have been thinking so far. The reason for this is simplicity. It’s very simple to think how confirmation is a part of every form of testing, and then, in fact, never think about it. No need to think and differentiate exploratory testing from confirmatory testing, just acknowledge that exploration and confirmation are essential parts of every form of testing. No need to think whether you need to bring your sapience into the picture or not; have it handy all the time, whether it’s about exploration or confirmation.
Testing = Exploration + Confirmation + …