Peter Farrell-Vinay posted the question “Does exploratory testing mean we’ve stopped caring about test coverage?” on LinkedIn here: http://www.linkedin.com/groupItem?view=&gid=690977&type=member&item=88040261&qid=75dd65c0-9736-4ac5-9338-eb38766e4c46&trk=group_most_recent_rich-0-b-ttl&goback=.gde_690977_member_88040261.gmr_690977
I’ve replied on that forum, but I wanted to restructure some of the various thoughts expressed there to make a different case.
Do exploratory testers care about coverage? If they don’t think and care about coverage, they absolutely should.
All test design is based on models
I’ve said this before: http://testaxioms.com/?q=node/11
Testing is a process in which we create mental models of the environment, the system, human nature, and the tests themselves. Test design is the process by which we select, from the infinite number possible, the tests that we believe will be most valuable to us and our stakeholders. Our test model helps us to select tests in a systematic way. Test models are fundamental to testing – however performed. A test model might be a checklist or set of criteria; it could be a diagram derived from a design document or an analysis of narrative text. Many test models are never committed to paper – they can be mental models constructed specifically to guide the tester whilst they explore the system under test.
From the tester’s point of view, a model helps us to recognise particular aspects of the system that could be the object of a test. The model focuses attention on areas of the system that are of interest. But models almost always over-simplify the situation.
All models are wrong, some models are useful
This maxim is attributed to the statistician George Box. But it absolutely applies in our situation.
Here’s the rub with all models – an example will help. A state diagram is a model. Useful, but flawed and incomplete. It is incomplete because a real system has billions of states, not the three defined in a design document. (And the design might have a lot or little in common with the delivered system itself, by the way.) So the model in the document is idealised, partial and incomplete – it is not reality. The formality of a model therefore does not equate to test accuracy or completeness in any way. All coverage is measured with respect to the model used to derive testable items (in this case it could be state transitions). Coverage of the test items derived from the model doesn’t usually (hardly ever?) indicate coverage of the system or technology.
The skill of testing isn’t mechanically following the model to derive testable items. The skill of testing is in the choice of the considered mix of various models. The choice of models ultimately determines the quality of the testing. The rest is clerical work and (most important) observation.
I’ve argued elsewhere that not enough attention is paid to the selection of test models. http://gerrardconsulting.com/index.php?q=node/495
Testing needs a test coverage model or models
I’ve said this before too: http://testaxioms.com/?q=node/14
Test models allow us to identify coverage items. A coverage item is something we want to exercise in our tests. When we have planned or executed tests that cover items identified by our model we can quantify the coverage achieved as a proportion of all items on the model – as a percentage.
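As a minimal sketch of that arithmetic, assuming the coverage items can be written down as a simple set of labels, the calculation might look like this. The function name and the transition labels are hypothetical, invented purely for illustration.

```python
def coverage_percent(model_items, covered_items):
    """Percentage of the model's coverage items exercised by at least one test."""
    model_items = set(model_items)
    if not model_items:
        raise ValueError("the model defines no coverage items")
    # Only items that appear on the model count towards coverage.
    covered = model_items & set(covered_items)
    return 100.0 * len(covered) / len(model_items)


# Hypothetical coverage items: transitions on a small state model.
model = {"idle->running", "running->paused", "paused->running", "running->idle"}
tested = {"idle->running", "running->idle", "running->paused"}

print(f"{coverage_percent(model, tested):.0f}% of transitions covered")
```

Note that the figure says nothing about the system itself; it is only a proportion of the items that this particular model happens to identify.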
Numeric test coverage targets are sometimes defined in standards and plans and to be compliant these targets must be met. Identifiable aspects of our test model, such as paths through workflows, transitions in state models or branches in software code can be used as the coverage items.
Coverage measurement can help to make testing more ‘manageable’. If we don’t have a notion of coverage, we may not be able to answer questions like, ‘what has been tested?’, ‘what has not been tested?’, ‘have we finished yet?’, ‘how many tests remain?’ This is particularly awkward for a test manager.
Test models and coverage measures can be used to define quantitative or qualitative targets for test design and execution. To varying degrees, we can use such targets to plan and estimate. We can also measure progress and infer the thoroughness or completeness of the testing we have planned or executed. But we need to be very careful with any quantitative coverage measures or percentages we use.
Formal and Informal Models
Models and coverage items need not necessarily be defined by industry standards. Any model that allows coverage items to be identified can be used.
My definition is this: a Formal Model allows coverage items to be reliably identified on the model. A quantitative coverage measure can therefore be defined and used as a measurable target (if you wish).
Informal Models tend to be checklists or criteria used to brainstorm a list of coverage items or to trigger ideas for testing. These lists or criteria might be pre-defined or prepared as part of a test plan or adopted in an exploratory test session.
Informal models differ from formal models in that the derivation of the model itself depends on the experience, intuition and imagination of the practitioner using it, so coverage against these models can never be quantified meaningfully. We can never know what ‘complete coverage’ means with respect to these models.
Needless to say, tests derived from an informal model are just as valid as tests derived from a formal model if they increase our knowledge of the behaviour or capability of our system.
Risk-based testing is an informal model approach – there is no way to limit the number of risks that can be identified. Is that bad? Of course not. It’s just that we can’t define a numeric coverage target (other than ‘do some tests associated with every serious risk’). Risk identification and assessment are subjective. Different people would come up with different risks, described differently, with different probabilities and consequences. Different risks would be included or omitted; some risks would be split into micro-risks, others not. Because not all risks are equal, a percentage coverage figure is meaningless. The formality associated with risk-based approaches relates mostly to the level of ceremony and documentation and not the actual technique of identifying and assessing risks. It’s still an informal technique.
In contrast, two testers given the same state transition diagram or state table and asked to derive, say, the state transitions to be covered by tests, would come up with the same list of transitions. Assuming a standard presentation for state diagrams can be agreed, you have an objective model (albeit flawed, as already suggested).
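To illustrate why the derivation is objective, here is a rough sketch that assumes the state table is held as a mapping from (state, event) to next state. The table contents and the helper names are hypothetical. Anyone running the same derivation over the same table gets the same list of coverage items.

```python
# Hypothetical state table: (current state, event) -> next state.
STATE_TABLE = {
    ("idle", "start"): "running",
    ("running", "pause"): "paused",
    ("paused", "resume"): "running",
    ("running", "stop"): "idle",
    ("paused", "stop"): "idle",
}


def transitions(state_table):
    """Return the sorted list of (source, event, target) coverage items."""
    return sorted((src, event, dst) for (src, event), dst in state_table.items())


def uncovered(state_table, test_paths):
    """Transitions on the model that no test path exercises.

    Each test path is a start state plus a sequence of events. An event that
    is not defined for the current state raises KeyError.
    """
    remaining = set(transitions(state_table))
    for start, events in test_paths:
        state = start
        for event in events:
            target = state_table[(state, event)]
            remaining.discard((state, event, target))
            state = target
    return sorted(remaining)


# One hypothetical test that walks four of the five transitions.
tests = [("idle", ["start", "pause", "resume", "stop"])]
print(uncovered(STATE_TABLE, tests))
```

The remaining transition is a gap with respect to this table only; states and transitions that the table never mentions stay invisible to the measure.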
Coverage does not equal quality
A coverage measure (based on a formal model) may be calculated objectively, but there is no formula or law that says X coverage means Y quality or Z confidence. All coverage measures give only indirect, qualitative, subjective insights into the thoroughness or completeness of our testing. There is no meaningful relationship between coverage and the quality of systems.
So, to return to Peter’s original question “Does exploratory testing mean we’ve stopped caring about test coverage?” Certainly not, if the tester is competent.
Is the value of testing less because informal test/coverage models are used rather than formal ones? No one can say – there is no data to support that assertion.
One ‘test’ of whether ANY tester is competent is to ask about their models and coverage. Most testing is performed by people who do not understand the concept of models because they were never made aware of them.
The formal/informal aspects of test models and coverage are not a criterion for deciding whether planned/documented or exploratory testing is best, because planned testing can use informal models and ET can use formal models.
Ad-Hoc Test Models
Some models can be ad-hoc – here and now, for a specific purpose – invented by the tester just before or even during testing. If, while testing, a tester sees an opportunity to explore a particular aspect of a system, they might use their experience to think up some interesting situations on-the-fly. Nothing may be written down at the time, but the tester is using a mental model to generate tests and speculate how the system should behave.
When a tester sees a new screen for the first time, they might look at the fields on screen (model: test all the data fields), they might focus on the validation of numeric fields (model: boundary values), they might look at the interactions between checkboxes and their impact on other fields’ visibility or outcomes (model: decision table?) or look at ways the screen could fail e.g. extreme values, unusual combinations etc. (model: failure mode or risk-based). Whatever. There are hundreds of potential models that can be imagined for every feature of a system.
The very limited number of test models associated with textual requirements are just that – limited – to the common ones taught in certification courses. Are they the best models? Who knows? There is very little evidence to say they are. Are they formal? Yes, in so far as objective definitions of the models (often called test techniques) exist. Is formal better than informal/ad-hoc? That is a cultural or value-based decision – there’s little or no evidence other than anecdotal to justify the choice.
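Taking just one of those models, boundary values, a small sketch shows how a model turns a field definition into candidate tests. It assumes an inclusive numeric range and the common 'just below, on, just above' convention; the function name and the example field are hypothetical.

```python
def boundary_values(minimum, maximum, step=1):
    """Candidate test values around the edges of an inclusive numeric range."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    candidates = [
        minimum - step, minimum, minimum + step,   # around the lower boundary
        maximum - step, maximum, maximum + step,   # around the upper boundary
    ]
    # Remove duplicates (narrow ranges overlap) while keeping the order.
    return list(dict.fromkeys(candidates))


# Hypothetical numeric field that accepts values from 18 to 65 inclusive.
print(boundary_values(18, 65))
```

The same screen viewed through a decision table or a failure-mode model would yield a quite different list, which is the point: the choice of model drives the tests.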
ET exists partly to allow testers to do much more testing than that limited by the common models. ET might be the only testing used in some contexts or it might be the ‘extra testing on the top’ of more formally planned, documented testing. That’s a choice made by the project.
Certification Promotes Testing as a Clerical Activity
This ‘clerical’ view of testing is what we have become accustomed to (partly because of certification). The handed-down or ‘received wisdom’ of off-the-shelf models is useful in that they are accessible, easy to teach and mostly formal (in my definition). There were, when I last looked, 60+ different code coverage models possible in plain vanilla programming languages. My guess is there are dozens associated with narrative text analysis, dozens associated with usage patterns, dozens associated with integration and messaging strategies. And for every formal design model in, say, UML, there are probably 3-5 associated test models – for EACH. Certified courses give us five or six models. Most testers actually use one or two (or zero).
Are the stock techniques efficient/effective? Compared to what? They are taught mostly as a way of preparing documentation to be used as test scripts. They aren’t taught as test models having more or less effectiveness or value for money to be selected and managed. They are taught as clerical procedures. The problem with real requirements is you need half a dozen different models on each page, on each paragraph even. Few people are trained/skilled enough to prepare well-designed, documented tests. When people talk about requirements coverage it’s as sophisticated as saying we have a test that someone thinks relates to something mentioned in that requirement. Hey – that’s subjective again – subjective, not very effective and also very expensive.
With Freedom of Model Choice Comes Responsibility
A key aspect of exploratory testing is that you should not be constrained but should be allowed and encouraged to choose models that align with the task in hand so that they are more direct, appropriate and relevant. But the ‘freedom of model choice’ applies to all testing, not just exploratory, because at one level, all testing is exploratory (http://gerrardconsulting.com/index.php?q=node/588).
In future, testers need to be granted the freedom of choice of test models but for this to work, testers must hone their modelling skills. With freedom comes responsibility. Given freedom to choose, testers need to make informed choices of model that are relevant to the goals of their testing stakeholders. It seems to me that the testers who will come through the turbulent times ahead are those who step up to that responsibility.
Sections of the text in this post are lifted from the pocketbook http://testers-pocketbook.com