Test Case Selection and Adequacy Criteria
(c) 2007 Mauro Pezzè & Michal Young Ch 9, slide 1
Learning objectives
• Understand the purpose of defining test adequacy criteria, and their limitations
• Understand basic terminology of test selection and adequacy
• Know some sources of information commonly used to define adequacy criteria
• Understand how test selection and adequacy criteria are used
Adequacy: We can’t get what we want
• What we would like:
– A real way of measuring effective testing
– If the system passes an adequate suite of test cases, then it must be correct (or dependable)
• But that’s impossible!
– Adequacy of test suites, in the sense above, is provably undecidable (because it implies correctness).
• So we will have to settle for weaker indications of adequacy
Practical Adequacy Criteria
• Criteria that identify inadequacies in test suites.
– Examples:
• If the specification describes different treatment in two cases, but the test suite does not check that the two cases are in fact treated differently, we may conclude that the test suite is inadequate to guard against faults in the program logic.
• If no test in the test suite executes a particular program statement, the test suite is inadequate to guard against faults in that statement.
• If a test suite fails to satisfy some criterion, the obligation that has not been satisfied may provide some useful information about improving the test suite.
• If a test suite satisfies all the obligations by all the criteria, we do not know definitively that it is an effective test suite, but we have some evidence of its thoroughness.
Some useful terminology
• Test case: a set of inputs, execution conditions, and a pass/fail criterion.
• Test case specification: a requirement to be satisfied by one or more test cases (such as “two int inputs”).
• Test obligation: a partial test case specification, requiring some property deemed important to thorough testing (such as “check all statements”).
• Test suite: a set of test cases.
• Test or test execution: the activity of executing test cases and evaluating their results.
• Adequacy criterion: a predicate that is true (satisfied) or false (not satisfied) of a ⟨program, test suite⟩ pair (satisfied if every obligation is satisfied by some test).
Example
Test obligations
• General rule: use an empty sequence wherever a sequence appears as an input
– Test obligation: requires the empty string as input.
• Structural selection of test cases:
– Test obligation: (1st clause of IF (line 15) ⇐ true) ∧ (2nd clause ⇐ false)
– Test obligation: (1st clause of IF (line 15) ⇐ false) ∧ (2nd clause ⇐ true)
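The "empty sequence" obligation above can be checked mechanically against the inputs of a test suite. The following is a minimal sketch, assuming test inputs are plain strings; the class name ObligationCheck, the constant EMPTY_INPUT, and the sample inputs are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: a test obligation modelled as a predicate over test inputs.
final class ObligationCheck {

    // Obligation derived from the "empty sequence" rule:
    // some test case must supply the empty string as input.
    static final Predicate<String> EMPTY_INPUT = String::isEmpty;

    // An obligation is satisfied by a suite if at least one test input meets it.
    static boolean isSatisfied(Predicate<String> obligation, List<String> suiteInputs) {
        return suiteInputs.stream().anyMatch(obligation);
    }

    public static void main(String[] args) {
        // Sample inputs are made up; none of them is the empty string.
        List<String> suite = Arrays.asList("abc", "a+b", "%41");
        System.out.println(isSatisfied(EMPTY_INPUT, suite));
    }
}
```

An unsatisfied obligation points directly at the missing test: here, adding a test case whose input is "" would satisfy it.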
Where do test obligations come from?
• Functional (black box, specification-based): from software specifications
– Example: If spec requires robust recovery from power failure, test obligations should include simulated power failure
• White box (structural): from code
– Example: Traverse each program loop one or more times.
• Model-based: from model of system
– Models used in specification or design, or derived from code
– Example: Exercise all transitions in communication protocol model
• Fault-based: from hypothesized faults (common bugs)
– Example: Check for buffer overflow handling (common vulnerability) by testing on very large inputs
Adequacy criteria
• Adequacy criterion = set of test obligations
• A test suite satisfies an adequacy criterion if
– all the tests succeed (pass)
– every test obligation in the criterion is satisfied by at least one of the test cases in the test suite.
– Example: the statement coverage adequacy criterion is satisfied by test suite S for program P if each executable statement in P is executed by at least one test case in S, and the outcome of each test execution was “pass” (that is, statement coverage is 100% and all tests passed).
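The statement coverage example can be written as a predicate over a set of executable statements and a list of test results. This is a sketch under stated assumptions: statements are identified by integer line numbers, and each test result already records which statements it executed. The names StatementCoverageCriterion and TestResult are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: statements are identified by line numbers, and each
// test result records which statements it executed and whether it passed.
final class StatementCoverageCriterion {

    static final class TestResult {
        final Set<Integer> executed;
        final boolean passed;

        TestResult(boolean passed, Integer... executedStatements) {
            this.passed = passed;
            this.executed = new HashSet<>(Arrays.asList(executedStatements));
        }
    }

    // Satisfied iff every test passed and every executable statement
    // was executed by at least one test.
    static boolean isSatisfied(Set<Integer> executableStatements, List<TestResult> suite) {
        Set<Integer> covered = new HashSet<>();
        for (TestResult t : suite) {
            if (!t.passed) {
                return false; // a failing test means the criterion is not satisfied
            }
            covered.addAll(t.executed);
        }
        return covered.containsAll(executableStatements);
    }
}
```

In practice the executed-statement sets would come from a coverage tool rather than being written by hand; the sketch only shows the shape of the predicate.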
Test Coverage Criteria (metrics)
• Are heuristic measures of exhaustiveness (adequacy) of the test suite
• Are widely used in practice
• Depend on the amount and type of information we have about the system
– If we only have the specification and an executable (binary): black box
– If we have the code: white box
• Some coverage criteria are generic, such as stress (workload) coverage criteria
Examples of coverage metrics (1)
• Structural
– the percentage of executed statements; can be refined further:
• for loops, it is usually not enough to execute them once
• hit count for statements with many possible behaviours
– the percentage of executed branches
– the percentage of executed paths on the control flow graph (CFG)
– the percentage of variable assignments executed
• Questions: Is there an order between them? Are some of them stronger than others? Are all of them feasible?
Examples of coverage metrics (2)
• Specification-based:
– the percentage of specified functionality exercised by the tests
Which testing is better – white-box or black-box?
• Thoroughness
• Relevance to correctness
• Amount of work
Black box: how would you test it?
int[] sort(int a, int b, int c) {
    if (a == 271) {
        // special case visible only in the code
        return new int[] { a, Math.max(b, c), Math.min(b, c) };
    } else {
        // bubblesort is assumed to return the three values in sorted order
        return bubblesort(a, b, c);
    }
}
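The sort example suggests why structural information matters: a fault guarded by a == 271 is very unlikely to be hit by inputs chosen from the specification alone. Below is a runnable sketch of that idea. The class name HiddenBranchSort and the helper isAscending are hypothetical, and Arrays.sort stands in for the slide's bubblesort call.

```java
import java.util.Arrays;

// Hypothetical, runnable rendering of the slide's example;
// Arrays.sort stands in for the bubblesort call.
final class HiddenBranchSort {

    static int[] sort(int a, int b, int c) {
        if (a == 271) {
            // Hidden special case: the result is generally not in ascending order.
            return new int[] { a, Math.max(b, c), Math.min(b, c) };
        }
        int[] result = { a, b, c };
        Arrays.sort(result);
        return result;
    }

    // Pass/fail check: is the array in ascending order?
    static boolean isAscending(int[] xs) {
        for (int i = 1; i < xs.length; i++) {
            if (xs[i - 1] > xs[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Typical specification-based inputs: both results are ascending.
        System.out.println(isAscending(sort(3, 1, 2)));
        System.out.println(isAscending(sort(-5, 0, 5)));
        // Input needed to cover the a == 271 branch: the result is not ascending.
        System.out.println(isAscending(sort(271, 1, 2)));
    }
}
```

A branch coverage criterion would flag the a == 271 branch as unexecuted after the first two tests, which is what prompts the third.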
Satisfiability
• Sometimes no test suite can satisfy a criterion of 100% coverage for a given program
– Dead code
– Paths on the CFG that do not exist in the program
– Example: Defensive programming style includes “can’t happen” sanity checks
if (z < 0) {
    throw new LogicError("z must be positive here!");
}
No test suite can satisfy statement coverage for this program (if it’s correct)
Coping with Unsatisfiability
• Approach A: exclude any unsatisfiable obligation from the criterion.
– Example: modify statement coverage to require execution only of statements that can be executed.
– But we can’t know for sure which are executable!
• Approach B: measure the extent to which a test suite approaches an adequacy criterion.
– Example: if a test suite satisfies 85 of 100 obligations, we have reached 85% coverage.
• Terms: An adequacy criterion is either satisfied or not; a coverage measure is the fraction of satisfied obligations
• Approach C: check why the criterion is unsatisfiable (dead code should be removed)
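The coverage measure of Approach B is a simple ratio. The sketch below computes it, and also shows one possible way to combine it with Approach A by dropping obligations already known to be unsatisfiable from the denominator. That combination is an assumption of this sketch, not something the slides prescribe, and the names CoverageMeasure, coverage, and coverageExcluding are hypothetical.

```java
// Hypothetical sketch of a coverage measure as a fraction of satisfied obligations.
final class CoverageMeasure {

    // Approach B: fraction of all obligations that the suite satisfies.
    static double coverage(int satisfied, int total) {
        if (total <= 0 || satisfied < 0 || satisfied > total) {
            throw new IllegalArgumentException("need 0 <= satisfied <= total and total > 0");
        }
        return (double) satisfied / total;
    }

    // Assumption of this sketch: obligations already shown to be unsatisfiable
    // are removed from the denominator before computing the fraction.
    static double coverageExcluding(int satisfied, int total, int knownUnsatisfiable) {
        return coverage(satisfied, total - knownUnsatisfiable);
    }
}
```

With the slide's numbers, coverage(85, 100) gives 0.85. If 15 of the 100 obligations were known to be unsatisfiable, coverageExcluding(85, 100, 15) would give 1.0, though in general that knowledge is not available.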
Comparing Criteria
• Can we distinguish stronger from weaker adequacy criteria?
• Empirical approach: Study the effectiveness of different approaches to testing in industrial practice
– What we really care about, but ...
– Depends on the setting; may not generalize from one organization or project to another
• Analytical approach: Describe conditions under which one adequacy criterion is provably stronger than another
– Stronger = gives stronger guarantees
– One piece of the overall “effectiveness” question
The subsumes relation
Test adequacy criterion A subsumes test adequacy criterion B iff, for every program P, every test suite satisfying A with respect to P also satisfies B with respect to P.
• Example: Exercising all program branches (branch coverage) subsumes exercising all program statements
• A common analytical comparison of closely related criteria
– Useful for working from easier to harder levels of coverage, but not a direct indication of quality
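A small program can show why the subsumes relation between branch and statement coverage does not hold in the other direction. The sketch below hand-instruments a hypothetical abs function; the class SubsumesDemo and its label strings are illustrative only. A single program cannot prove subsumption, which quantifies over every program; it only shows that statement coverage does not imply branch coverage.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical hand-instrumented example: abs() records which statements
// and which branch outcomes each call exercises.
final class SubsumesDemo {

    static final Set<String> statements = new HashSet<>();
    static final Set<String> branches = new HashSet<>();

    static int abs(int x) {
        statements.add("if");
        if (x < 0) {
            branches.add("if:true");
            statements.add("negate");
            x = -x;
        } else {
            // Implicit else in the uninstrumented program: it has no statement of its own.
            branches.add("if:false");
        }
        statements.add("return");
        return x;
    }

    static boolean statementCoverage() {
        return statements.size() == 3; // if, negate, return
    }

    static boolean branchCoverage() {
        return branches.size() == 2; // if:true, if:false
    }

    static void reset() {
        statements.clear();
        branches.clear();
    }
}
```

After reset() and the single call abs(-1), statementCoverage() is true while branchCoverage() is false. Adding abs(1) satisfies both, and any suite that takes both branches necessarily executes every statement of this function.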
Uses of Adequacy Criteria
• Test selection approaches
– Guidance in devising a thorough test suite
• Example: A specification-based criterion may suggest test cases covering representative combinations of values
• Revealing missing tests
– Post hoc analysis: What might I have missed with this test suite?
• Often in combination
– Example: Design test suite from specifications, then use structural criterion (e.g., coverage of all branches) to highlight missed logic
Summary
• Adequacy criteria provide a way to define a notion of “thoroughness” in a test suite
– But they don’t offer guarantees; more like design rules to highlight inadequacy
• Defined in terms of “covering” some information
– Derived from many sources: Specs, code, models, ...
• May be used for selection as well as measurement
Home reading
• Chapter 9 of the book Software Testing and Analysis, by Mauro Pezzè and Michal Young
– Test Case Selection and Adequacy