Structural Testing
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 1
A reminder: software modelling and testing methods
White-box testing vs black-box testing • White box = based on the code
• Black box = not based on the code
Structural testing vs functional testing
• Structural testing = judging test suite thoroughness
based on the structure of the program
• Functional testing – Judging test suite thoroughness
based on the desired functionality (the specification) (c) 2007 Mauro Pezzè & Michal Young Ch 10, slide 2
Learning objectives
• Understand rationale for structural testing
– How structural (code-based or glass-box) testing complements functional (black-box) testing
• Recognize and distinguish basic terms – Adequacy, coverage
• Recognize and distinguish characteristics of common structural criteria
• Understand practical uses and limitations of structural testing
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 3
Structural testing
• Judging test suite thoroughness based on the structure of the program itself
– Also known as white-box, glass-box, or code- based testing
– To distinguish from functional (requirements-based, black-box testing)
– Structural testing is still testing product functionality against its specification. Only the measure of thoroughness has changed.
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 4
Why structural (code-based) testing?
• One way of answering the question What is missing in our test suite?
– If part of a program is not executed by any test case in the suite, faults in that part cannot be exposed
– But whats a part?
• Typically, a control flow element or combination:
• Statements (or CFG nodes), Branches (or CFG edges) • Fragments and combinations: Conditions, paths
• Targets
• Complements functional testing: Another way to recognize cases that are treated differently
– Recall fundamental rationale: Prefer test cases that are treated differently over cases treated the same
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 5
No guarantees
• Executing all control flow elements does not guarantee finding all faults
– Execution of a faulty statement may not always result in a failure
• The state may not be corrupted when the statement is executed with some data values
• Corruptstatemaynotpropagatethroughexecutionto eventually lead to failure
• What is the value of structural coverage?
– Increases confidence in thoroughness of testing
• Removes some obvious inadequacies
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 6
Structural testing complements
functional testing
• Control flow testing includes cases that may not be identified from specifications alone
– Typical case: implementation of a single item of the specification by multiple parts of the program
– Example: hash table collision (invisible in interface spec)
• Test suites that satisfy control flow adequacy criteria could fail in revealing faults that can be caught with functional criteria
– Typical case: missing path faults
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 7
Structural testing in practice
• Create functional test suite first, then measure structural coverage to identify see what is missing
• Interpret unexecuted elements
– may be due to natural differences between specification and implementation
– or may reveal flaws of the software or its development process
• inadequacy of specifications that do not include cases present in the implementation
• coding practice that radically diverges from the specification • inadequate functional test suites
• Attractive because automated
– coverage measurements are convenient progress indicators
– sometimes used as a criterion of completion
• use with caution: does not ensure effective test suites
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 8
criteria that applies to each type of coverage.
Table 1. Types of Structural Coverage
Coverage Criteria
Statement Coverage
Decision Coverage
Condition Coverage
Condition/ Decision Coverage
MC/DC
Multiple Condition Coverage
Every point of entry and exit in the program has been invoked at least once
•
•
•
•
•
Every statement in the program has been invoked at least once
•
Every decision in the program has taken all possible outcomes at least once
•
•
•
•
Every condition in a decision in the program has taken all possible outcomes at least once
•
•
•
•
Every condition in a decision has been shown to independently affect that decision’s outcome
•
•8
Every combination of condition outcomes within a decision has been invoked at least once
•
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 9
Statement testing
• Adequacy criterion: each statement (or node in the CFG) must be executed at least once
• Coverage:
# executed statements
# statements
• Rationale: a fault in a statement can only be revealed by executing the faulty statement
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 10
Statements or blocks?
• Nodes in a control flow graph often represent basic blocks of multiple statements
– Some standards refer to basic block coverage or node coverage
– Difference in granularity, not in concept
• No essential difference
– 100% node coverage <-> 100% statement coverage • but levels will differ below 100%
– A test case that improves one will improve the other • though not by the same amount, in general
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 11
Example
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 12
Example
int cgi_decode (char *encoded , char *decoded )
A
B
char c ; C c = *eptr ;
if (c == ‘+’) {
False elseif (c == ‘%’) {
{char *eptr = encoded ; char *dptr = decoded ; int ok = 0;
while (*eptr ) { False True
True
D
True
*dptr = ‘ ‘; }
G
E
False
else F *dptr = *eptr;
}
int digit _high = Hex _Values [*(++eptr )]; int digit _low = Hex_Values[*(++eptr)]; if(digit_high ==-1||digit_low==-1){
False
True
H ok=1; I }
L
*dptr = ‘\0’;
return ok ; }
M
else{
*dptr = 16 * digit_high + digit _low;
}
+ +dptr; + +eptr; }
T0 =
{, test, test+case%1Dadequacy} 17/18 = 94% Stmt Cov.
T1 =
{adequate+test%0Dexecuti on%7U}
18/18 = 100% Stmt Cov. T2 =
{%3D, %A, a+b, test}
18/18 = 100% Stmt Cov.
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 13
Coverage is not size
• Coverage does not depend on the number of test cases
– T0 , T1 : T1 >coverage T0 T1
• Minimizing test suite size is seldom the goal
– small test cases make failure diagnosis easier
– a failing test case in T2 gives more information for fault localization than a failing test case in T1
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 14
All statements can miss some cases
• Complete statement coverage may not imply
executing all branches in a program
• Example:
– Suppose block F were missing
– Statement adequacy would not require false branch from D to L
T3 =
{, +%0D+%4J} 100% Stmt Cov.
No false branch from D
int cgi_decode (char *encoded , char *decoded )
{char *eptr = encoded ; char *dptr = decoded ; int ok = 0;
A
while (*eptr ) {
B
False
False
True
char c ;
c = *eptr ;
if (c == ‘+’) {
C
False
True
elseif (c == ‘%’) {
D
*dptr = ‘ ‘; }
E
True
False
True
else {
*dptr = *eptr; }
F
int digit_high = Hex_Values [*(++eptr)]; int digit_low = Hex_Values [*(++eptr)]; if (digit_high == -1 || digit_low == -1) {
G
else {
*dptr = 16 * digit_high + digit _low;
}
H
ok = 1; }
I
*dptr = ‘\0’; return ok ; }
M
+ +dptr;
+ +eptr; }
L
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 15
are both conditions.
e e
s
t
s a
s w
during software testing. Achieving statement coverag the context of DO-178B, reachable based on test cas
Illustration of missing cases
coverage is considered a weak criterion because it i
for statement coverage
Consider the following code segment (ref. 12):
if (x>1)and(y=0)thenz:=z/x;endif; if (z=2)or(y>1)thenz:=z+1;endif;
By choosing x = 2, y = 0, and z = 4 as input to this
If there is an OR instead of AND in the first statement, statoenmceen.t cHoovewraegverw,iillfnaont doirsciosvceordtheids pbryobmleimstake in the firs
detect a problem. This makes sense because analysis coverage criterion. According to Myers (ref. 10), “
Analysis of logical expressions is not a part of statement
generally considered useless.” At best, statement cover
coverage criteria!
The remaining measures in Table 1 consider variou (cT) 2o007hMiagurho Pleizgzèh&tMidchiafl fYoeunrgences between these measCuh 1r2e, sli,dew16e
Branch testing
• Adequacy criterion: each branch (edge in the CFG) must be executed at least once
• Coverage:
# executed branches
# branches
T3 = {, +%0D+%4J}
100% Stmt Cov. 88% Branch Cov. (7/8 branches)
T2 = {%3D, %A, a+b, test}
100% Stmt Cov. 100% Branch Cov. (8/8 branches)
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 17
Statements vs branches
• Traversing all edges of a graph causes all nodes to be visited
– So test suites that satisfy the branch adequacy criterion for a program P also satisfy the statement adequacy criterion for the same program
• The converse is not true (see T3)
– A statement-adequate (or node-adequate) test suite may not be branch-adequate (edge-adequate)
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 18
All branches can still miss conditions
• Sample fault: missing operator (negation)
digit_high == 1 || digit_low == -1
• Branchadequacycriterioncanbesatisfiedbyvaryingonly digit_low
– Thefaultysub-expressionmightneverdeterminethe result
– We might never really test the faulty condition, even though we tested both outcomes of the branch
• Called“decisiontesting”inthetable:eventhoughittestsT and F outcomes, can still miss faults:
– for multicondition (A or B), TF and FF will cover both branches, but the effect of B is not tested
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 19
Condition testing
• Branch coverage exposes faults in how a computation has been decomposed into cases
– intuitively attractive: check the programmers case analysis
– but only roughly: groups cases with the same outcome
• Condition coverage considers case analysis in more detail
– also individual conditions in a compound Boolean expression
• e.g., both parts of digit_high == 1 || digit_low == -1
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 20
Basic condition testing
• Adequacy criterion: each basic condition must be executed at least once
• Coverage of basic conditions:
# truth values taken by all basic conditions 2 * # basic conditions
• Formulticondition(AorB),TFandFTwillcoverbothoutcomes of conditions, but not all branches
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 21
Basic conditions vs branches
• Basic condition adequacy criterion can be satisfied without satisfying branch coverage
T4 = {first+test%9Ktest%K9}
satisfies basic condition adequacy
does not satisfy branch condition adequacy
Branch and basic condition are not comparable (neither implies the other)
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 22
Covering branches and conditions: condition/decision coverage
• Branched and conditions adequacy: – cover all conditions and all decisions
• Compound condition adequacy:
– Cover all possible evaluations of compound conditions
– Cover all branches of a decision tree
• Formulticondition(AandB), TT and FF covers condition/decision
• stillindistinguishablefrom (A or B)
true digit_low == 1
false FALSE
digit_high == -1
true TRUE
false FALSE
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 23
What do we do?
Multiple condition coverage requires to check all combinations of values:
exhaustive, but exponential in size of the condition
(((a || b) && c) || d) && e
Test abcde Case
(1) T—T—T (2) FTT—T (3) T—FTT (4) FTFTT (5) FF—TT (6) T—T—F (7) FTT—F (8) T—FTF (9) FTFTF (10) FF—TF (11) T—FF— (12) FTFF— (13) FF—F—
all possible assignments of values are checked
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 24
Modified condition/decision (MC/DC)
• Motivation: Effectively test important combinations of conditions, without exponential blowup in test suite size
– Important combinations means: Each basic condition shown to independently affect the outcome of each decision
• Requires:
– For each basic condition C, two test cases,
– values of all evaluated conditions except C are the same
– compound condition as a whole evaluates to true for one and false for the other
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 25
How many test cases do we need for MC/DC?
• For(AorBorC): – TFF
– FTF – FFT – FFF
• In general, every condition should be checked for its effect on the decision: N+1 cases (where N is the number of conditions)
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 26
MC/DC: linear complexity
• N+1 test cases for N basic conditions
(((a || b) && c) || d) && e
Test a b c Case
(1) true — true
(2) false true true
(3) true — false
(6) true — true (11) true — false (13) false false —
d e
— true — true true true — false false — false —
outcome
true true true false false false
• Underlined values independently affect the output of the decision
• Required by the RTCA/DO-178B standard
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 27
Comments on MC/DC
• MC/DC is
– basic condition coverage (C)
– branch coverage (DC)
– plus one additional condition (M):
every condition must independently affect the decisions output
• It is subsumed by compound conditions and subsumes all other criteria discussed so far
– stronger than statement and branch coverage
• A good balance of thoroughness and test size (and therefore widely used)
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 28
When is it used?
• In software where implementation is much more complex than the specification
• Examples: avionics and space navigation
space shuttle docking to the space station
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 29
Paths? (Beyond individual branches)
•
Should we explore sequences of branches (paths) in the control flow?
Many more paths than branches
– A pragmatic compromise will be needed
•
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 30
Path adequacy
• Decision and condition adequacy criteria consider individual program decisions
• Path testing focuses consider combinations of decisions along paths
• Adequacy criterion: each path must be executed at least once
• Coverage:
# executed paths
# paths
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 31
Practical path coverage criteria
• The number of paths in a program with loops is unbounded
– the simple criterion is usually impossible to satisfy
• For a feasible criterion: Partition infinite set of paths into a finite number of classes
• Useful criteria can be obtained by limiting – the number of traversals of loops
– the length of the paths to be traversed – the dependencies among selected paths
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 32
Boundary interior adequacy for cgi-decode
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 33
Limitations of boundary interior adequacy
• The number of paths can still grow exponentially
if (a) { • The subpaths through this control
S1; }
if (b) { S2;
}
if (c) {
S3; }
…
if (x) {
Sn; }
flow can include or exclude each of the statements Si, so that in total N branches result in 2N paths that must be traversed
• Choosing input data to force execution of one particular path may be very difficult, or even impossible if the conditions are not independent
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 34
Loop boundary adequacy
• Criterion: A test suite satisfies the loop boundary adequacy criterion iff for every loop:
– In at least one test case, the loop body is iterated zero times
– In at least one test case, the loop body is iterated once
– In at least one test case, the loop body is iterated more than once
• Corresponds to the cases that would be considered in a formal correctness proof for the loop
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 35
Cyclomatic adequacy
• Cyclomatic number:
number of independent paths in the CFG
– A path is representable as a bit vector, where each component of the vector represents an edge
– Dependence is ordinary linear dependence between (bit) vectors
• If e = #edges, n = #nodes, c = #connected components of a graph, it is:
– e-n+2cforanarbitrarygraph – e-n+2foraCFG
• Cyclomatic coverage counts the number of independent paths that have been exercised, relative to cyclomatic complexity
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 36
Target coverage
• Coverage targets are elements of the program/design defined by the designer that need to be covered (hit) by the test suite
• Usually, coverage targets point to especially suspicious elements
– the most complex
– the most critical
– the elements that were recently changed – elements that are likely to contain bugs
• May be used in conjunction with any other coverage metric
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 37
Test targets
int foo(int a, int b, int c){
if (a > b)
c = 1; /* target 1 */
else {
if (b > c) c = 2;
else call function goo(b); /* target 2 */
}
if (c == 3)
call function moo; if (b == 2)
b = a; return a;
}
/* target 3 */ /* target 4 */
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 38
Test targets
int foo(int a, int b, int c){
if (a > b)
c = 1; /* target 1 – covered by <2,1,1>
else {
if (b > c) c = 2;
else call function goo(b); /* target 2 – covered by <2,1,1> */
}
if (c == 3)
call function moo; if (b == 2)
b = a; return a;
}
/* target 3 – covered by <2,1,3> /* target 4 – covered by <2,2,2>
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 39
Hitting targets, adequacy, and coverage
• A test hits a target if it executes this target
• A test t1 is closer to hitting a target than a test t2 if t1 is closer than t2 to another test t3 that hits the target
– <1,1,1> does not hit target 1
– <0,1,1> also does not hit target 1
– <1,1,1> is closer to hitting target 1 than <0,1,1>
• Adequacy: all targets are hit
• Coverage:
# hit targets
# targets
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 40
Towards procedure call testing
• The criteria considered to this point measure coverage of control flow within individual procedures.
– not well suited to integration or system testing
• Choose a coverage granularity commensurate with the granularity of testing
– if unit testing has been effective, then faults that remain to be found in integration testing will be primarily interface faults, and testing effort should focus on interfaces between units rather than their internal details
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 41
Procedure call testing • Procedure entry and exit testing
– procedure may have multiple entry points (e.g., Fortran) and multiple exit points
• Call coverage
– The same entry point may be called from many points
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 42
Subsumption relation
Path Testing
Boundary interior testing
Compound condition testing
MC/DC testing
Branch and condition testing
Cyclomatic testing
Loop boundary testing
LCSAJ testing
Branch testing
Statement testing
Basic condition testing
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 43
PRACTICAL CRITERIA THEORETICAL CRITERIA
Satisfying structural criteria
• Sometimes criteria may not be satisfiable
– The criterion requires execution of
• statements that cannot be executed as a result of
– defensive programming
– code reuse (reusing code that is more general than strictly required for the application)
• conditions that cannot be satisfied as a result of – interdependent conditions
• paths that cannot be executed as a result of – interdependent decisions
(c) 2007 Mauro Pezzè & Michal Young
Ch 12, slide 44
Satisfying structural criteria
• Large amounts of fossil code may indicate serious maintainability problems
– But some unreachable code is common even in well- designed, well-maintained systems
• Solutions:
– make allowances by setting a coverage goal less than 100%
– require justification of elements left uncovered
• RTCA-DO-178B and EUROCAE ED-12B for modified MC/DC
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 45
Summary
• We defined a number of adequacy criteria – NOT test design techniques!
• Different criteria address different classes of errors
• Full coverage is usually unattainable
– Remember that attainability is an undecidable problem!
• …and when attainable, inversion is usually hard
– How do I find program inputs allowing to cover something
buried deeply in the CFG?
– Automated support (e.g., symbolic execution) may be necessary
• Therefore, rather than requiring full adequacy, the degree of adequacy of a test suite is estimated by coverage measures
– May drive test improvement
(c) 2007 Mauro Pezzè & Michal Young Ch 12, slide 46
Home reading
• Chapter 12 of the book Software Testing and Analysis, by Mauro Pezze and Michal Young
– Structural testing
(c) 2007 Mauro Pezzè & Michal Young Ch 1, slide 47