Functional testing and Random testing
(c) 2007 Mauro Pezzè & Michal Young Ch 10, slide 1
We are starting to study ways to create good test suites
White-box testing vs black-box testing
• White box = based on the code
• Black box = not based on the code
Structural testing vs functional testing
• Structural testing = judging test suite thoroughness
based on the structure of the program
• Functional testing = judging test suite thoroughness
based on the desired functionality (the specification)
Ways to create good test suites – cont.
Systematic testing (tailored to this particular software):
• Structural testing is white-box testing
• Functional testing is black-box testing
General types of testing (applicable regardless of the software) – for example:
• Random testing
• Stress testing
Learning objectives
• Understand the rationale for systematic (non-random) selection of test cases
– Understand the basic concept of partition testing and its underlying assumptions
• Understand why functional test selection is a primary, base-line technique
– Why we expect a specification-based partition to help select valuable test cases
• Distinguish functional testing from other systematic testing techniques
Functional testing
• Functional testing: Deriving test cases from program specifications
• Functional refers to the source of information used in test case design, not to what is tested
• Also known as:
– specification-based testing (from specifications)
– black-box testing (no view of the code)
• Functional specification = description of intended program behavior
Systematic vs Random Testing
• Random (uniform):
– Pick possible inputs uniformly
– Avoids designer bias
• A real problem: The test designer can make the same logical mistakes and bad assumptions as the program designer (especially if they are the same person)
– But treats all inputs as equally valuable
• Systematic (non-uniform):
– Try to select inputs that are especially valuable
– Usually by choosing representatives of classes that are apt to fail often or not at all
• Functional testing is systematic testing
Random testing
• We define some distribution over the set of possible inputs
• Test cases are generated based on this distribution
• The resulting test suite is not related to a particular implementation at all (no designer bias)
• A good strategy when we don’t have anything better
• Example: a thermostat
– black box
– should work in the range of temperatures
– our distribution can favour extreme temperatures or frequently occurring temperatures
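The thermostat idea can be sketched as a small input generator. Everything here (the temperature range, the weights, the class and method names) is an illustrative assumption, not part of the slides:

```java
import java.util.Random;

// Hypothetical sketch: random test-input generation for a thermostat
// whose spec says it must work between MIN_TEMP and MAX_TEMP degrees.
// The distribution favours the extremes of the range, as the slide suggests.
public class ThermostatRandomTest {
    static final double MIN_TEMP = -40.0, MAX_TEMP = 60.0;

    // Draw a temperature: 25% near the low extreme, 25% near the high
    // extreme, 50% uniform over the whole range.
    static double nextInput(Random rnd) {
        double u = rnd.nextDouble();
        if (u < 0.25) return MIN_TEMP + rnd.nextDouble() * 2.0;      // near low end
        if (u < 0.50) return MAX_TEMP - rnd.nextDouble() * 2.0;      // near high end
        return MIN_TEMP + rnd.nextDouble() * (MAX_TEMP - MIN_TEMP);  // uniform
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);   // fixed seed: reproducible test suite
        for (int i = 0; i < 100; i++) {
            double t = nextInput(rnd);
            // here we would drive the (black-box) thermostat with input t
            // and check its observable behaviour against the spec
            System.out.printf("test input %d: %.1f%n", i, t);
        }
    }
}
```

Note that the generator never looks at the thermostat's implementation: only the specified operating range shapes the distribution.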
What is Random Testing?
• All of the test inputs are generated randomly (often using a tool).
• It is a black-box testing technique.
Why do we use RT?
• Advantage of easily estimating software reliability from test outcomes. Test inputs are randomly generated according to an operational profile, and failure times are recorded
• The data obtained from random testing can then be used to estimate reliability.
– Other testing methods cannot be used in this way to estimate software reliability
• Use of random test inputs may save some of the time and effort that more thoughtful test input selection methods require.
How do we use RT?
Random Testing as a four-step procedure:
1. The input domain is identified
2. Test inputs are selected independently from the domain
3. The system under test is executed on these inputs. These inputs constitute a random test set.
4. The results are compared to the system specification. The test is a failure if any input leads to incorrect results; otherwise it is a success.
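The four steps above can be sketched concretely. The system under test here (a hypothetical `mySqrtFloor`) and all names are illustrative assumptions; only the four-step structure comes from the slide:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the four-step random-testing procedure,
// using a toy system under test: mySqrtFloor(n), specified to
// return floor(sqrt(n)) for n >= 0.
public class RandomTestProcedure {
    // the system under test
    static int mySqrtFloor(int n) { return (int) Math.floor(Math.sqrt(n)); }

    // an oracle derived from the specification: r is correct for n
    // iff r >= 0, r^2 <= n, and (r+1)^2 > n
    static boolean meetsSpec(int n, int r) {
        return r >= 0 && (long) r * r <= n && (long) (r + 1) * (r + 1) > n;
    }

    public static void main(String[] args) {
        // Step 1: identify the input domain (here: 0 .. 1_000_000)
        int lo = 0, hi = 1_000_000;

        // Step 2: select test inputs independently from the domain
        Random rnd = new Random(7);
        List<Integer> randomTestSet = new ArrayList<>();
        for (int i = 0; i < 500; i++) randomTestSet.add(lo + rnd.nextInt(hi - lo + 1));

        // Steps 3 and 4: execute the SUT on each input and compare
        // against the specification; the test fails if any input
        // leads to an incorrect result
        boolean success = true;
        for (int n : randomTestSet) {
            if (!meetsSpec(n, mySqrtFloor(n))) success = false;
        }
        System.out.println(success ? "test success" : "test failure");
    }
}
```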
Main types of random testing techniques:
1. Random input data generation
2. Random sequence of data inputs (sometimes called stochastic testing)
3. Random data selection from existing database
It is possible to combine all the above testing techniques.
When do we use RT?
• If we have a lot of time and a goal of increasing reliability
• If we don’t have enough time for testing but we must do something.
– We must consider the time needed to write a random test generator vs the time to write a set of directed tests (or generators).
• To ensure that our tests are sufficiently random and that they cover the specification.
Example:
• An input with a valid domain of integers 1 to 100.
• The tester would randomly, or unsystematically, select values from within that domain;
– for example, the values 55, 24, 3
• Are the three values adequate to show that the module meets its specification when the tests are run? Should additional or fewer values be used to make the most effective use of resources?
• Are there any input values, other than those selected, more likely to reveal defects? For example, should positive integers at the beginning or end of the domain be specifically selected as inputs?
• Should any values outside the valid domain be used as test inputs? For example, should test data include floating-point values, negative values, or integer values greater than 100?
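One common answer to the questions above is to mix random picks with deliberately chosen boundary and out-of-domain values. The sketch below is an assumption about how such a suite might be assembled, not a prescribed technique from the slides:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch for the slide's example: for a valid domain of integers
// 1..100, combine a few unsystematic random picks (like 55, 24, 3)
// with boundary values and values just outside the domain.
// The particular boundary/invalid values chosen are assumptions.
public class DomainSelection {
    static List<Integer> chooseInputs(long seed) {
        Random rnd = new Random(seed);
        List<Integer> inputs = new ArrayList<>();
        // unsystematic picks from within the valid domain
        for (int i = 0; i < 3; i++) inputs.add(1 + rnd.nextInt(100));
        // boundary values at the beginning and end of the domain
        inputs.add(1);
        inputs.add(100);
        // values just outside the valid domain (error cases)
        inputs.add(0);
        inputs.add(101);
        inputs.add(-5);
        return inputs;
    }
}
```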
RT Tools
• Jcheck
• Randoop
• Yeti-TEST
When not to do random testing?
• Non-uniform distribution of faults
• Example: Java class “roots” applies quadratic equation
Java example with a bug
when both q and a are 0, the program crashes
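The slide's code is not reproduced here; the following minimal sketch (not the book's actual "roots" class) shows the essence of the bug: blindly applying the quadratic formula divides by 2a, so for a = 0 the Java result degenerates to NaN or infinity instead of the root of the resulting linear equation.

```java
// Minimal sketch (not the book's exact class) of a buggy quadratic
// solver: it applies the quadratic formula unconditionally, so when
// a == 0 it divides by zero. With doubles this yields NaN/Infinity
// rather than an exception, but the result is still wrong.
public class Roots {
    // returns the two candidate roots of a*x^2 + b*x + c = 0
    static double[] solve(double a, double b, double c) {
        double disc = b * b - 4 * a * c;
        double sq = Math.sqrt(disc);    // NaN if disc < 0 (also unhandled)
        return new double[] {
            (-b + sq) / (2 * a),        // division by zero when a == 0
            (-b - sq) / (2 * a)
        };
    }
}
```

A random sampler over pairs of doubles is vanishingly unlikely to hit a = 0.0 exactly, which is why this fault hides from uniform random testing.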
When not to do random testing?
• Non-uniform distribution of faults
• Example: Java class “roots” applies quadratic equation
Incomplete implementation logic: the program does not properly handle the case in which b² − 4ac = 0 and a = 0
Failing values are sparse in the input space — needles in a very big haystack. Random sampling is unlikely to choose a=0.0 and b=0.0
What is the purpose of testing? To find faults!
• To estimate the proportion of faults to correct cases, sample randomly
– Reliability estimation requires unbiased samples for valid statistics.
• To find faults and remove them, look systematically (non-uniformly) for faults
– Unless there are a lot of faults in the program, a random sample will not be effective at finding them
– We need to use everything we know about faults
Systematic Partition Testing
[Figure legend: Failure (valuable test case) vs. No failure]
Failures are sparse in the space of possible inputs …
… but dense in some parts of the space
If we systematically test some cases from each part, we will include the dense parts
Functional testing is one way of drawing pink lines to isolate regions with likely failures
The space of possible input values (the haystack)
The partition principle
• Exploit some knowledge to choose samples that are more likely to include “special” or trouble-prone regions of the input space
– Failures are sparse in the whole input space …
– … but we may find regions in which they are dense
• (Quasi*-)Partition testing: separates the input space into classes whose union is the entire space
– *Quasi because the classes may overlap
• Desirable case: Each fault leads to failures that are dense (easy to find) in some class of inputs
– sampling each class in the quasi-partition selects at least one input that leads to a failure, revealing the fault
– seldom guaranteed; we depend on experience-based heuristics
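The sampling idea above can be sketched in a few lines; the classes, their overlap, and all names are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of (quasi-)partition testing: the input space
// of an integer parameter is split into classes whose union covers
// the space (classes may overlap, hence "quasi"), and at least one
// representative is sampled from each class.
public class QuasiPartitionSampling {
    static List<Integer> oneRepresentativePerClass(List<List<Integer>> classes) {
        List<Integer> suite = new ArrayList<>();
        for (List<Integer> cls : classes) {
            suite.add(cls.get(0));   // sample at least one input per class
        }
        return suite;
    }

    public static void main(String[] args) {
        // three classes covering the integers of interest;
        // "boundary" overlaps both of its neighbours
        List<List<Integer>> classes = List.of(
            List.of(-7, -1),        // negative
            List.of(-1, 0, 1),      // boundary
            List.of(1, 42));        // positive
        System.out.println(oneRepresentativePerClass(classes));
    }
}
```

If a fault's failures are dense in any one class, this suite has a good chance of triggering it, which is exactly the "desirable case" above.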
Functional testing: exploiting the specification
• Functional testing uses the specification (formal or informal) to partition the input space
– E.g., specification of “roots” program suggests division between cases with zero, one, and two real roots
• Test each category, and boundaries between categories
– No guarantees, but experience suggests failures often lie at the boundaries (as in the “roots” program)
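The specification-derived partition for "roots" can be sketched as follows; the representative coefficient values are assumptions chosen so that each class (and the troublesome a = 0 boundary) gets one test:

```java
// Sketch: using the specification of "roots" to partition inputs
// by number of real roots, then picking one representative per
// class plus the a = 0 boundary case.
public class RootsPartition {
    static String classify(double a, double b, double c) {
        if (a == 0) return "degenerate (not quadratic)";
        double disc = b * b - 4 * a * c;
        if (disc > 0) return "two real roots";
        if (disc == 0) return "one real root";
        return "zero real roots";
    }

    public static void main(String[] args) {
        // one representative input per specification-derived class
        System.out.println(classify(1, -3, 2));  // disc = 1  -> two real roots
        System.out.println(classify(1, 2, 1));   // disc = 0  -> one real root
        System.out.println(classify(1, 0, 1));   // disc = -4 -> zero real roots
        System.out.println(classify(0, 0, 0));   // the boundary the bug lives on
    }
}
```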
Why functional testing?
• The base-line technique for designing test cases
– Timely
• Often useful in refining specifications and assessing testability before code is written
– Effective
• finds some classes of fault (e.g., missing logic) that can elude other approaches
– Widely applicable
• to any description of program behavior serving as spec
• at any level of granularity from module to system testing.
– Economical
• typically less expensive to design and execute than structural (code-based) test cases
Early functional test design
• Program code is not necessary
– Only a description of intended behavior is needed
– Even incomplete and informal specifications can be used
• Although precise, complete specifications lead to better test suites
• Early functional test design has side benefits
– Often reveals ambiguities and inconsistency in spec
– Useful for assessing testability
• And improving test schedule and budget by improving spec
– Useful explanation of specification
Functional versus Structural: Classes of faults
• Different testing strategies (functional, structural, fault-based, model-based) are most effective for different classes of faults
• Functional testing is best for missing logic faults
– A common problem: Some program logic was simply forgotten
– Structural (code-based) testing will never focus on code that isn’t there!
Functional vs structural test: granularity levels
• Functional test applies at all granularity levels:
– Unit (from module interface spec)
– Integration (from API or subsystem spec)
– System (from system requirements spec)
– Regression (from system requirements + bug history)
• Structural (code-based) test design applies to relatively small parts of a system:
– Unit
– Integration
Steps: From specification to test cases
1. Decompose the specification
If the specification is large, break it into independently testable features to be considered in testing
2. Select representatives
Representative values of each input, or
Representative behaviors of a model
Often simple input/output transformations don’t describe a system. We use models in program specification, in program design, and in test design
3. Form test specifications
Typically: combinations of input values, or model behaviors
4. Produce and execute actual tests
From specification to test cases
[Diagram: Functional Specifications → Independently Testable Feature → Representative Values / Model → Test Case Specifications → Test Cases]
Simple example: Postal code lookup
• Input: ZIP code (5-digit US Postal code)
• Output: List of cities
• What are some representative values (or classes of value) to test?
Example: Representative values
Simple example with one input, one output
• Correct zip code
– With 0, 1, or many cities
• Malformed zip code
– Empty; 1–4 characters; 6 characters; very long
– Non-digit characters
– Non-character data
Note prevalence of boundary values (0 cities, 6 characters) and error cases
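The malformed-input classes above can be exercised against a well-formedness check. The lookup itself and its city database are not modelled; `isWellFormedZip` and the sample codes are assumptions for illustration:

```java
import java.util.List;

// Sketch: representative-value tests for the ZIP-code lookup,
// driven by an assumed validity check (isWellFormedZip).
public class ZipLookupTests {
    // assumption: a ZIP code is well formed iff it is exactly five digits
    static boolean isWellFormedZip(String s) {
        return s != null && s.matches("\\d{5}");
    }

    public static void main(String[] args) {
        // correct codes (the lookup may return 0, 1, or many cities)
        List<String> valid = List.of("90210", "00001");
        // malformed codes: empty, too short, too long, non-digit characters
        List<String> malformed = List.of("", "1234", "123456", "90a10", "ZIPPY");
        for (String z : valid)     System.out.println("\"" + z + "\" well-formed: " + isWellFormedZip(z));
        for (String z : malformed) System.out.println("\"" + z + "\" well-formed: " + isWellFormedZip(z));
    }
}
```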
Summary
• Functional testing, i.e., generation of test cases from specifications, is a valuable and flexible approach to software testing
– Applicable from very early system specs right through module specifications
• (quasi-)Partition testing suggests dividing the input space into (quasi-)equivalent classes
– Systematic testing is intentionally non-uniform to address special cases, error conditions, and other small places
– Dividing a big haystack into small, hopefully uniform piles where the needles might be concentrated
Home reading
• Chapter 10 of the book Software Testing and Analysis, by Mauro Pezzè and Michal Young
– Functional testing