SWEN90010 – High Integrity Systems Engineering
Software Fault Tolerance
Toby MD 8.17 (Level 8, Doug McDonell Bldg)
http://people.eng.unimelb.edu.au/tobym @tobycmurray
SOFTWARE FAULT TOLERANCE
Hardware Redundancy
Analogy to motivate software redundancy:
Copyright University of Melbourne 2016, provided under Creative Commons Attribution License
Software Redundancy
We will see two techniques:
N-Version Programming
Recovery Blocks
Software Redundancy
N-Version Programming: Execute multiple instances of the software and vote on the output
Hardware fails nondeterministically, so replicated components tend to fail independently.
But software is (often) deterministic: identical copies fail identically on the same input, so there is no independence.
(cf. reliability blocks from SWEN90006)
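A minimal Python sketch of the voting step, assuming each version is a plain function of the same input and that outputs can be compared for exact equality. The name `n_version_vote` and the toy versions are illustrative only, not from the lecture; real systems often need tolerance-based rather than exact voting.

```python
from collections import Counter

def n_version_vote(versions, x):
    """Run every version on x and return the strict-majority output, or None."""
    outputs = []
    for version in versions:
        try:
            outputs.append(version(x))
        except Exception:
            # A crashing version simply casts no vote.
            continue
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    # Require a strict majority of all versions, not just of those that answered.
    return value if count > len(versions) // 2 else None

# Hypothetical versions of "square a number"; the third one is faulty.
versions = [lambda x: x * x, lambda x: x ** 2, lambda x: x + x]
result = n_version_vote(versions, 3)  # 9: the faulty version is outvoted
```

Voting only helps when the versions differ: three copies of the same faulty code would agree unanimously on the same wrong answer.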
Software Independence
Requires multiple software versions with:
Independent designs
Independent implementations
Independent software engineering teams
Expensive!
Common Mode Failures
Failures (in otherwise independent components) that are not statistically independent.
Causes of Software Common Mode Failures:
Faults in Spec (propagated to all implementations)
Similarity of development languages and environments
Similarity of algorithms
Similar training to software engineering teams
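To see why statistical independence matters, here is a small Python calculation under a toy model that is an assumption of this sketch, not something from the lecture. The system fails when a majority of the n versions fail. Each version fails independently with probability `p`, except that with probability `c` a shared cause (e.g. a spec fault) makes every version fail together.

```python
from math import comb

def majority_failure_prob(n, p):
    """P(more than half of n versions fail), each failing independently with probability p."""
    k_min = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

def majority_failure_prob_common_mode(n, p, c):
    """Toy model: with probability c all versions fail together; otherwise they fail independently."""
    return c + (1 - c) * majority_failure_prob(n, p)

independent = majority_failure_prob(3, 0.01)                        # about 0.0003
with_common = majority_failure_prob_common_mode(3, 0.01, 0.005)     # about 0.0053
```

Under these assumed numbers, even a small common-mode probability dominates the result: the voted system can never be more reliable than its shared causes of failure allow.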
Eliminating Common Mode Failures
By addressing their causes:
Faults in Spec (propagated to all implementations)
  Spec animation, verification etc.
Similarity of development languages and environments
  Ensure diversity of dev environments and languages
Similarity of algorithms
  Ensure teams remain (geographically) separated
Similar training to software engineering teams
  Use teams from different organisations
Does it work?
Relatively little data either way (since expensive)
Academic Study (Knight and Leveson, IEEE TSE 1986)
27 (geographically separate) student teams from U. VA and UC Irvine
Implementing a simple missile defence system in the same programming language
NASA's version as test oracle
Ran 1 million tests against each looking for common failures
Independent implementations alone may not be enough to ensure sufficient diversity for fault tolerance
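The kind of measurement described above (run the same tests against every version and look for failures in common) can be sketched in Python as follows. This is only an illustration of the idea, not the study's actual harness; `coincident_failures`, the oracle and the three toy versions are all hypothetical.

```python
from itertools import combinations

def coincident_failures(versions, oracle, inputs):
    """For each pair of versions, count the inputs on which both disagree with the oracle."""
    failures = {
        name: {x for x in inputs if f(x) != oracle(x)}
        for name, f in versions.items()
    }
    return {(a, b): len(failures[a] & failures[b]) for a, b in combinations(versions, 2)}

# Toy task: absolute value. B and C were written differently,
# yet both mishandle negative inputs.
toy_versions = {
    "A": lambda x: x if x >= 0 else -x,
    "B": lambda x: x,
    "C": lambda x: max(x, 0),
}
counts = coincident_failures(toy_versions, abs, range(-3, 4))
# {('A', 'B'): 0, ('A', 'C'): 0, ('B', 'C'): 3}
```

In this toy case B and C fail on exactly the same inputs, so voting between them gives no protection there, even though their code looks different.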
Recovery Blocks
Acceptance Tests
Determines whether the version's output is correct or not (cf. test oracles)
Sometimes a sanity check only
(e.g. based on previous history of values)
Should produce a tight but accurate range of acceptable values
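A minimal sketch of a history-based sanity check in Python, assuming a numeric output such as a sensor-derived value. The function name, the fixed bounds `lo`/`hi` and the `max_step` limit are assumptions of this example; choosing them is exactly the "tight but accurate" trade-off.

```python
def make_history_acceptance_test(max_step, lo, hi):
    """Build a check that accepts values in [lo, hi] within max_step of the last accepted value."""
    last = None

    def accept(value):
        nonlocal last
        in_range = lo <= value <= hi
        plausible_step = last is None or abs(value - last) <= max_step
        ok = in_range and plausible_step
        if ok:
            # Only accepted values become part of the history.
            last = value
        return ok

    return accept

accept = make_history_acceptance_test(max_step=5.0, lo=0.0, hi=100.0)
readings = [20.0, 22.5, 60.0, 24.0, -1.0]
verdicts = [accept(r) for r in readings]  # [True, True, False, True, False]
```

Too loose a range lets wrong outputs through; too tight a range rejects correct outputs and forces unnecessary fallbacks.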
Alternate Versions
Require diversity, just like N-Version programming
Example: use the acceptance test as second version.
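A common formulation of a recovery block, sketched in Python: try the primary version, check its output with the acceptance test, and fall back to each alternate in turn. This sketch assumes the versions are side-effect-free functions, so no state needs to be restored before trying an alternate; `recovery_block` and the toy square-root versions are hypothetical names for illustration.

```python
import math

def recovery_block(acceptance_test, versions, x):
    """Return the first version's output that passes the acceptance test."""
    for version in versions:
        try:
            result = version(x)
        except Exception:
            # A crash counts as a failed attempt; move on to the next alternate.
            continue
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all versions failed the acceptance test")

def sqrt_ok(x, r):
    """Acceptance test for a square root: squaring the result must give back x."""
    return abs(r * r - x) < 1e-6

primary = lambda x: x / 2   # faulty: only correct for x == 4 (and 0)
alternate = math.sqrt       # simpler, trusted fallback
root = recovery_block(sqrt_ok, [primary, alternate], 9.0)  # 3.0, from the alternate
```

Unlike N-version voting, only one version normally runs; the alternates execute only when the acceptance test rejects an earlier result.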
Why Recovery Blocks?
Especially if, like N-Version programming, it requires multiple diverse versions?
Alternate versions can be simpler, unlike with N-Version programming.
Increased design diversity in alternate versions
Increased reliability in alternate versions