7CCSMASE
Tutorial 2 Solution
1. One can imagine some advantages to “just-in-time” test design. For example, it may be easier to design test cases for which oracles and other test scaffolding are easy to construct. However, it places a great deal of test design at the end of a development cycle, where it slows product delivery and places tremendous pressure on test designers to rush and short-cut their work. Overall, it is likely to be a disaster for project schedule, or quality, or both.
2. A test case is designed (and inspected) once, but it is likely to be executed many times. Scaffolding, likewise, is used each time the tests are re-executed. Since re-inspection is much more expensive than re-executing test cases, it is likely that tests will be re-executed much more often than inspection is repeated.
3.
• “100 C is the boiling point of water” is meaningful. Degrees Celsius is an interval scale, since each degree is equal in size but the zero point is arbitrary. The transformation to the Fahrenheit scale is an example of an affine transformation, which is valid for interval scales.
• “Today is twice as hot as yesterday” is unlikely to be meaningful, as weather temperature is measured in C or F, which are not ratio scales (a ratio scale would be required for “twice” to be meaningful). Differences, however, are meaningful: if today is 10 degrees C and yesterday was 5 degrees C, then it is meaningful to say that today is 5 degrees hotter than yesterday.
• Depending on its precise definition, Lines of Code is probably an absolute scale, so the sentence is meaningful. Transformations that convert between lines-of-code measures for different languages may be empirically substantiated, but they are not valid transformations for an absolute scale.
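The interval-scale point above can be checked numerically. Below is a small sketch: the Celsius-to-Fahrenheit conversion is an affine transformation f(x) = ax + b (a = 9/5, b = 32), under which differences are preserved up to the factor a, but ratios are not, since the zero point is arbitrary.

```python
def c_to_f(c):
    """Affine rescaling f(x) = (9/5)x + 32: the class of transformations
    admissible for an interval scale such as degrees Celsius."""
    return c * 9 / 5 + 32

today_c, yesterday_c = 10, 5

# Differences survive the transformation (scaled by a = 9/5), so
# "today is 5 degrees hotter" remains meaningful:
diff_c = today_c - yesterday_c                      # 5 C
diff_f = c_to_f(today_c) - c_to_f(yesterday_c)      # 9 F = 5 * 9/5

# Ratios do not survive, because b != 0 shifts the zero point:
ratio_c = today_c / yesterday_c                     # 2.0 -> "twice as hot"?
ratio_f = c_to_f(today_c) / c_to_f(yesterday_c)     # 50/41, clearly not 2.0
```

The contradiction between `ratio_c` and `ratio_f` is exactly why “twice as hot” has no scale-independent meaning on an interval scale.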
4. The requirement is not verifiable because it refers to a subjective sensation (“annoy”), which cannot be measured without referring to actual users. A simple way to turn the requirement into a verifiable one would be to bound the overall start-up time by a value small enough to prevent users from becoming annoyed (e.g., 5 seconds). Unfortunately, too small an upper bound may be impossible to satisfy without violating technology and budget constraints, while too high an upper bound may violate the initial requirement.
A better way of formulating the requirement in a verifiable (and satisfiable) way would be to require (1) the identification of a minimum set of functionalities that can be initialized within a short upper bound (e.g., 5 seconds) and that allow users to start working in quick-mode while the system is still completing its start-up, and (2) a set of functions that inform users about the status of the start-up and the time still needed for the system to become operational in quick- and full-mode.
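The payoff of the reworded requirement is that it becomes directly checkable by a test. A minimal sketch, in which the function name, the 5-second bound, and the simulated initialization are all illustrative assumptions rather than anything from the tutorial:

```python
import time

QUICK_MODE_BOUND_S = 5.0   # upper bound taken from the reworded requirement


def start_quick_mode():
    """Stand-in for initializing the minimum set of functionalities
    that lets the user start working in quick-mode."""
    time.sleep(0.01)       # simulated initialization work


def test_quick_mode_startup_time():
    # "Quick-mode ready within 5 seconds" is measurable; "must not
    # annoy the user" is not.
    t0 = time.monotonic()
    start_quick_mode()
    elapsed = time.monotonic() - t0
    assert elapsed <= QUICK_MODE_BOUND_S, f"quick-mode took {elapsed:.1f}s"


test_quick_mode_startup_time()
```

Note that no analogous test can be written for the original “annoy” formulation, which is precisely what makes it non-verifiable.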
5. To be correct, software must behave correctly in all circumstances allowed by its specification. A single incorrect behaviour makes the program incorrect, even if that incorrect behaviour may never be observed in actual use. For example, suppose a transaction-logging subsystem of an automatic bank teller machine (ATM) works correctly
only when each individual withdrawal is less than $5000 USD. It might be 100% reliable in its current environment, because withdrawals are normally limited to much less than $5000 USD, but it could still be incorrect if this limitation is not part of its specification. An implication is that reliability, being relative to use, is also subject to change when the way it is used changes. If a round of currency devaluation makes it necessary to permit individual withdrawals greater than $5000 USD, the previously reliable logging module could become unreliable.
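The distinction can be made concrete in code. Below is a hypothetical sketch of such a logging routine; the record format, field width, and function name are invented for illustration and are not part of any real ATM system:

```python
def log_withdrawal(amount):
    """Format a withdrawal record using a fixed 4-digit amount field.

    This behaves correctly for every amount below $5000 actually seen in
    current use, but silently truncates larger amounts: an incorrect
    behaviour that is never exercised until the usage profile changes.
    """
    return f"WDL{amount:04d}"[:7]   # fixed-width record: 'WDL' + 4 digits


# 100% reliable in the current environment (withdrawals well below $5000):
record_ok = log_withdrawal(4999)     # 'WDL4999', correct

# Incorrect with respect to a specification that permits any amount:
record_bad = log_withdrawal(12000)   # 'WDL1200', final digit lost
```

When the environment changes (e.g., after the currency devaluation mentioned above), the second case starts to occur in practice, and the unchanged module's reliability drops even though its correctness status was fixed all along.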
6. Since correctness is relative to a specification, a system can be correct but unsafe due to oversights in the specification. Requirements specification, including hazard analysis, plays an even more critical role in safety-critical software than in other kinds of software. For example, in a traffic control system, if you do not specify what happens in the case of a power outage, then the system might be perfectly correct with respect to the specification, yet be stuck in a state where all traffic lights show green when a power outage occurs – which is extremely unsafe, of course.
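One way to see this “correct but unsafe” gap is as an under-specified state machine. The sketch below is hypothetical: the state names, the `POWER_OUTAGE` event, and the transition table are invented for illustration. The implementation handles every transition the specification mentions, and (permissibly, per that spec) leaves the state unchanged for anything else.

```python
# Transitions actually covered by the (incomplete) specification:
SPEC_TRANSITIONS = {
    ("ns_green", "TIMER"): "ns_red",
    ("ns_red", "TIMER"): "ns_green",
}


def step(state, event):
    """Faithfully implement the specification; for events the spec never
    mentions, leave the lights unchanged -- which the spec permits."""
    return SPEC_TRANSITIONS.get((state, event), state)


# Correct for every specified behaviour:
after_timer = step("ns_green", "TIMER")            # 'ns_red'

# On the unspecified event, the lights simply freeze where they are --
# possibly green -- instead of failing safe to all-red. The program is
# still "correct", because the specification never ruled this out.
after_outage = step("ns_green", "POWER_OUTAGE")    # 'ns_green'
```

A hazard analysis would have forced the specification to name the power-outage case and require a fail-safe state (e.g., all-red or flashing), after which the frozen-green behaviour would count as an ordinary correctness defect.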
7. Separate development and quality teams are common in large organizations. Unit testing is almost universally allocated to the development team, as it requires detailed knowledge of the code. Integration, system, and acceptance testing are typically done by the separate quality team, with the development team producing oracles (means of judging the correctness of the outcomes of specific tests) and scaffolding. Inspection is typically done by the development teams or in mixed teams. Regression testing is done by the quality teams.
Some methodologies, such as extreme programming, postulate that there should not be separate teams for development and quality. An intermediate option is to introduce mobility of people and roles by rotating engineers between development and testing tasks across different projects (this is actually very common).
A common problem with having separate teams can arise from the reward mechanisms. If the productivity of the development team is measured by LOC per person-month, and the productivity of the quality team is measured by the number of bugs they find, then the development team will try to maximise output without considering quality, and the quality team will consequently find more and more errors. The overall picture will look great, while in reality the code is not getting any better (in fact, this is a recipe for bad-quality code and a failed project). An advantage of having separate teams is that the search for errors is driven not by the implementation specifics but by the requirements.
If the development team is responsible for both development and quality control, then the problem above is solved (indeed, the development team will take care not to produce bad-quality code, because they also need to test it), but the team may postpone testing in favour of development without leaving enough resources for testing (again, a very common problem), with the result that the product is delivered not fully tested and the project fails overall. An additional disadvantage is that developers will look for errors in the places they are suspicious of, based on their knowledge of the implementation, and are therefore likely to leave untested some code with complex functionality that was not well understood.