Software Testing Strategy
11.0 Strategic Approach to Software Testing
Testing is a set of activities that can be planned in advance and conducted systematically. For this reason a template for software testing -- a set of steps into which we can place specific test case design techniques and testing methods -- should be defined for the software process.
A number of software testing strategies have been proposed in the literature. All provide the software developer with a template for testing and all have the following generic characteristics.
Ø Testing begins at the component level and works 'outward' toward the integration of the entire computer-based system.
Ø Different testing techniques are appropriate at different points in time.
Ø Testing is conducted by the developer of the software and (for large projects) an independent test group.
Ø Testing and debugging are different activities, but debugging must be accommodated in any testing strategy.
A strategy for software testing must accommodate low-level tests that are necessary to verify that a small source code segment has been correctly implemented as well as high-level tests that validate major system functions against customer requirements. A strategy must provide guidance for the practitioner and a set of milestones for the manager. Because the steps of the test strategy occur at a time when dead-line pressure begins to rise, progress must be measurable and problems must surface as early as possible.
11.1 Organizing for Software Testing
For every software project, there is an inherent conflict of interest that occurs as testing begins. The people who have built the software are now asked to test the software. This seems harmless in itself; after all, who knows the program better than its developers? Unfortunately, these same developers have a vested interest in demonstrating that the program is error free, that it works according to customer requirements, and that it will be completed on schedule and within budget. Each of these interests militates against thorough testing.
From a psychological point of view, software analysis and design (along with coding) are constructive tasks. The software engineer creates a computer program, its documentation, and related data structures. Like any builder, the software engineer is proud of the edifice that has been built and looks askance at anyone who attempts to tear it down. When testing commences, there is a subtle, yet definite, attempt to 'break' the thing that the software engineer has built. From the point of view of the builder, testing can be considered to be (psychologically) destructive.
A number of misconceptions can be erroneously inferred from the preceding discussion:
Ø That the developer of software should do no testing at all.
Ø That the software should be ‘tossed over the wall’ to strangers who will test it mercilessly.
Ø That the tester gets involved with the project only when the testing steps are about to begin.
Each of these statements is incorrect.
The software developer is always responsible for testing the individual units (components) of the program, ensuring that each performs the function for which it was designed. In many cases, the developer also conducts integration testing -- a testing step that leads to the construction (and test) of the complete program structure. Only after the software architecture is complete does an independent test group become involved.
The role of an independent test group (ITG) is to remove the inherent problems associated with letting the builder test the thing that has been built. Independent testing removes the conflict of interest that may otherwise be present. After all, personnel in the independent test group are paid to find errors.
However, the software engineer doesn’t turn the program over to the ITG and walk away. The developer and the ITG work closely throughout a software project to ensure that thorough tests will be conducted. While testing is conducted, the developer must be available to correct errors that are uncovered.
The ITG is part of the software development project team in the sense that it becomes involved during the specification activity and stays involved (planning and specifying test procedures) throughout a large project. However, in many cases the ITG reports to the software quality assurance organization, thereby achieving a degree of independence that might not be possible if it were a part of the software engineering organization.
11.2 A Software Testing Strategy
The software engineering process may be viewed as the spiral illustrated in the figure below. Initially, system engineering defines the role of software and leads to software requirements analysis, where the information domain, function, behavior, performance, constraints, and validation criteria for software are established. Moving inward along the spiral, we come to design and finally to coding. To develop computer software, we spiral inward along streamlines that decrease the level of abstraction on each turn.
A strategy for software testing may also be viewed in the context of the spiral (figure above). Unit testing begins at the vortex of the spiral and concentrates on each unit (i.e., component) of the software as implemented in source code. Testing progresses by moving outward along the spiral to integration testing, where the focus is on design and the construction of the software architecture. Taking another turn outward on the spiral, we encounter validation testing, where requirements established as part of software requirements analysis are validated against the software that has been constructed. Finally, we arrive at system testing, where the software and other system elements are tested as a whole. To test computer software, we spiral out along streamlines that broaden the scope of testing with each turn.
Considering the process from a procedural point of view, testing within the context of software engineering is actually a series of four steps that are implemented sequentially. The steps are shown in Figure below.
Initially, tests focus on each component individually, ensuring that it functions properly as a unit. Hence the name unit testing. Unit testing makes heavy use of white-box testing techniques, exercising specific paths in a module's control structure to ensure complete coverage and maximum error detection. Next, components must be assembled or integrated to form the complete software package. Integration testing addresses the issues associated with the dual problems of verification and program construction. Black-box test case design techniques are the most prevalent during integration, although a limited amount of white-box testing may be used to ensure coverage of major control paths. After the software has been integrated (constructed), a set of high-order tests is conducted. Validation criteria (established during requirements analysis) must be tested. Validation testing provides final assurance that the software meets all functional, behavioral, and performance requirements. Black-box testing techniques are used exclusively during validation.
The last high-order testing step falls outside the boundary of software engineering and into the broader context of computer system engineering. Software, once validated, must be combined with other system elements (e.g., hardware, people, and databases). System testing verifies that all elements mesh properly and that overall system function/performance is achieved.
11.3 Unit Testing
Unit testing focuses verification effort on the smallest unit of software design -- the software component or module. Using the component-level design description as a guide, important control paths are tested to uncover errors within the boundary of the module. The relative complexity of tests and uncovered errors is limited by the constrained scope established for unit testing. The unit test is white-box oriented, and the step can be conducted in parallel for multiple components.
11.3.1 Unit Test Considerations
The tests that occur as part of unit tests are illustrated schematically in Figure below. The module interface is tested to ensure that information properly flows into and out of the program unit under test. The local data structure is examined to ensure that data stored temporarily maintains its integrity during all steps in an algorithm's execution. Boundary conditions are tested to ensure that the module operates properly at boundaries established to limit or restrict processing. All independent paths (basis paths) through the control structure are exercised to ensure that all statements in a module have been executed at least once. And finally, all error handling paths are tested.
Tests of data flow across a module interface are required before any other test is initiated. If data do not enter and exit properly, all other tests are moot. In addition, local data structures should be exercised and the local impact on global data should be ascertained (if possible) during unit testing.
Selective testing of execution paths is an essential task during the unit test. Test cases should be designed to uncover errors due to erroneous computations, incorrect comparisons, and improper control flow. Basis path and loop testing are effective techniques for uncovering a broad array of path errors.
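As a brief illustration (the `classify` function below is invented for this sketch, not taken from the text), basis path testing selects test cases so that every independent path through a module's control structure executes at least once:

```python
# Hypothetical module with cyclomatic complexity 3: two decisions
# yield three independent (basis) paths through the control structure.
def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"

# One test case per basis path guarantees that every statement in the
# module has been executed at least once.
assert classify(-5) == "negative"   # path 1: first branch taken
assert classify(0) == "zero"        # path 2: second branch taken
assert classify(7) == "positive"    # path 3: fall-through path
print("all basis paths exercised")
```

The number of test cases needed equals the cyclomatic complexity of the module, which is why complexity is a useful planning metric for unit testing.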
Among the more common errors in computation are:
Ø misunderstood or incorrect arithmetic precedence,
Ø mixed-mode operations,
Ø incorrect initialization,
Ø precision inaccuracy, and
Ø incorrect symbolic representation of an expression.
Test cases should uncover errors such as:
Ø comparison of different data types,
Ø incorrect logical operators or precedence,
Ø expectation of equality when precision error makes equality unlikely,
Ø incorrect comparison of variables,
Ø improper or nonexistent loop termination,
Ø failure to exit when divergent iteration is encountered, and
Ø improperly modified loop variables.
11.3.2 Unit Test Procedures
Unit testing is normally considered as an adjunct to the coding step. After source level code has been developed, reviewed, and verified for correspondence to component-level design, unit test case design begins. A review of design information provides guidance for establishing test cases that are likely to uncover errors in each of the categories discussed earlier. Each test case should be coupled with a set of expected results.
Because a component is not a stand-alone program, driver and/or stub software must be developed for each unit test. The unit test environment is illustrated in the figure below. In most applications a driver is nothing more than a 'main program' that accepts test case data, passes such data to the component to be tested, and prints relevant results. Stubs serve to replace modules that are subordinate to (called by) the component to be tested. A stub or ‘dummy subprogram’ uses the subordinate module's interface, may do minimal data manipulation, prints verification of entry, and returns control to the module undergoing testing.
Drivers and stubs represent overhead; that is, both are software that must be developed to test the module but will not be delivered to the customer. If drivers and stubs are kept simple, the overhead is relatively low. Unfortunately, many components cannot be adequately tested with ‘simple’ overhead software. In such cases, complete testing can be postponed until the integration test step (where drivers and stubs are also used).
Unit testing is simplified when a component with high cohesion is designed. When only one function is addressed by the component, the number of test cases is reduced and errors can be more easily predicted and uncovered.
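The driver/stub arrangement can be sketched as follows (the component and module names are illustrative, not from the text). The driver feeds test case data to the component; the stub stands in for a subordinate module the component calls:

```python
# Component under test: computes a discounted price, delegating tax
# lookup to a subordinate module (replaced here by a stub).
def priced_with_tax(amount, discount, tax_lookup):
    subtotal = amount * (1 - discount)
    return subtotal * (1 + tax_lookup(subtotal))

# Stub: uses the subordinate module's interface, does minimal data
# manipulation, prints verification of entry, and returns control.
def tax_lookup_stub(subtotal):
    print(f"stub entered with subtotal={subtotal}")
    return 0.10  # fixed rate instead of a real rate-table lookup

# Driver: a throwaway 'main program' that accepts test case data,
# passes it to the component, and prints relevant results.
def driver():
    cases = [(100.0, 0.0), (100.0, 0.25), (0.0, 0.5)]
    for amount, discount in cases:
        result = priced_with_tax(amount, discount, tax_lookup_stub)
        print(f"amount={amount} discount={discount} -> {result:.2f}")

if __name__ == "__main__":
    driver()
```

Neither the driver nor the stub ships with the product; both exist only so the component can be exercised in isolation.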
11.4 Integration Testing
A neophyte in the software world might ask a seemingly legitimate question once all modules have been unit tested: "If they all work individually, why do you doubt that they'll work when we put them together?" The problem, of course, is putting them together -- interfacing. Data can be lost across an interface; one module can have an inadvertent, adverse effect on another; sub-functions, when combined, may not produce the desired major function; individually acceptable imprecision may be magnified to unacceptable levels; global data structures can present problems. Sadly, the list goes on and on.
Integration testing is a systematic technique for constructing the program structure while at the same time conducting tests to uncover errors associated with interfacing. The objective is to take unit tested components and build a program structure that has been dictated by design.
There is often a tendency to attempt non-incremental integration; that is, to construct the program using a ‘big bang’ approach. All components are combined in advance, and the entire program is tested as a whole. Chaos usually results! A set of errors is encountered. Correction is difficult because isolation of causes is complicated by the vast expanse of the entire program. Once these errors are corrected, new ones appear and the process continues in a seemingly endless loop.
Incremental integration is the antithesis of the big bang approach. The program is constructed and tested in small increments, where errors are easier to isolate and correct; interfaces are more likely to be tested completely; and a systematic test approach may be applied. In the sections that follow, a number of different incremental integration strategies are discussed.
11.4.1 Top down Integration
Top-down integration testing is an incremental approach to construction of the program structure. Modules are integrated by moving downward through the control hierarchy, beginning with the main control module (main program). Modules subordinate (and ultimately subordinate) to the main control module are incorporated into the structure in either a depth-first or breadth-first manner.
Referring to the figure below, depth-first integration would integrate all components on a major control path of the structure. Selection of a major path is somewhat arbitrary and depends on application-specific characteristics. For example, selecting the left-hand path, components M1, M2, and M5 would be integrated first. Next, M8 or (if necessary for proper functioning of M2) M6 would be integrated. Then the central and right-hand control paths are built. Breadth-first integration incorporates all components directly subordinate at each level, moving across the structure horizontally. From the figure, components M2, M3, and M4 (replacing stub S4) would be integrated first. The next control level, M5, M6, and so on, follows.
The integration process is performed in a series of five steps:
1. The main control module is used as a test driver and stubs are substituted for all components directly subordinate to the main control module.
2. Depending on the integration approach selected (i.e., depth first or breadth first), subordinate stubs are replaced one at a time with actual components.
3. Tests are conducted as each component is integrated.
4. On completion of each set of tests, another stub is replaced with the real component.
5. Regression testing may be conducted to ensure that new errors have not been introduced.
The process continues from step 2 until the entire program structure is built.
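The five steps above can be sketched in miniature, assuming a hypothetical two-level hierarchy (none of these module names come from the text). Subordinates are passed in as parameters so that stubs can be swapped for real components one at a time:

```python
# Main control module under test; its subordinates are injected so
# stubs can be replaced one at a time with actual components.
def main_control(parse, compute):
    data = parse("7")
    return compute(data)

# Step 1: stubs substitute for all directly subordinate components.
def parse_stub(text):
    return 7          # canned value, ignores its input

def compute_stub(n):
    return n * 2      # canned behavior

# Actual components that will replace the stubs.
def parse_real(text):
    return int(text)

def compute_real(n):
    return n * n

# Steps 3 and 5: the same check re-run after each replacement acts
# as a small regression test.
def run_step(parse, compute, expected):
    assert main_control(parse, compute) == expected

# Steps 2 and 4: replace one subordinate stub at a time, testing
# after each integration.
run_step(parse_stub, compute_stub, 14)   # all stubs in place
run_step(parse_real, compute_stub, 14)   # parse integrated
run_step(parse_real, compute_real, 49)   # compute integrated
print("top-down integration sequence passed")
```

Because only one stub changes per step, any new failure points directly at the most recently integrated component.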
The top-down integration strategy verifies major control or decision points early in the test process. In a well-factored program structure, decision making occurs at upper levels in the hierarchy and is therefore encountered first. If major control problems do exist, early recognition is essential. If depth-first integration is selected, a complete function of the software may be implemented and demonstrated.
The top-down strategy sounds relatively uncomplicated, but in practice logistical problems can arise. The most common of these occurs when processing at low levels in the hierarchy is required to adequately test upper levels. Stubs replace low-level modules at the beginning of top-down testing; therefore, no significant data can flow upward in the program structure. The tester is left with three choices:
Ø Delay many tests until stubs are replaced with actual modules.
Ø Develop stubs that perform limited functions that simulate the actual module, or
Ø Integrate the software from the bottom of the hierarchy upward.
The first approach (delay tests until stubs are replaced by actual modules) causes us to lose some control over correspondence between specific tests and incorporation of specific modules. This can lead to difficulty in determining the cause of errors and tends to violate the highly constrained nature of the top-down approach. The second approach is workable but can lead to significant overhead, as stubs become more and more complex. The third approach is called bottom-up testing.
11.4.2 Bottom-up Integration
Bottom-up integration testing, as its name implies, begins construction and testing with atomic modules (i.e., components at the lowest levels in the program structure). Because components are integrated from the bottom up, processing required for components subordinate to a given level is always available and the need for stubs is eliminated.
A bottom-up integration strategy may be implemented with the following steps:
Ø Low-level components are combined into clusters (sometimes called builds) that perform a specific software sub-function.
Ø A driver (a control program for testing) is written to coordinate test case input and output.
Ø The cluster is tested.
Ø Drivers are removed and clusters are combined moving upward in the program structure.
Integration follows the pattern illustrated in the figure below. Components are combined to form clusters 1, 2, and 3.
Each of the clusters is tested using a driver (shown as a dashed block). Components in clusters 1 and 2 are subordinate to Ma. Drivers D1 and D2 are removed and the clusters are interfaced directly to Ma. Similarly, driver D3 for cluster 3 is removed prior to integration with module Mb. Both Ma and Mb will ultimately be integrated with component Mc, and so forth.
As integration moves upward the need for separate test drivers lessens. In fact, if the top two levels of program structure are integrated top down, the number of drivers can be reduced substantially and integration of clusters is greatly simplified.
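The bottom-up steps can be sketched as follows, with hypothetical atomic components combined into a cluster and exercised by a throwaway driver (all names are invented for illustration):

```python
# Atomic, low-level components (names illustrative).
def scale(values, factor):
    return [v * factor for v in values]

def total(values):
    return sum(values)

# Cluster (build): low-level components combined to perform one
# specific software sub-function.
def scaled_total(values, factor):
    return total(scale(values, factor))

# Driver: a throwaway control program that coordinates test case
# input and output for the cluster; it is removed once the cluster
# is combined upward into the program structure.
def cluster_driver():
    cases = [([1, 2, 3], 2, 12), ([], 5, 0), ([10], 0.5, 5.0)]
    for values, factor, expected in cases:
        result = scaled_total(values, factor)
        assert result == expected, (values, factor, result)
    print("cluster tests passed")

cluster_driver()
```

Note that no stubs appear anywhere: because the subordinate components (`scale`, `total`) are real and already tested, the cluster can be exercised directly.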
11.4.3 Regression Testing
Each time a new module is added as part of integration testing, the software changes. New data flow paths are established, new I/O may occur, and new control logic is invoked. These changes may cause problems with functions that previously worked flawlessly. In the context of an integration test strategy, regression testing is the re-execution of some subset of tests that have already been conducted to ensure that changes have not propagated unintended side effects.
In a broader context, successful tests (of any kind) result in the discovery of errors, and errors must be corrected. Whenever software is corrected, some aspect of the software configuration (the program, its documentation, or the data that support it) is changed. Regression testing is the activity that helps to ensure that changes (due to testing or for other reasons) do not introduce unintended behavior or additional errors.
For instance, suppose you are going to add new functionality to your software, or you are going to modify a module to improve its response time. The changes, of course, may introduce errors into software that was previously correct. For example, suppose the program fragment
x := c + 1;
c := x + 2;
x := 3;
works properly. Now suppose that in a subsequent redesign it is transformed into
c := c + 3;
in an attempt at program optimization. This may result in an error if procedure proc accesses variable x.
Thus, we need to organize testing also with the purpose of verifying possible regressions of software during its life, i.e., degradations of correctness or other qualities due to later modifications. Properly designing and documenting test cases with the purpose of making tests repeatable, and using test generators, will help regression testing. Conversely, the use of interactive human input reduces repeatability and thus hampers regression testing.
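To make the point concrete, here is a sketch (in Python rather than the fragment's Pascal-like notation) of a repeatable regression test that would catch the faulty optimization above, since the optimized version no longer assigns x:

```python
def original(c):
    # The original fragment: x := c + 1; c := x + 2; x := 3;
    x = c + 1
    c = x + 2
    x = 3
    return c, x

def optimized(c):
    # The 'optimized' fragment: c := c + 3;  -- x is never assigned.
    c = c + 3
    return c, None

def regression_suite(fragment):
    # Documented, repeatable test cases: re-run after every modification.
    for c in (0, -1, 41):
        new_c, x = fragment(c)
        assert new_c == c + 3, f"c wrong for input {c}"
        assert x == 3, f"x wrong for input {c}"   # fails for optimized()

regression_suite(original)       # passes
# regression_suite(optimized)    # would fail: x is no longer set to 3
print("regression suite passed for the original fragment")
```

Because the suite is deterministic and needs no interactive input, it can be re-executed unchanged after every modification, which is exactly the repeatability the text calls for.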
Finally, we must treat test cases in much the same way as software. It is clear that such factors as evolvability, reusability, and verifiability are just as important in test cases as they are in software. We must apply formality and rigor and all of our other principles in the development and management of test cases.
11.4.4 Comments on Integration Testing
There has been much discussion of the relative advantages and disadvantages of top-down versus bottom-up integration testing. In general, the advantages of one strategy tend to result in disadvantages for the other strategy. The major disadvantage of the top-down approach is the need for stubs and the attendant testing difficulties that can be associated with them. Problems associated with stubs may be offset by the advantage of testing major control functions early. The major disadvantage of bottom-up integration is that the program as an entity does not exist until the last module is added. This drawback is tempered by easier test case design and a lack of stubs.
Selection of an integration strategy depends upon software characteristics and, sometimes, project schedule. In general, a combined approach (sometimes called sandwich testing) that uses top-down tests for upper levels of the program structure, coupled with bottom-up tests for subordinate levels may be the best compromise.
As integration testing is conducted, the tester should identify critical modules. A critical module has one or more of the following characteristics:
Ø addresses several software requirements,
Ø has a high level of control (resides relatively high in the program structure),
Ø is complex or error prone (cyclomatic complexity may be used as an indicator), or
Ø has definite performance requirements.
Critical modules should be tested as early as possible.
In addition, regression tests should focus on critical module function.
11.5 The Art of Debugging
Software testing is a process that can be systematically planned and specified. Test case design can be conducted, a strategy can be defined, and results can be evaluated against prescribed expectations.
Debugging occurs as a consequence of successful testing. That is, when a test case uncovers an error, debugging is the process that results in the removal of the error. Although debugging can and should be an orderly process, it is still very much an art. A software engineer, evaluating the results of a test, is often confronted with a "symptomatic" indication of a software problem. That is, the external manifestation of the error and the internal cause of the error may have no obvious relationship to one another. The poorly understood mental process that connects a symptom to a cause is debugging.
11.5.1 The Debugging Process
Debugging is not testing but always occurs as a consequence of testing. Referring to the figure on the next page, the debugging process begins with the execution of a test case. Results are assessed and a lack of correspondence between expected and actual performance is encountered. In many cases, the non-corresponding data are a symptom of an underlying cause as yet hidden. The debugging process attempts to match symptom with cause, thereby leading to error correction.
The debugging process will always have one of two outcomes:
1. The cause will be found and corrected, or
2. The cause will not be found.
In the latter case, the person performing debugging may suspect a cause, design a test case to help validate that suspicion, and work toward error correction in an iterative fashion.
Why is debugging so difficult? In all likelihood, human psychology has more to do with the answer than software technology. However, a few characteristics of bugs provide some clues:
Ø The symptom and the cause may be geographically remote. That is, the symptom may appear in one part of a program, while the cause may actually be located at a site that is far removed. Highly coupled program structures exacerbate this situation.
Ø The symptom may disappear (temporarily) when another error is corrected.
Ø The symptom may actually be caused by non-errors (e.g., round-off inaccuracies).
Ø The symptom may be caused by human error that is not easily traced.
Ø The symptom may be a result of timing problems, rather than processing problems.
Ø It may be difficult to accurately reproduce input conditions (e.g., a real-time application in which input ordering is indeterminate).
Ø The symptom may be intermittent. This is particularly common in embedded systems that couple hardware and software inextricably.
Ø The symptom may be due to causes that are distributed across a number of tasks running on different processors.
During debugging, we encounter errors that range from mildly annoying (e.g., an incorrect output format) to catastrophic (e.g., the system fails, causing serious economic or physical damage). As the consequences of an error increase, the amount of pressure to find the cause also increases. Often, that pressure forces a software developer to fix one error and at the same time introduce two more.