Anti-Regression Approaches: Impact Analysis and Regression Testing Compared/Combined – Part II: Regression Prevention and Detection Using Static Techniques

In Part I of this article series, we looked at the nature of regression and impact analysis. In this article we reframe impact analysis as a regression prevention technique, compare technical and business impact analysis a little further, and discuss regression prevention and detection using business impact analysis and static code analysis. The next article (Part III) will focus exclusively on regression testing as a regression detection approach.

Regression Prevention and Regression Detection

Before we go any further, it’s worth exploring the relationship between impact analyses (used to prevent regressions) and testing (used to detect regressions). We looked at impact analysis from both business and technical viewpoints, but we can also compare the pre-change activities of impact analysis to the post-change activities of testing.

| | Technical Viewpoint (Design- or Code-Based) | Business Viewpoint (Behaviour-Based) |
|---|---|---|
| Pre-Change Impact Analysis (regression prevention) | Technical Impact Analysis: A manual analysis of the designs and source code to determine the potential impact of a change. | Business Impact Analysis: A speculation, based on current behaviour of the system and the business context, of the potential impact of a change. |
| Post-Change Testing (regression detection) | Static Regression Testing: An automated static analysis of the source code to identify analysis differences post-change. | Dynamic Regression Testing: Execution of a pre-existing dynamic test to compare new behaviour with previously trusted behaviour. |

The table summarises the four anti-regression activities in a 2×2 matrix. There are some similarities between the business and technical impact analysis approaches. Both are based on a current understanding of the existing system (but at a technical or behavioural level depending on viewpoint). Both are somewhat speculative and focus on how regressions can be avoided or accommodated in the technology or in the business process.

The post-change techniques focus on how regressions can be detected. They are based on evidence derived from a post-change analysis of the design and code or a demonstration of the functionality implemented using a previously run set of tests.

A comprehensive anti-regression strategy should include all four of these techniques.

Technical Impact Analysis (Regression Prevention)

A technical impact analysis is essentially a review of the prospective changes to a system design at a high level and could take the form of a technical review. Where major enhancements are concerned and the changes to be made are mainly additions to an existing system, a technical review would focus on impact at an architectural level (performance, security, resilience risks). We won’t discuss this further.

At the code level, concerns would focus on new interfaces (direct integration) and changes to shared resources such as databases (indirect integration). Obviously, changes to be made to the existing code base, if they are known, need to be studied in some detail.

Some form of source code analysis needs to be performed by designers and programmers on the pre-change version of the software. The analysis is basically a code inspection or review. The developer speculates on what the impact of the changes could be and traces paths of execution through the code, and through other unchanged modules, to see what those changes could affect. Design changes involving the database schema, messaging formats, call mechanisms etc. may require changes in many places. Simple search and scanning tools may help, but this is a labour-intensive and error-prone activity.

In small code samples that are simple, well designed and have few interfacing modules, the developer will have a reasonable chance of identifying potential problems that are in the near vicinity of the proposed changes. Anomalies found can be eliminated by adjusting the design of the changes to be made (or by adopting a design that avoids risky changes). However, in realistically sized systems, the scale and complexity of this task severely limit its scope and effectiveness.

The impact of some changes will be caught by compilers or by the build processes used by the developers, so these are of less concern. The impact of more subtle changes may require some ‘detective work’. Typically, the programmer may need to search the entire code base for specific code patterns. Typical examples might be accesses to a changed database table, usage of a new or changed element in an XML-formatted message, usage of a variable that has a changed range of validity, and so on.
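The kind of code-base search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `find_usages`, the default file suffixes and the example table name `customer_orders` are all hypothetical, and a plain text search will miss usages that are built dynamically at run time.

```python
import re
from pathlib import Path


def find_usages(root, pattern, suffixes=(".py", ".sql", ".xml")):
    """Return (path, line number, line text) for every line matching pattern.

    Hypothetical helper: the suffix list is an assumption and should be
    adjusted to the file types present in the code base being searched.
    """
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and file types outside the agreed scope.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

A call such as `find_usages("src", r"\bcustomer_orders\b")` would list every line under a `src` directory that mentions the (hypothetical) changed table, giving the programmer a starting list of places to inspect.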

Usually, formal static analysis of program source code is performed using tools. These can’t be used to predict the impact of changes on behaviour (unless the changed code is analysed, and we’ll look at that next), but the output of tools could be used to focus the programmers’ attention on specific aspects of the system. Needless to say, a defect database used to identify error-prone modules in the code base is invaluable. These error-prone areas could be given extra scrutiny in a targeted inspection of the potential side-effects of changes.

Business Impact Analysis (Regression Prevention)

When functionality is to be added to an existing system, or existing functionality is to be enhanced, some form of business impact analysis is required, driven by an understanding of the proposed and existing behaviour. Occasionally a bug fix requires a significant amount of redesign, so a review of the functional areas to be changed and of those that might be affected is in order.

The business impact analysis is really a set of ‘what-if?’ questions asked of the existing system before the proposed changes are made. The answers to those what-if questions may raise concerns about potential failure modes and their consequences, against which a risk analysis could be conducted. Of course, the number of potential failure modes is huge and the knowledge available to analyse these risks may be very limited. However, the areas of most concern can be highlighted to the developers so that they pay special attention to them and, in particular, target them in subsequent regression testing.

A business impact analysis follows a fairly consistent process:

  1. PROPOSAL: Firstly, the proposed enhancement, amendment or bug fix resolution is described and communicated to the business users.
  2. CONSEQUENCE: The business users then consider the changes in functionality, and the technical changes that the programmers know will affect certain functional aspects of the system. Are there potential modes of failure of particular concern?
  3. CHALLENGE: Finally, the business users challenge the programmers and ask “what would happen if…?” questions to tease out any uncertainties and to highlight any specific regression tests that would be appropriate to address their concerns.

The main output from this process may be a set of statements that focus the attention of the developers. Often, specific features or processes may be deemed critical and must not under any circumstances be adversely affected by changes. These features and processes will surely be the subject of regression testing later.

Static Regression Testing (Regression Detection)

Can static analysis tools help with anti-regression? This section makes some tentative suggestions on how static analysis tools could be used to highlight changes that might be of concern. This is a rather speculative proposal. We would be very interested to hear from practitioners or researchers who are working in this area.

Tools are normally used to scan new or changed source code with a view to detecting poor programming practices and statically-detectable defects, so that developers can eliminate them. However, regressions are often found in unchanged code that calls or is called by the changed code. A scan of code in an unchanged module won’t tell you anything you didn’t already know. So, the analysis must look inside changed code and trace the paths affected by the changes to unchanged modules that call or are called by the changed module. Of course there may be extremely complex interactions for these tools to deal with – but that is exactly what the tools exist to do.

The process would look something like this:

  1. An analysis is performed on the unchanged code of the whole code-base or selected components of the system to be changed (the scope needs to be defined and be consistent in this process).
  2. The same analysis is performed on the changed code-base.
  3. The two analyses are compared and differences highlighted to identify where the code changes have affected the structure and execution paths of the system.

Whereas a differencing tool tells you what has changed in the code, the differences between the outputs of the static analyses may help you to locate where those code changes have an impact.
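As a minimal sketch of the three-step process above, the following Python uses the standard `ast` module as a stand-in for a real static analysis tool. The ‘analysis’ here is just the set of caller/callee pairs found in Python source, which is an assumption made for illustration; the function names and the `BEFORE`/`AFTER` samples are hypothetical, and calls through attributes or dynamic dispatch are ignored.

```python
import ast


def call_edges(source):
    """Return the set of (caller, callee) pairs for plain-name calls in source."""
    edges = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for inner in ast.walk(node):
                # Only direct calls such as f(x); obj.f(x) is ignored here.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    edges.add((node.name, inner.func.id))
    return edges


def compare_analyses(before_source, after_source):
    """Steps 1-3: analyse both versions in the same way, then diff the results."""
    before = call_edges(before_source)  # step 1: pre-change analysis
    after = call_edges(after_source)    # step 2: same analysis, post-change
    return {"added": after - before, "removed": before - after}  # step 3


# Hypothetical sample versions of the same module.
BEFORE = """
def price(item):
    return lookup(item)

def total(items):
    return sum(price(i) for i in items)
"""

AFTER = """
def price(item):
    return apply_discount(lookup(item))

def total(items):
    return sum(price(i) for i in items)
"""
```

Here `compare_analyses(BEFORE, AFTER)` reports a single added edge, from `price` to `apply_discount`, and no removed edges. A textual diff would show the edited line; the analysis diff shows the new dependency it introduces.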

Tools can generate many types of analysis nowadays. What types of analyses might be of interest?

Control-flow analysis: if two control-flow analyses of a changed system differ, but the differences are in unchanged code, then it is possible that some code is executed that was not used before, or that some code that was used before is no longer executed. The new flow of control in the software may be intentional, but of course, it may not be. This analysis simply gives programmers a pointer to functionality and code paths that are worth further investigation. If the tool can generate graphical control-flow graphs, the changes may be obvious to a trained eye. The process could be analogous to a doctor examining x-rays taken at different times to look for growth of a tumour or healing of a broken bone.
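One way to picture this comparison is at the level of a call graph, a coarse approximation of control flow. The sketch below assumes each analysis has already been reduced to a mapping from function name to the set of functions it calls; the graphs, the entry point `main` and all function names are hypothetical. It reports unchanged functions that have become reachable, or have stopped being reachable, from the entry point.

```python
def reachable(graph, entry):
    """Return every function reachable from entry in a caller -> callees mapping."""
    seen = set()
    stack = [entry]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(graph.get(name, ()))
    return seen


def reachability_changes(before, after, entry, changed):
    """Report unchanged functions whose reachability differs between versions."""
    old = reachable(before, entry)
    new = reachable(after, entry)
    return {
        # Unchanged code that may now run but did not before.
        "newly_executed": (new - old) - changed,
        # Unchanged code that ran before but can no longer be reached.
        "no_longer_executed": (old - new) - changed,
    }


# Hypothetical call graphs produced by the pre- and post-change analyses.
GRAPH_BEFORE = {
    "main": {"validate", "save"},
    "validate": {"log_error"},
    "save": set(),
    "log_error": set(),
    "audit": set(),
}
GRAPH_AFTER = {
    "main": {"validate", "save"},
    "validate": set(),
    "save": {"audit"},
    "log_error": set(),
    "audit": set(),
}
CHANGED = {"validate", "save"}
```

With these inputs, `reachability_changes(GRAPH_BEFORE, GRAPH_AFTER, "main", CHANGED)` flags `audit` as newly executed and `log_error` as no longer executed. Neither function was edited, so both are candidates for the further investigation described above.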

Data-flow analysis: data-flow analysis traces how variables are used: in assignments, in the predicates of decisions (e.g. referenced in an if… then… else… statement), or in other operations such as assignment to another variable or a calculation. A difference in the usage pattern of a variable that is defined in a changed module and passed into or out of unchanged modules may indicate an unwanted change in software behaviour.
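A rough sketch of such a usage-pattern comparison, again using Python's `ast` module in place of a real data-flow tool, is shown below. It classifies each variable occurrence inside a function as a definition (`def`), a predicate use (`p-use`) or a computation use (`c-use`), then reports the variables whose set of usage kinds differs between two versions. The function names and the sample code are hypothetical, and the classification is deliberately simple: it works within one function at a time and does not follow values across calls.

```python
import ast


def variable_usage(source):
    """Map (function, variable) to the kinds of use found: def, p-use, c-use."""
    tree = ast.parse(source)
    usage = {}
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        # Names inside the condition of an if/while count as predicate uses.
        predicate_nodes = set()
        for node in ast.walk(func):
            if isinstance(node, (ast.If, ast.While, ast.IfExp)):
                predicate_nodes.update(id(n) for n in ast.walk(node.test))
        # Parameters are treated as definitions at function entry.
        for arg in func.args.args:
            usage.setdefault((func.name, arg.arg), set()).add("def")
        for node in ast.walk(func):
            if not isinstance(node, ast.Name):
                continue
            if isinstance(node.ctx, ast.Store):
                kind = "def"
            elif id(node) in predicate_nodes:
                kind = "p-use"
            else:
                kind = "c-use"
            usage.setdefault((func.name, node.id), set()).add(kind)
    return usage


def usage_changes(before_source, after_source):
    """Return {(function, variable): (kinds before, kinds after)} where they differ."""
    before = variable_usage(before_source)
    after = variable_usage(after_source)
    return {
        key: (before.get(key, set()), after.get(key, set()))
        for key in before.keys() | after.keys()
        if before.get(key, set()) != after.get(key, set())
    }


# Hypothetical sample versions of the same function.
SHIP_BEFORE = """
def ship(order, limit):
    weight = order.weight
    return weight * 2
"""

SHIP_AFTER = """
def ship(order, limit):
    weight = order.weight
    if weight > limit:
        return 0
    return weight * 2
"""
```

For these samples, `usage_changes(SHIP_BEFORE, SHIP_AFTER)` reports that `limit` and `weight` in `ship` have each gained a predicate use, while `order` is unaffected. That is the kind of pointer a programmer could follow up to check whether callers passing `limit` are prepared for it to influence the result.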

Not enough organisations use static analysis tools and few tools are designed to search for differences in ‘deep-flow analysis’ outputs across versions of entire systems – but clearly, there are some potential opportunities worth exploring there. This article leaves you with some ideas but won’t take this suggestion further.

So far, we’ve looked at three techniques for preventing and detecting regressions. In the next article, we’ll examine the activity of more interest to programmers and testers – regression testing.