Anti-Regression Approaches: Impact Analysis and Regression Testing Compared and Combined – Part I: Introduction and Impact Analysis

Introduction

For some years, I’ve avoided getting too involved in test execution automation because I’ve felt it was boring. Yes, I know it has great promise and, in principle, surely we should be offloading the tedious, repetitive, clerical tasks to tools. But the principles of regression testing haven’t changed and we’ve made little progress in the last fifteen years or so – we haven’t really moved on. The practice is stuck in a time-warp.

I presented a talk on test automation at Eurostar 1997. Titled “Testing GUI Applications”, the paper I wrote is still the most popular one on my website gerrardconsulting.com with around 300 downloads a month. Why are people still interested in stuff I wrote so long ago? I think it was a good, but not groundbreaking paper; it didn’t mention the web; the recommendations for test automation were sensible, not radical. Books on the subject have been written since then. I’ve been meaning to update the paper for the new, connected world we now inhabit, but haven’t had the time so far.

But at the January 2010 Test Management Summit, I chose to facilitate the session, “Regression Testing: What to Automate and How”. In our build-up to the Summit, the topic came top of the popularity survey. We had to incorporate it into the programme, but no one volunteered – so I picked it up. On the day, the frustrations I’ve held for a long time came pouring out and the talk became a rather angry rant. In this series of articles, I want to set out the thoughts I presented at the Summit.

I’m going to re-trace our early steps to automation and try and figure out why we (the testers) are still finding that building and running sustainable, meaningful automated regression test suites is fraught with difficulties. Perhaps these difficulties arise because we didn’t think it through at the beginning?

Regression tests are the tests most likely to be stable and run repeatedly, so automation promises big time savings and, being automated, guarantees reliable execution and results checking. The automation choice seems clear. But hold on a minute!

We regression test because things change. Chaotic, unstable environments need regression testing the most. But when things change regularly, automation is very hard. And this describes one of the paradoxes of testing. The development environments that need and would benefit from automated regression testing are the environments that find it hardest to implement.

Rethinking Regression

What is the regression testing thought process? In this paper, I want to step through the thinking associated with regression testing: to understand why regressions occur, to establish what we mean by regression testing, and to examine why we choose to do it and to automate it.

How do regressions occur?

Essentially, something changes and this impacts ‘working’ software. The change could be to the environment in which the software operates, an enhancement, or a bug fix that causes side-effects, and so on. It’s been said over many years that software code fixes have a 50% chance of introducing side-effects in working software. Is it 30% or 80%? Who cares? Change is dangerous; the probability of disaster is unpredictable; we have all suffered over the years.

Regressions have a disproportionate impact on rework effort, confidence and even morale. What can we do? The two approaches at our disposal are impact analysis (to support sensible design choices and so prevent regressions) and regression testing (to identify regressions when they occur).

Impact Analysis

In assessing the potential damage that change can cause, the obvious choice is to not change anything at all. This isn’t as stupid a statement as it sounds. Occasionally, the value of making a change, fixing a bug, adding a feature is far outweighed by the risk of introducing new, unpredictable problems. All prospective changes need to be assessed for their potential impact on existing code and the likelihood of introducing unwanted side-effects. The problem is that assessing the risk of change – Impact Analysis – is extremely difficult.
There are two viewpoints for impact analysis: the business view and the technical view.

Impact Analysis: Business View

The first is the user or business view: the prospective changes are examined to see whether they will impact the functionality of the system in ways that the user can recognise and approve of. Three types of functionality impact are common: business-impacted, data-impacted and process-impacted functionality.

Business impacts often cause subtle changes in the behaviour of systems. An example might be where a change affects how a piece of data is interpreted: the price of an asset might be calculated dynamically rather than fixed for the lifetime of the asset. An asset stored at one location at one price might be moved to another location at another price. Suddenly, the value of the now non-existent asset at the first location is positive or even negative! How can that be? The software worked perfectly – but the business impact wasn’t thought through.

A typical data impact would be where a data item required to complete a transaction is made mandatory, rather than optional. It may be that the current users rely on the data item being optional because, at the time they execute the affected transaction, the information is not known but is captured later. The ‘enhanced’ system might stop all business transactions going ahead or force the users to invent data to bypass the data validation check. Either way, the impact is negative.

Process-impacted functionality is where a change might affect the choices of paths through the system or through the business process itself. The change might for example cause a downstream system feature to be invoked where before it was not. Alternatively, a change might suppress the use of a feature that users were familiar with. Users might find they have to do unnecessary work or they have lost the opportunity to make some essential adjustment to a transaction. Wrong!

Impact Analysis: Technical View

As regards technical impact analysis by designers or programmers, there is a range of possibilities and, in some technical environments, there are tools that can help. Very broadly, impact analysis is performed at two levels: top-down and bottom-up.

Top-down analysis involves considering the alternative design options and looking at their impact on the overall behaviour of the changed system. To fix a bug, enhance the functionality or meet some new or changed requirement, there may be alternative change designs to achieve these goals. A top-down approach looks at these prospective changes in the context of the architecture as a whole, the design principles and the practicalities of making the changes themselves. This approach requires that the designers/developers have an architectural view, but also a set of design principles or guidelines that steer designers away from bad practices. Unfortunately, few organisations have such a view or have design principles so embedded within their teams that they can rely on them.

Bottom-up analysis is code-driven. If the selected design approach impacts a known set of components that will change, the software that calls and depends upon the to-be-changed components can be traced. The higher-level services and features that ultimately depend on the changes can be identified and assessed. This sounds good in principle, especially if you have tools to generate call-trees and collaboration diagrams from code. But there are two common problems here.

The first problem is that the design integrity of the system as a whole may be poor. The dependencies between changed components and those affected by the changes may be numerous. If the code is badly structured, convoluted and looks like ‘spaghetti’, even the software experts may not be able to fathom this complexity and it can seem as though every part of the system is affected. This is a scary prospect.

The second problem is that the software changes may be at such a low level in the hierarchy of calling components that it is impractical to trace the impact of changes through to the higher level features. Although a changed component may be buried deep in the architecture, the effect of a poorly implemented software change may be catastrophic. You may know that a higher level service depends on a lower level component – the problem is, you cannot figure out what that dependency is to predict and assess the impact of the proposed change.
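To make the bottom-up traversal concrete, here is a rough sketch in Python – purely an illustration, not any particular analysis tool; the call graph and component names are invented. The idea is simply to invert the call graph and walk backwards from the to-be-changed component to find everything that ultimately depends on it.

```python
# A rough sketch of bottom-up impact analysis (illustration only).
# Given a call graph, find every component that directly or indirectly
# depends on the component we intend to change. All names are invented.

CALL_GRAPH = {
    # caller           -> components it calls
    "web_checkout":     ["order_service", "session_mgr"],
    "order_service":    ["price_calc", "stock_check"],
    "reporting_batch":  ["price_calc"],
    "price_calc":       ["currency_convert"],
    "stock_check":      [],
    "session_mgr":      [],
    "currency_convert": [],
}

def impacted_by(changed, graph):
    """Return every component that transitively calls 'changed'."""
    # Invert the graph: component -> set of its direct callers.
    callers = {}
    for caller, callees in graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    impacted, frontier = set(), [changed]
    while frontier:
        for caller in callers.get(frontier.pop(), ()):
            if caller not in impacted:
                impacted.add(caller)
                frontier.append(caller)
    return impacted

if __name__ == "__main__":
    # A change buried deep in the architecture...
    print(impacted_by("currency_convert", CALL_GRAPH))
    # -> {'price_calc', 'order_service', 'reporting_batch', 'web_checkout'}
```

In a real system the graph would come from a cross-reference or static-analysis tool rather than a hand-written dictionary. With spaghetti code the returned set approaches ‘everything’ (the first problem), and even when a high-level feature appears in the set, the traversal tells you nothing about what the dependency actually means (the second).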

All in all, impact analysis is a tricky prospect. Can regression testing get us out of this hole?

To be continued…

Paul Gerrard
29 March 2010.

Regression Testing – What to Automate and How

This was one of two presentations at the fourth Test Management Summit in London on 27 January 2010.

I don’t normally preach about test automation as, quite frankly, I find the subject boring. The last time I talked about automation was maybe ten years ago. This was the most popular topic in the session popularity survey we ran a few days before the event and the session was very well attended.

The PowerPoint slides were written on the morning of the event while other sessions were taking place. It would seem that a lot of deep-seated frustrations with regression testing and automation came to the fore. The session itself became something of a personal rant and caused quite a stir.

The slides have been amended a little to include some of the ad-hoc sentiments I expressed in the session and also to clarify some of the messages that came ‘from the heart’.

I hope you find it interesting and/or useful.


Automation below the GUI

A couple of weeks ago, after the BCS SIGiST meeting I was chatting to Martin Jamieson (of BT) about tools that test ‘beneath the GUI’. A while later, he emailed a question…

At the recent SIGIST Johanna Rothman remarked that automation should be done below the level of the GUI. You then stood up and said you’re working on a tool to do this. I was explaining to Duncan Brigginshaw (www.odin.co.uk) yesterday that things are much more likely to change at the UI level than at the API level. I gave him your example of programs deliberately changing the html in order to prevent hacking – is that right? However, Duncan tells me that he recommends automating at the UI. He says that the commercial tools have ways of capturing the inputs which shield the tester from future changes e.g. as objects. I think you’ve had a discussion with Duncan and am just wondering what your views are. Is it necessary to understand the presentation layers for example?

Ultimately, all GUI interfaces need testing, of course. The rendering and presentation of HTML and the execution of JavaScript, ActiveX and Java objects obviously need a browser involved for a user to validate their behaviour. But Java/ActiveX objects can be tested through drivers written by programmers (and many are).

JavaScript isn’t directly accessible to GUI tools anyway, as it is typically used for field validation and manipulation, screen formatting and window management – although you can write whole applications in JavaScript if you wish.

But note that I’m saying that a browser is essential for a user to validate layout and presentation. If you go down the route of using a tool to automate testing of the entire application from the GUI through to server-based code, you need quite sophisticated tools, with difficult-to-use scripting (programming) languages. And lo and behold, to make these tools more usable and accessible to non-programmers, you need tools like AXE to reduce (sometimes dramatically) the complexity of the scripting language required to drive automated tests.

Now, one of the huge benefits of these kinds of testing frameworks, coupled with ‘traditional’ GUI test tools, is that they allow less technical testers to create, manage and execute automated tests. But, if you were to buy a Mercury WinRunner or QTP licence plus an AXE licence, you’d be paying £6k or £7k PER SEAT, before discounts. This is hugely expensive if you think about what most automated tools are actually used for – compared with a free tool that can execute tests of server-based code directly.

Most automated tools are used to automate regression tests. Full stop. I’ve hardly ever met a system tester who actually set out to find bugs with tools. (I know ‘top US consultants’ talk about such people, but they seem to be a small minority.) What usually happens is that the tester needs to get a regression test together. Manual tests are run; when the software is stable and tests pass, they get handed over to the automation folk. I know, I know – AXE and tools like it allow testers to create automated test runs. However, with buggy software, you never get past the first run of a new test. So much for running the other 99 using the tool – why bother?

Until you can run the other 99, you don’t know whether they’ll find bugs anyway. So folk resort to running them manually, because you need a human being checking results and anomalies, not a dumb tool. The other angle is that most bugs aren’t what you expect – by definition. For example, checking a calculation result might be useful, but the tab order, screen validation, navigation, window management/consistency, usability and accessibility AREN’T in your prepared test plan anyway. So much for finding bugs proactively using automation. (Although bear in mind that free or cheap tools exist to validate HTML, accessibility, navigation and validation.)

And after all this, remember that the calculated field is actually generated by the server-based code. The expected result is a simple number, state variable or message. The position, font, font size and 57 other attributes of the field it appears in are completely irrelevant to the test case. The automated tool is, in effect, instructed by the framework tool to ignore these things and focus on the tester’s predicted result.

It’s interesting (to me anyway) that the paper most downloaded from the Gerrard Consulting website is my paper on GUI testing (gerrardconsulting.com/GUI/TestGui.html). It was written in 1997 and gets downloaded between 150 and 250 times a month. Why is that, for heaven’s sake – it’s nine years old! The web isn’t even mentioned in the paper! I can only think people are obsessed with the GUI and haven’t got a good understanding of how you ‘divide and conquer’ the complex task of testing a GUI into simpler tasks: some that can be automated beneath the GUI, some that can be automated using tools other than GUI test running tools, some that can be automated using GUI test running tools and some that just can’t be automated. I’m sure most folk with tools are struggling to meet higher-than-realistic expectations.

So, what we are left with is an extremely complex product (the browser), being tested by a comparably (probably more) complex product, being controlled by another complex product, all to make the creation, execution and evaluation of tests of (mainly) server-based software an easy task. Although it isn’t, of course. Frameworks work best with pretty standard websites or GUI apps built with standard technologies. Once you go off the beaten track, the browser vendor, the GUI tool vendor and the framework vendor all need to work hard to make their tools compatible. But all must stick to the HTTP 1.1 protocol, which is ten(?) years old. How many projects set themselves up to use bleeding-edge technology, and screw the testers as a consequence? Most, I think.

So. There we have it. If you are fool enough to spend £4-5,000 per seat on a GUI tool, you then need to be smart enough to spend another £2,000 or so on a framework (PER USER).

Consider the alternative.

Suppose you knew a little about HTML/HTTP etc. Suppose you had a tool that allowed you to get web pages, interpret the HTML, insert values into fields, submit the form, execute the server-based form handler, receive the generated form, validate the form in terms of new field values, save copies of the received forms on your PC, compare those forms with previously received forms, deal with the vagaries of secure HTTPS and ignore the complexities of the user interface. The tool could have a simple script language, based on keywords/commands, stored in CSV files and managed by Excel.
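To show what I mean, here’s a minimal sketch in Python (standard library only). This is purely an illustration of the idea, not the tool itself, and the keywords, URLs and field names are all invented: the script reads keyword commands from a CSV file, fetches a page, posts field values to the server-side form handler and checks the response – no browser, no GUI tool.

```python
# Minimal sketch of a keyword-driven, under-the-GUI web test (illustration only).
# Keywords, URLs and field names are invented.
import csv
import urllib.parse
import urllib.request

def get(url):
    """Fetch a page (plain HTTP or HTTPS) and return the HTML as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")

def post(url, fields):
    """Submit form field values to the server-side handler, return the HTML."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    with urllib.request.urlopen(url, data=data) as resp:
        return resp.read().decode("utf-8", errors="replace")

def run_script(csv_path):
    """Each CSV row is 'keyword,arg1,arg2', e.g.
         get,http://test.example/quote.asp,
         set,amount,1000
         submit,http://test.example/quote_handler.asp,
         expect,Premium: 42.50,
    """
    fields, last_response = {}, ""
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            keyword, arg1, arg2 = (row + ["", "", ""])[:3]
            if keyword == "get":
                last_response = get(arg1)
            elif keyword == "set":
                fields[arg1] = arg2            # build up the form values
            elif keyword == "submit":
                last_response = post(arg1, fields)
                fields = {}
            elif keyword == "expect":
                status = "PASS" if arg1 in last_response else "FAIL"
                print(f"{status}: expected '{arg1}' in response")

if __name__ == "__main__":
    run_script("regression_pack.csv")   # the CSV can be maintained in Excel
```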

If the tool could scan a form not yet tested and generate the script code to set the values for each field in the form, you’d have a basic but effective script capture facility. Cut and paste into your CSV file and you have a pretty useful tool. Capture the form map (not the GUI – you don’t need all that complexity, of course) and use the code to drive new test transactions.
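Again as a sketch only (standard library, invented field names), the capture step is just a matter of parsing the form and emitting one ‘set’ line per field, plus a ‘submit’ line for the form handler – ready to paste into the CSV script above.

```python
# Sketch of basic script capture: scan an HTML form and generate the
# keyword lines that replay it. Field names and values are invented.
from html.parser import HTMLParser

class FormScanner(HTMLParser):
    """Build a simple 'form map': the handler URL and the named input fields."""
    def __init__(self):
        super().__init__()
        self.action = ""
        self.fields = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action", "")
        elif tag in ("input", "select", "textarea"):
            name = attrs.get("name")
            if name:
                self.fields.append((name, attrs.get("value", "")))

def generate_script(html_text):
    """Return CSV keyword lines that set each field and submit the form."""
    scanner = FormScanner()
    scanner.feed(html_text)
    lines = [f"set,{name},{value}" for name, value in scanner.fields]
    lines.append(f"submit,{scanner.action},")
    return "\n".join(lines)

if __name__ == "__main__":
    sample = """<form action="/quote_handler.asp">
                  <input name="amount" value="1000">
                  <input name="term" value="12">
                </form>"""
    print(generate_script(sample))
```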

That’s all pretty easy. The tool I’ve built does 75-80% of this now. My old Systeme Evolutif website (including online training) had 17,000 pages, with around 100 Active Server Pages script files. As far as I know, there’s nothing the tool cannot test in those 17,000 pages. Of course most are relatively simple, but they are only simple in that they use a single technology – there are thousands of lines of server-based code. If/as/when I create a regression test pack for the site, I can (because the tool is run on the command line) run that test every hour against the live site. (Try doing that with QTP.) If there is a single discrepancy in the HTML that is returned, the tool will spot it, of course. I don’t need to use the GUI to do that. (One has to assume the GUI/browser behaves reliably, though.)
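The ‘spot a single discrepancy’ part needs nothing cleverer than comparing the returned HTML with a copy saved on an earlier run. A sketch (file and directory names invented; the page fetch could reuse the get() function above):

```python
# Sketch of baseline checking: compare the HTML returned now with the copy
# saved on a previous run and report any discrepancy. Names are invented.
import difflib
from pathlib import Path

def check_against_baseline(page_name, current_html, baseline_dir="baselines"):
    baseline_file = Path(baseline_dir) / f"{page_name}.html"
    if not baseline_file.exists():
        # First run: record the baseline rather than fail.
        baseline_file.parent.mkdir(parents=True, exist_ok=True)
        baseline_file.write_text(current_html)
        return "BASELINE SAVED"
    diff = list(difflib.unified_diff(
        baseline_file.read_text().splitlines(),
        current_html.splitlines(),
        fromfile="baseline", tofile="current", lineterm=""))
    return "PASS" if not diff else "FAIL:\n" + "\n".join(diff)
```

Because the whole thing runs from the command line, scheduling it hourly is just a cron or Task Scheduler entry – there’s no GUI session to babysit.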

Beyond that, a regression test based on the GUI appearance would never spot things in the HTML unless you wrote code specifically to do that. Programmers often place data in hidden fields and, by definition, hidden fields never appear on the GUI, so GUI tools would never spot a problem there. Regression tests focus on results generated by server-based code. Requirements specify outcomes that usually do not involve the user interface. In most cases, the user interface is entirely irrelevant to the successful outcome of a functional test. So, a test tool that validates the HTML content is actually better than a GUI tool (please note). By the way, GUI tools don’t usually have very good partial-matching facilities. With code-based tools, you can use regular expressions (regexes) – much better control for the tester than GUI tools.
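Here’s the kind of partial match I mean – a regular expression pulls the server-calculated value out of the returned HTML (including a hidden field) and checks it, ignoring layout, fonts and every other presentation attribute. The HTML fragment, field name and value are invented for illustration:

```python
# Partial matching of a server-generated result with a regular expression.
# The HTML fragment, field name and expected value are invented.
import re

response_html = """
<input type="hidden" name="quote_total" value="1234.56">
<p>Your quote is <b>1234.56</b> GBP</p>
"""

# Check the calculated value wherever it appears; presentation is ignored.
match = re.search(r'name="quote_total"\s+value="(\d+\.\d{2})"', response_html)
if match and match.group(1) == "1234.56":
    print("PASS: quote total is", match.group(1))
else:
    print("FAIL: quote total missing or wrong")
```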

Finally, if you use a tool to validate returned messages/HTML, you can get the programmer to write code that syncs with the test tool. A GUI with testability! For example, the programmer can work with the tester to provide the ‘expected result’ in hidden fields. Encrypt them if you must. The developer can ‘communicate’ directly with the tester. This is impossible if you focus on the GUI – it’s really quite hard to pass technical messages in the GUI without the user being aware.
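As a sketch of that developer/tester channel (the field names and HTML fragment are invented): the programmer writes the expected result into a hidden field alongside the displayed value, and the test simply compares the two.

```python
# Sketch of a 'GUI with testability': the developer embeds the expected result
# in a hidden field; the test compares it with the value actually presented.
# Field names and the HTML fragment are invented for illustration.
import re

response_html = """
<input type="hidden" name="expected_premium" value="42.50">
<span id="premium">42.50</span>
"""

expected = re.search(r'name="expected_premium" value="([^"]+)"', response_html)
displayed = re.search(r'<span id="premium">([^<]+)</span>', response_html)

if expected and displayed and expected.group(1) == displayed.group(1):
    print("PASS: displayed premium matches the developer-supplied expectation")
else:
    print("FAIL: displayed and expected premiums differ")
```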

So. Tools that drive server-based code are more useful (to programmers in particular, because you don’t have the unnecessary complexities of the GUI). They work directly on the functionality to be tested – the server-based code. They are simpler to use. They are faster (there’s no browser/GUI and test tool in the way). They are free. AND they are more effective in many (more than 50%?) cases.

Where such a tool COULD be used effectively, who in their right mind would choose to spend £6,000-7,000 per tester on LESS EFFECTIVE products?

Oh, and did I say, the same tool could test all the web protocols – MAIL, FTP etc. – and could easily be enhanced to cover web services (SOAP, WSDL, blah blah etc.) – the next big thing – but actually services WITHOUT a user interface! Please don’t get me wrong, I’m definitely not saying that GUI automation is a waste of time!

In anything but really simple environments, you have to do GUI automation to achieve coverage (whatever that means) of an application. However, there are aspects of the underlying functionality that can be tested beneath the GUI, and sometimes it can be more effective to do that – but only IF there aren’t complicated technical issues in the way (issues that would be hidden behind the GUI and that the GUI tool ignores).

What’s missing in all this is a general method that guides testers in choosing between manual testing and automation above or below the GUI. Have you ever seen anything like that? One of the main reasons people get into trouble with automation is that they have too-high expectations and are overambitious. It’s the old 80/20 rule: 20% of the functionality dominates the testing (but could be automated). Too often, people try and automate everything. Then 80% of the automation effort goes on fixing the tool to run the least important 20% of tests. Or something like that. You know what I mean.
The beauty of frameworks is that they hide the automation implementation details from the tester. Wouldn’t it be nice if the framework SELECTED the optimum automation method as well? I guess this should depend on the objective of a test. If the test objective doesn’t require use of the GUI – don’t use the GUI tool! Current frameworks have ‘modes’ based on the interfaces to the tools. Either they do GUI stuff, or they do web services stuff, or… But a framework ought to be able to deal with GUI, under-the-GUI, web services, command-line stuff etc. etc. Just a thought – a rough sketch follows.
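Purely hypothetical, but a framework that picked the mode from the test objective might look like this in outline (the mode names and the test structure are entirely invented):

```python
# Hypothetical sketch of a framework that selects the automation mode from the
# test objective rather than being hard-wired to one interface. Names invented.

def run_gui(test):
    print(f"driving the GUI tool for '{test['name']}'")

def run_http(test):
    print(f"driving the server-based code over HTTP for '{test['name']}'")

def run_web_service(test):
    print(f"calling the web service for '{test['name']}'")

def run_command_line(test):
    print(f"running the command-line interface for '{test['name']}'")

DRIVERS = {
    "layout_and_presentation": run_gui,        # only these genuinely need the GUI
    "server_functionality":    run_http,
    "service_contract":        run_web_service,
    "batch_process":           run_command_line,
}

def run(test):
    """Dispatch each test to the cheapest driver that meets its objective."""
    DRIVERS[test["objective"]](test)

if __name__ == "__main__":
    run({"name": "premium calculation", "objective": "server_functionality"})
    run({"name": "quote page layout",   "objective": "layout_and_presentation"})
```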

I feel a paper coming on. Maybe I should update the 1997 article I wrote!

Thanks for your patience and for triggering some thoughts. Writing the email was an interesting way to spend a couple of hours, sat in a dreary hotel room/pub.

Posted by Paul Gerrard on July 4, 2006 03:08 PM

Comments

Good points. In my experience the GUI does change more often than the underlying API.
But often, using the ability of LR to record transactions is still quicker than hoping I’ve reverse-engineered the API correctly. More than once I’ve had to do it without any help from developers or architects. 😉

Chris http://amateureconblog.blogspot.com/

Paul responds:

Thanks for that. I’m interested to hear you mention LR (I assume you mean LoadRunner). LoadRunner can obviously be used as an under-the-bonnet test tool. And quite effective it is too. But one of the reasons for going under the bonnet is to make life simpler and, as a consequence, a LOT cheaper.

There are plenty of free tools (and scripting languages with neat features) that can be perhaps just as effective as LR in executing basic transactions – and that’s the point. Why pay for incredibly sophisticated tools that compensate for each other, when a free simple tool can give you 60, 70, 80% of what you need as a functional tester?

Now LR provides the facilities, but I wouldn’t recommend LR as a cheap tool! What’s the going rate for an LR licence nowadays? $20k, $30k?

Thanks. Paul.