Evaluation of Regression Test Optimization Strategies

Abstract

Regression testing is an important part of software development. Although it helps to uncover bugs, long-running test suites prevent short feedback loops, and developers get distracted by other tasks in the meantime. The consequence is an increase in software development costs. This is why the regression testing process needs to be optimized. One way to do so is to apply Regression Test Optimization (RTO) techniques, which select or prioritize tests by some criterion. In general, the goal is to find failing tests faster. However, RTO is rarely applied in practice. On the one hand, this may be related to the lack of tool support; on the other hand, research so far has mostly evaluated RTO techniques in artificial setups. Thus, it is unclear whether RTO provides a benefit in real-world projects.

This thesis presents an evaluation of one selection technique and one prioritization technique on open-source projects with faults that were detected in the past. The selection technique selects the tests that cover changed parts of the source code. For prioritization, the well-studied additional coverage technique is chosen, which orders tests by the amount of code they cover that the tests ranked before them do not. In addition, a technique that prioritizes tests using information from the last test run is implemented, but it is not evaluated due to a lack of appropriate data.
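
The two evaluated techniques can be illustrated with a minimal sketch. It is not the implementation used in this thesis. It assumes that coverage is available as a mapping from test names to sets of covered code entities; the function names, the example data, and the handling of ties and of tests that add no new coverage (appended in name order) are assumptions made for illustration only.

```python
from typing import Dict, List, Set

# Assumed input format: test name -> set of covered code entities
Coverage = Dict[str, Set[str]]


def select_tests(coverage: Coverage, changed: Set[str]) -> List[str]:
    """Change-based selection: keep tests covering at least one changed entity."""
    return sorted(t for t, cov in coverage.items() if cov & changed)


def prioritize_additional(coverage: Coverage) -> List[str]:
    """Greedy additional-coverage ordering.

    Repeatedly picks the test that covers the most entities not yet covered
    by the tests already placed in the order. Ties go to the alphabetically
    first test. Once no remaining test adds coverage, the rest are appended
    in name order (a simplification chosen for this sketch).
    """
    remaining = dict(coverage)
    covered: Set[str] = set()
    order: List[str] = []
    while remaining:
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        if not (remaining[best] - covered):
            order.extend(sorted(remaining))
            break
        order.append(best)
        covered |= remaining.pop(best)
    return order


# Hypothetical example data
coverage = {
    "test_a": {"f1", "f2"},
    "test_b": {"f2", "f3", "f4"},
    "test_c": {"f1"},
    "test_d": {"f5"},
}
selected = select_tests(coverage, changed={"f1"})
ordered = prioritize_additional(coverage)
```

With this example data, only test_a and test_c are selected for a change to f1, and the prioritized order starts with test_b because it covers the most entities.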

Three open-source projects with a total of 165 versions are used for the evaluation. The selection technique excluded on average 91.9% of the tests, which led to a time saving of 89% compared to running all tests. All selections contained the fault-revealing tests. However, the evaluation of the prioritization technique yielded mixed results. These results are compared to evaluations found in the literature. Finally, possible improvements of the evaluation process are suggested.