Impact of Test Suite Coverage on Overfitting in Genetic Improvement of Software

Lim, M; Guizzo, G; Petke, J; (2020) Impact of Test Suite Coverage on Overfitting in Genetic Improvement of Software. In: International Symposium on Search Based Software Engineering SSBSE 2020: Search-Based Software Engineering. (pp. 188-203). Springer. Green open access.

Abstract

Genetic Improvement (GI) uses automated search to improve existing software. It can be used to reduce runtime or energy consumption, fix bugs, or improve any other software property, provided that the property can be encoded into a fitness function. GI usually relies on testing to check whether the changes disrupt the intended functionality of the software, which makes test suites important artefacts for the overall success of GI. The objective of this work is to establish which characteristics of the test suites correlate with the effectiveness of GI. We hypothesise that different test suite properties may have different levels of correlation to the ratio between overfitting and non-overfitting patches generated by the GI algorithm. To test our hypothesis, we perform a set of experiments with test suites generated automatically by EvoSuite under 4 popular coverage criteria. We use these test suites as input to a GI process and collect the patches generated throughout that process. We find that while test suite coverage has an impact on the ability of GI to produce correct patches, with branch coverage leading to the least overfitting, the overfitting rate was still significant. We also compared automatically generated tests with manual, developer-written ones and found that although the manual tests had lower coverage, GI runs with manual tests led to less overfitting than runs with automatically generated tests. Finally, we did not observe enough statistically significant correlations between the coverage metrics and the overfitting ratios of patches, i.e., the coverage of test suites cannot be used as a linear predictor for the level of overfitting of the generated patches.
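
The kind of check described in the abstract, relating a test suite's coverage to how many of the resulting patches overfit, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the overfitting ratio is taken to be the share of overfitting patches among all generated patches, Pearson's coefficient stands in for whatever linear correlation measure the paper uses, and the input numbers are made-up placeholders, not results from the study.

```python
from math import sqrt


def overfitting_ratio(n_overfitting: int, n_non_overfitting: int) -> float:
    """Share of generated patches that overfit (assumed definition)."""
    total = n_overfitting + n_non_overfitting
    if total == 0:
        raise ValueError("no patches to evaluate")
    return n_overfitting / total


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson linear correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder data, NOT taken from the paper:
# one entry per test suite.
coverage = [0.60, 0.75, 0.90]            # fraction of the program covered
patch_counts = [(8, 2), (6, 4), (7, 3)]  # (overfitting, non-overfitting)

ratios = [overfitting_ratio(o, n) for o, n in patch_counts]
r = pearson(coverage, ratios)
print(f"Pearson r between coverage and overfitting ratio: {r:.2f}")
```

A coefficient close to zero, or one that is not statistically significant, would be in line with the abstract's conclusion that coverage is not a usable linear predictor of overfitting. A real analysis would also need a significance test, which this sketch omits.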

Type: Proceedings paper
Title: Impact of Test Suite Coverage on Overfitting in Genetic Improvement of Software
Event: International Symposium on Search-Based Software Engineering
Open access status: An open access version is available from UCL Discovery
DOI: 10.1007/978-3-030-59762-7_14
Publisher version: https://doi.org/10.1007/978-3-030-59762-7_14
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Genetic Improvement, Search-based software engineering, Overfitting
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10106433