Synergising Program Analysis and Machine Learning for Program Repair

Parasaram, Nikhil (2024) Synergising Program Analysis and Machine Learning for Program Repair. Doctoral thesis (Ph.D), UCL (University College London). Green open access

Nikhil_s_Thesis__Corrections_-5.pdf - Submitted Version

Abstract

Automated program repair (APR) enhances software quality by fixing bugs automatically, but it faces challenges due to software complexity. The vast number of possible patches makes exhaustive search impractical, and identifying correct patches is difficult because tools may generate incorrect fixes that overfit the test cases. Mitigating these challenges involves leveraging the structure of code, which consists of a formal channel (execution semantics) and a natural language channel (comments, variable names). Machine learning excels at interpreting the natural language channel using large datasets but struggles to generate semantically correct patches. Conversely, program analysis provides detailed insights into program semantics. Combining the two can address these challenges, using program analysis for execution specifics and machine learning for the natural aspects of code, such as identifiers and comments.

This thesis consists of four works:

• Chapter 3 advances semantic repair by synthesizing patches with side effects, employing symbolic execution with state merging and effective patch prioritization to repair bugs in open-source projects.
• Chapter 4 reduces the search space by using neural networks to learn variable information, ranking variables and patch templates to improve accuracy and reduce test overfitting. This enhances existing approaches, allowing them to repair previously unfixable bugs by leveraging program namespace information.
• Chapter 5 leverages abstract interpretation and fuzzing to probabilistically approximate reachable program states. By focusing on high-probability states, PSP boosts performance in abstract interpretation, symbolic execution, and patch prioritization, benefiting the strategies discussed in Chapters 3 and 4.
• Chapter 6 addresses the challenge of identifying the most relevant facts, such as test errors and angelic values, for constructing effective prompts for LLM-based APR. Extracted through program analysis, these facts build prompts whose effectiveness varies across different bugs. We develop a strategy to select facts tailored to each specific bug, significantly enhancing the effectiveness of LLMs in APR.

Type:	Thesis (Doctoral)
Qualification:	Ph.D
Title:	Synergising Program Analysis and Machine Learning for Program Repair
Open access status:	An open access version is available from UCL Discovery
Language:	English
Additional information:	Copyright © The Author 2024. Original content in this thesis is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) Licence (https://creativecommons.org/licenses/by-nc/4.0/). Any third-party copyright material present remains the property of its respective owner(s) and is licensed under its existing terms. Access may initially be restricted at the author’s request.
UCL classification:	UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI:	https://discovery.ucl.ac.uk/id/eprint/10198758
