UCL Discovery

Learn to automate GUI tasks from demonstration

Intharah, Thanapong; (2018) Learn to automate GUI tasks from demonstration. Doctoral thesis (Ph.D), UCL (University College London). Green open access

ThesisFinalized_reduced.pdf - Accepted Version


Abstract

This thesis explores and extends Computer Vision applications in the context of Graphical User Interface (GUI) environments to address the challenges of Programming by Demonstration (PbD). It examines the PbD challenges that could be addressed through innovations in Computer Vision when GUIs are treated as an application domain, analogous to automotive or factory settings. Existing PbD systems have been restricted to particular application domains or have depended on special application interfaces. Although they use the term "demonstration", these systems do not actually see what the user does; rather, they listen to the demonstration through internal communication via the operating system. Machine Vision and Human in the Loop Machine Learning are used to circumvent many of these restrictions, allowing the PbD system to watch the demonstration as another human observer would. This thesis demonstrates that our prototype PbD systems allow non-programmer users to easily create their own automation scripts for repetitive and looping tasks. Our PbD systems take their input from sequences of screenshots, and sometimes from readily available keyboard and mouse sniffer software. It also shows that the problem of inconsistent human demonstration can be remedied with our proposed Human in the Loop Computer Vision techniques. Lastly, the problem is extended to learning from demonstration videos. Due to the sheer complexity of videos of computer desktop GUI manipulation, attention is focused on the domain of video game environments. The initial studies illustrate that it is possible to teach a computer to watch gameplay videos and to estimate which buttons the user pressed.

Type: Thesis (Doctoral)
Qualification: Ph.D
Title: Learn to automate GUI tasks from demonstration
Event: UCL (University College London)
Open access status: An open access version is available from UCL Discovery
Language: English
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10057533
