UCL Discovery

Mining Semantic Loop Idioms

Allamanis, M; Barr, ET; Bird, C; Devanbu, P; Marron, M; Sutton, C; (2018) Mining Semantic Loop Idioms. IEEE Transactions on Software Engineering, 44 (7) pp. 651-668. 10.1109/TSE.2018.2832048. Green open access

Barr_coils.pdf - Accepted version


Abstract

To write code, developers stitch together patterns, like API protocols or data structure traversals. Discovering these patterns can identify inconsistencies in code or opportunities to replace these patterns with an API or a language construct. We present coiling, a technique for automatically mining code for semantic idioms: surprisingly probable, semantic patterns. We specialize coiling for loop idioms, semantic idioms of loops. First, we show that automatically identifiable patterns exist, in great numbers, with a large-scale empirical study of loops over 25 MLOC. We find that most loops in this corpus are simple and predictable: 90 percent have fewer than 15 LOC and 90 percent have no nesting and very simple control. Encouraged by this result, we then mine loop idioms over a second, buildable corpus. Over this corpus, we show that only 50 loop idioms cover 50 percent of the concrete loops. Our framework opens the door to data-driven tool and language design, discovering opportunities to introduce new API calls and language constructs. Loop idioms show that LINQ would benefit from an Enumerate operator. This can be confirmed by the existence of a StackOverflow question with 542k views that requests precisely this feature.

Type: Article
Title: Mining Semantic Loop Idioms
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/TSE.2018.2832048
Publisher version: https://doi.org/10.1109/TSE.2018.2832048
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Semantics, Tools, Syntactics, Data mining, C# languages, Machine learning, Testing, Data-driven tool design, idiom mining, code patterns
UCL classification: UCL
UCL > Provost and Vice Provost Offices
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
URI: https://discovery.ucl.ac.uk/id/eprint/10062734