TY  - GEN
T3  - International Workshop on Software Clones
A1  - Ragkhitwetsagul, C
A1  - Krinke, J
A1  - Marnette, B
CY  - Campobasso, Italy
AV  - public
ID  - discovery10046744
N1  - This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
TI  - A Picture Is Worth a Thousand Words: Code Clone Detection Based on Image Similarity
PB  - IEEE
N2  - This paper introduces a new code clone detection
technique based on image similarity. The technique captures
visual perception of code seen by humans in an IDE by applying
syntax highlighting and image conversion on raw source code
text. We compared two similarity measures, Jaccard and earth
mover's distance (EMD) for our image-based code clone detection
technique. Jaccard similarity offered better detection performance
than EMD. The F1 score of our technique on detecting
Java clones with pervasive code modifications is comparable
to five well-known code clone detectors: CCFinderX, Deckard,
iClones, NiCad, and Simian. A Gaussian blur filter is chosen as a
normalisation technique for type-2 and type-3 clones. We found
that blurring code images before similarity computation resulted
in higher precision and recall. The detection performance after
including the blur filter increased by 1 to 6 percent. The manual
investigation of clone pairs in three software systems revealed that
our technique, while it missed some of the true clones, could also
detect additional true clone pairs missed by NiCad.
Y1  - 2018/03/20/
UR  - http://doi.org/10.1109/IWSC.2018.8327318
ER  -