Fast and Flexible Large-Scale Clone Detection with CloneWorks

Author email: jeff.svajlenko@usask.ca
Tool name: CloneWorks
Description: Clone detection in very large inter-project repositories has numerous applications in software research and development. However, existing tools do not provide the flexibility researchers need to explore this emerging domain. We introduce CloneWorks, a fast and flexible clone detector for large-scale clone detection experiments. CloneWorks gives the user full control over the representation of the source code before clone detection, including easy plug-in of custom source transformation, normalization, and filtering logic. The user can then perform targeted clone detection for any type or kind of clone of interest. CloneWorks uses our fast and scalable partitioned partial indexes approach, which can handle any input size on an average workstation using input partitioning. CloneWorks can detect Type-3 clones in an input as large as 250 million lines of code in just four hours on an average workstation, with good recall and precision as measured by our BigCloneBench.
Bibtex: "@inproceedings{svajlenko2017fast, title={Fast and Flexible Large-Scale Clone Detection with {CloneWorks}}, author={Svajlenko, Jeffrey and Roy, Chanchal K}, booktitle={2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C)}, pages={27--30}, year={2017}, organization={IEEE} }"
Link to public pdf: https://dl.acm.org/citation.cfm?id=3098354
Link to tool webpage: https://jeffsvajlenko.weebly.com/cloneworks.html
Link to demo: https://www.youtube.com/watch?v=OK3Pbzvzwcs
Category: None
Tags: fast, scalable, code clone, code detection, flexible
Year and Conference: 2017, ICSE