Small code clone detection tool. It implements an algorithm from SourcererCC with adaptive prefix filtering optimizations and displays its results as HTML.
It works with JavaScript, Python, Java, Go, C++, PHP, C#, C, Swift, Kotlin and Haskell.
potator supports Linux and macOS. It is also possible to use potator on Windows under WSL.
potator can be installed using pip:
pip install potator
Alternatively, it can be installed from source:
git clone https://github.com/otzhora/potator
cd potator
./install.sh
potator [-h] [-d {Naive,Filtering}] [--depth DEPTH] [-t THRESHOLD] [-g GRANULARITY] [-o OUT] directory
- You can choose one of two detectors: `Naive` and `Filtering`. The `Naive` detector compares every possible pair of source code fragments and calculates the Jaccard similarity between them. The `Filtering` detector implements the algorithm from the SourcererCC paper with adaptive prefix filtering optimizations.
- `depth` specifies the maximum depth of the adaptive prefix. `depth=2` is recommended, since it offers the best balance between the cost of building the index and the cost of querying it.
- `threshold` is the minimum similarity score that two code fragments must have to be considered clones.
- `granularity` specifies the granularity of code blocks. Options are `functions` and `classes`. `functions` is recommended.
- `out` specifies the name of the resulting HTML file.
- `directory` is the directory containing the files to search.
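The difference between the two detectors can be illustrated with a minimal sketch. This is not potator's actual code: the function names (`jaccard`, `naive_clones`, `filtered_clones`), the representation of fragments as plain token sets, and the sample data are all assumptions made for illustration. The filtering part shows plain prefix filtering only, a simplification of the adaptive variant (it ignores the `depth` parameter).

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import ceil


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b| of two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def naive_clones(fragments: dict, threshold: float) -> set:
    """Compare every pair of fragments (hypothetical Naive-style detector)."""
    return {
        (x, y)
        for x, y in combinations(sorted(fragments), 2)
        if jaccard(fragments[x], fragments[y]) >= threshold
    }


def filtered_clones(fragments: dict, threshold: float) -> set:
    """Plain prefix filtering: verify only pairs that share a prefix token."""
    # Global token order: rarest tokens first, ties broken alphabetically.
    freq = Counter(tok for toks in fragments.values() for tok in toks)
    index = defaultdict(set)  # token -> fragments having it in their prefix
    candidates = set()
    for name in sorted(fragments):
        tokens = sorted(fragments[name], key=lambda t: (freq[t], t))
        # Sets with Jaccard >= threshold must overlap in at least
        # ceil(threshold * len) tokens, so their prefixes of this length
        # must share a token. The epsilon guards against float rounding.
        min_overlap = ceil(threshold * len(tokens) - 1e-9)
        prefix_len = len(tokens) - min_overlap + 1
        for tok in tokens[:prefix_len]:
            for other in index[tok]:
                candidates.add((other, name))
            index[tok].add(name)
    # Candidates still need full verification with the real similarity.
    return {
        (x, y)
        for x, y in candidates
        if jaccard(fragments[x], fragments[y]) >= threshold
    }


# Hypothetical token sets standing in for extracted functions.
fragments = {
    "add_one": {"def", "f", "x", "return", "+", "1"},
    "add_two": {"def", "f", "x", "return", "+", "2"},
    "loop": {"for", "i", "in", "range", "print"},
}

print(naive_clones(fragments, 0.5))
print(filtered_clones(fragments, 0.5))
```

Both functions should report the same pairs; the filtering version simply avoids computing the similarity for pairs that cannot reach the threshold, which is where the savings on large code bases come from.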
You can also run export DEBUG=1 before the search; profiling information will then be printed out.
You can import the detectors or the entities extractor from potator and use them to work with source code.
>>> from potator.detectors import FilteringDetector
>>> detector = FilteringDetector()
>>> detector.detect(directory, threshold, granularity)
>>> from potator.extractors import EntitiesExtractor
>>> EntitiesExtractor.extract_data_from_directory(directory, granularity)