diff --git a/README.md b/README.md
index 4a91ebc..37ed7aa 100644
--- a/README.md
+++ b/README.md
@@ -28,7 +28,7 @@ Assuming you have [Scikit-Learn](https://scikit-learn.org/) already installed, y
 ```python
 from sklearn import svm
 from stringkernels.kernels import string_kernel
-model = svm.SVC(kernel=string_kernel)
+model = svm.SVC(kernel=string_kernel())
 ```
 
 and the polynomial string kernel,
@@ -36,9 +36,9 @@ and the polynomial string kernel,
 ```python
 from sklearn import svm
 from stringkernels.kernels import polynomial_string_kernel
-model = svm.SVC(kernel=polynomial_string_kernel)
+model = svm.SVC(kernel=polynomial_string_kernel())
 ```
 
-See the notebook [example.ipynb](https://github.com/weekend37/string-kernels/blob/master/example.ipynb) for further demonstration of usage.
+For more information, read the [docs](https://github.com/weekend37/string-kernels/blob/master/doc/docs.md) or take a look at the notebook [example.ipynb](https://github.com/weekend37/string-kernels/blob/master/example.ipynb) for further demonstration of usage.
 
 If you end up using this in your research we kindly ask you to cite us! :)
diff --git a/doc/fig/docs.md b/doc/fig/docs.md
new file mode 100644
index 0000000..4cb02e4
--- /dev/null
+++ b/doc/fig/docs.md
@@ -0,0 +1,54 @@
+# Documentation
+
+## kernels.string_kernel
+
+**Wrapper for a singly vectorized linear time string kernel implementation for data matrices X and Y**
+```python
+    Parameters
+    - normalize : bool, default=True
+        indicates if the kernel output should be normalized s.t. max(K) <= 1
+    - n_jobs : int, default=None
+        how many CPUs to distribute the process over. If None, use maximum available CPUs.
+
+    Returns
+    - string_kernel_func : function
+        function that takes in two data matrices X and Y as arguments
+        (np.ndarrays of shapes (NX, MX) and (NY, MY), where N_ is the number of samples and M_ is the sequence length)
+        and returns the string kernel value for each pair in the product of all samples in X and Y (int or float, depending on normalization)
+```
+
+**Example**
+
+```python
+from sklearn import svm
+from stringkernels.kernels import string_kernel
+model = svm.SVC(kernel=string_kernel(n_jobs=32))
+```
+
+## kernels.polynomial_string_kernel
+
+**Wrapper for a linear time polynomial string kernel distance implementation for two data matrices X and Y for a monomial with exponent p to run across n_jobs different CPUs.**
+```python
+    Parameters
+    - p : float or int, default=1.2
+        exponent of the monomial which will be used
+    - normalize : bool, default=True
+        indicates if the kernel output should be normalized s.t. max(K) <= 1
+    - n_jobs : int, default=None
+        how many CPUs to distribute the process over. If None, use maximum available CPUs.
+
+    Returns
+    - polynomial_string_kernel_func : function
+        function that takes in two data matrices X and Y as arguments
+        (np.ndarrays of shapes (NX, MX) and (NY, MY), where N_ is the number of samples and M_ is the sequence length)
+        and returns the polynomial string kernel value for each pair in the product of all samples in X and Y (float)
+
+```
+
+**Example**
+
+```python
+from sklearn import svm
+from stringkernels.kernels import polynomial_string_kernel
+model = svm.SVC(kernel=polynomial_string_kernel(p=1.1))
+```
\ No newline at end of file
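
A note on the call-site change in this diff: `string_kernel()` and `polynomial_string_kernel()` are now invoked, and the function they return is what gets passed to `svm.SVC`. The sketch below illustrates that factory pattern with a hypothetical stand-in, `make_match_kernel`, which simply counts position-wise matches. It is not the stringkernels implementation, and it assumes both inputs have the same sequence length; only the shape of the contract (two data matrices in, an `(NX, NY)` matrix out, optional normalization so that `max(K) <= 1`) is taken from the docs above.

```python
import numpy as np


def make_match_kernel(normalize=True):
    """Hypothetical factory mirroring the string_kernel() call pattern.

    Returns a function kernel(X, Y) for 2D arrays of equal sequence length.
    """
    def kernel(X, Y):
        X = np.asarray(X)
        Y = np.asarray(Y)
        # Compare every row of X with every row of Y, position by position,
        # and count the matches: the result has shape (NX, NY).
        K = (X[:, None, :] == Y[None, :, :]).sum(axis=2).astype(float)
        if normalize:
            # Divide by the sequence length so that max(K) <= 1.
            K /= X.shape[1]
        return K

    return kernel


# Toy data: each row is one sequence, each column one position.
X = np.array([list("ACGT"), list("ACGA")])
Y = np.array([list("ACGT"), list("TTTT"), list("ACCA")])

# Call the factory first, then apply the returned kernel function.
K = make_match_kernel()(X, Y)
print(K)
```

Because `svm.SVC` accepts a callable for `kernel` and expects it to return a matrix of shape `(n_samples_X, n_samples_Y)`, passing the factory itself (the pre-change `kernel=string_kernel`) would hand scikit-learn a function with the wrong signature, which is presumably what the added parentheses fix.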