MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Updated Jun 4, 2024 - Python
Go metrics for calculating string similarity and other string utility functions
Compare HTML similarity using structural and style metrics
A package to compute medical segmentation metrics.
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard similarity, Longest common subsequence, Hamming distance, and more.
A Clojure library for querying large datasets by similarity
Spark functions to run popular phonetic and string matching algorithms
SetSketch: Filling the Gap between MinHash and HyperLogLog
Calculate various string metrics efficiently in Haskell
ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity
A job recommender system that takes your skills from LinkedIn and job listings from Indeed and suggests the best available jobs for those skills.
BagMinHash - Minwise Hashing Algorithm for Weighted Sets
Minhash and maxhash library in Python, combining flexibility, expressivity, and performance.
Easy-to-use Java library for similarity checking of strings or numeric-series
This is an implementation of the paper written by Yuhua Li, David McLean, Zuhair A. Bandar, James D. O’Shea, and Keeley Crockett
Text similarity computation using MinHashing and Jaccard distance on the Reuters dataset
Text Matching Based on LCQMC: A Large-scale Chinese Question Matching Corpus
Locality Sensitive Hashing for semantic similarity (Python 3.x)
TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation