v1.0.0: Refactor and modernize, spaCy v2.2 support, more features, 2019 vectors model & Prodigy recipes

@ines released this 22 Nov 15:07
5a2c0a9

✨ New features and improvements

  • Rewrite the package from scratch.
  • Replace built-in vector storage with spaCy's Vectors, making this package a pure Python package and allowing easy out-of-the-box serialization of vectors.
  • Add fully serializable spaCy pipeline component and extension attributes.
  • Add new methods get_best_sense and get_other_senses and improve most_similar.
  • Add script for precomputing index of nearest neighbors for super fast "most similar" queries.
  • Add annotation recipes for Prodigy to easily create word lists and match patterns from similar phrases using sense2vec vectors (like the terms.teach recipe, just with multi-word expressions).
  • Add new, more efficient training and preprocessing scripts using GloVe and fastText.
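
To give a feel for what get_best_sense and get_other_senses return, here is a minimal pure-Python sketch over a toy frequency table. It is not the library's implementation: the `FREQS` table, its counts, and the helper names `split_key`, `best_sense` and `other_senses` are made up for illustration. It only assumes that keys follow the `phrase|SENSE` format and that the best sense is the one with the highest frequency count.

```python
from typing import Dict, List, Optional, Tuple

# Toy frequency table keyed by "phrase|SENSE". The counts are invented.
FREQS: Dict[str, int] = {
    "duck|NOUN": 120,
    "duck|VERB": 45,
    "natural_language_processing|NOUN": 300,
}


def split_key(key: str) -> Tuple[str, str]:
    """Split "duck|NOUN" into ("duck", "NOUN")."""
    word, sense = key.rsplit("|", 1)
    return word, sense


def best_sense(word: str, freqs: Dict[str, int]) -> Optional[str]:
    """Return the most frequent key for `word`, or None if there is no entry."""
    candidates = [key for key in freqs if split_key(key)[0] == word]
    if not candidates:
        return None
    return max(candidates, key=lambda key: freqs[key])


def other_senses(key: str, freqs: Dict[str, int]) -> List[str]:
    """Return the other keys that share the word of `key` but differ in sense."""
    word, _ = split_key(key)
    return [k for k in freqs if k != key and split_key(k)[0] == word]


print(best_sense("duck", FREQS))         # duck|NOUN
print(other_senses("duck|NOUN", FREQS))  # ['duck|VERB']
```

With real vectors loaded, the equivalent calls would be made on a Sense2Vec instance; see the API docs in the README for the exact arguments.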

⚠️ Backwards incompatibilities

  • The sense2vec.load function has been removed. Use Sense2Vec.from_disk instead.
  • The previous VectorMap and VectorStorage have been removed.
  • This package now requires Python 3.6+.
  • This update requires a new vectors format (see attached files).
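
For code that previously called sense2vec.load, the migration could look roughly like the sketch below. The helper name `load_sense2vec` is hypothetical and the path in the usage note is a placeholder. The sketch assumes the directory contains vectors in the new format.

```python
from pathlib import Path
from typing import Union


def load_sense2vec(path: Union[str, Path]):
    """Load vectors in the v1.0.0 format from a directory.

    Hypothetical helper standing in for the removed sense2vec.load.
    """
    # Imported inside the function so the helper can be defined even
    # when the package is not installed in the current environment.
    from sense2vec import Sense2Vec

    # Create an empty Sense2Vec object and fill it from disk.
    return Sense2Vec().from_disk(str(path))
```

Usage would then be `s2v = load_sense2vec("/path/to/vectors")`, after which lookups use keys such as `"duck|NOUN"`.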

📖 Documentation and examples

  • Rewrite README from scratch and include full API docs.

👥 Contributors

Thanks to @kabirkhan for contributing the initial Prodigy recipes!