Type erased feature vector and feature vector array classes #210
Merged
apis/python/test/array_paths.py
import os
# TODO Use python Pathlib
# m1_root = "/Users/lums/TileDB/TileDB-Vector-Search/external/data/gp3/"
Remove.
Same below for `vector_search_root`.
…ector-Search into lums/tmp/type-erased
ihnorton
approved these changes
Jan 24, 2024
This was referenced Feb 12, 2024
Type Erased FeatureVector and FeatureVectorArray
This is a recap of the feature vector and feature vector array type-erasure component of #154.
This PR adds the type-erased classes `FeatureVector` and `FeatureVectorArray`. Since one of the goals of having the type-erased classes is to integrate seamlessly with Python, the Python bindings for these classes were also added. `num_vectors()` replaced `size()` in numerous places.

Files Added (Copied from #173)
Files Copied Over Previous
Files Modified
In addition, the following had small modifications (mostly to change `size()` to `num_vectors()`):
Python Bindings
Added (Copied from #173)
Modified
NOTE: The type-erased Python binding files that were copied over included code for index classes. This code has been temporarily commented out (usually with #ifdef 0), pending the next PR.
Arrays
Overview of Type Erasure
(See also the README.md in include/api).
Type erasure is accomplished as a three-layer cake: a non-templated abstract base class, a templated implementation class derived from it, and an outer (type-erased) class that holds a `std::unique_ptr` to the abstract base class as a member variable. During construction (either by passing in already constructed vectors or by reading the index from a TileDB group), the appropriate template types for the data stored by the internal implementation are inferred, and an object of the implementation class is constructed and stored in the `std::unique_ptr`.

To illustrate the basic idea, consider `FeatureVector`. In abbreviated form, where we just show a single function, `data`, it looks like this:

When that constructor is invoked, it first reads the schema associated with the `uri` and creates an implementation object based on that type. For example, if the type read from the schema (`feature_type`) is one of `float` or `uint8`, the constructor dispatches like this:

At this point, we have created a `std::unique_ptr` to the abstract base class that points to an object of the derived class.

If we invoke the `data` member function of the outer (type-erased) `FeatureVector` class, we dispatch to the corresponding member of the object stored in the `std::unique_ptr`:

Since `feature_vector_` actually points to the derived implementation class, its `data` member function is then invoked:

We return a `void*` since `data()` overrides a virtual function of the non-templated `Base` class, which cannot name the element type.

(TODO: In a future PR maybe we can cast to an appropriate type extracted from the type of `vector_`?)

(TODO: Is there a way to condense the boilerplate that is currently contained in all of these?)
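The four steps described above (the outer class, the constructor dispatch, forwarding through the `std::unique_ptr`, and the override in the implementation class) can be sketched in one self-contained example. This is a minimal illustration under stated assumptions, not the code from this PR: `vector_base`, `vector_impl`, and `dimension()` are hypothetical names, the type strings `"float32"` and `"uint8"` are illustrative, and the constructor takes `feature_type` directly instead of reading it from the schema at a `uri`.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Outer, non-templated (type-erased) class: the only type callers see.
class FeatureVector {
  // Layer 1: non-templated abstract base class (hypothetical name).
  struct vector_base {
    virtual ~vector_base() = default;
    virtual void* data() = 0;
    virtual std::size_t dimension() const = 0;
  };

  // Layer 2: templated implementation class that owns the concrete vector.
  template <class T>
  struct vector_impl : vector_base {
    explicit vector_impl(T&& v) : vector_(std::move(v)) {}
    // The element type is erased here: T::value_type* converts to void*.
    void* data() override { return vector_.data(); }
    std::size_t dimension() const override { return vector_.size(); }
    T vector_;
  };

  // Layer 3 state: pointer to the base, actually pointing at a vector_impl<T>.
  std::unique_ptr<vector_base> feature_vector_;

 public:
  // Assumption: feature_type is passed in here. The class described above
  // instead reads it from the schema associated with a uri.
  FeatureVector(const std::string& feature_type, std::size_t n) {
    if (feature_type == "float32") {
      feature_vector_ = std::make_unique<vector_impl<std::vector<float>>>(
          std::vector<float>(n));
    } else if (feature_type == "uint8") {
      feature_vector_ = std::make_unique<vector_impl<std::vector<std::uint8_t>>>(
          std::vector<std::uint8_t>(n));
    } else {
      throw std::runtime_error("Unsupported feature type: " + feature_type);
    }
  }

  // Forward to whichever implementation object the pointer holds.
  void* data() { return feature_vector_->data(); }
  std::size_t dimension() const { return feature_vector_->dimension(); }
};
```

With this shape, the caller is responsible for casting the returned `void*` back to the element type it expects, for example `static_cast<float*>(v.data())` for a vector constructed with `"float32"`. That is the cost of keeping the base class non-templated, and it is what the first TODO above proposes to revisit.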