-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Schema::open with simple quickstart_dense test #5
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a number of comments, the important bits are around lifetime stuff. I know its fairly confusing to intuit some of our APIs given that you've done Domain stuff where the returned pointers are "borrows" that require you to copy stuff out. In these cases where you're holding pointers to internal bits, I'd argue that we should avoid that at all costs as those internal handle classes are designed specifically for allowing wrappers to avoid a lot of complicated lifetime issues.
Or in other words, for almost all of the tiledb_$thing_t
types, they're meant to be used in an RAII style class. I.e., the pointer gets set in a "constructor" and then freed in the "destructor" where I'm using quotes because there are different names for those in various languages.
An example where we're doing it almost correctly is here in the filter list:
https://github.com/TileDB-Inc/tiledb-rs/blob/main/src/filter_list.rs#L69-L90
Almost because that's till using the Borrowed pattern which is a memory leak in these cases. But that approach to just returning an instance of the thing we're getting should be the preference unless there's a very compelling reason why that could be wrong (only thing that comes to mind would be caching expensive calls, but even then, I'd probably suggest that cache be upstream in user code rather than something we do internally. Also, expensive is relative there, as I'd expect performance of these calls to be in the 10-100 kHz range).
src/array/schema.rs
Outdated
let rows = domain.dimension(0).expect("Error reading rows dimension"); | ||
assert_eq!(Datatype::Int32, rows.datatype()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to illustrate the lifetimes, if you were to put this in a loop so that you called this N times to create the dimension and then drop it, internally it'd be just fine. Shoving the returned dimensions into a vector or similar would also be fine. You'd have N distinct allocated handle objects that all proxy through to the underlying internal shared_ptr<tiledb::sm::Dimension>
.
I'm pleased to report that removing the interior secondary state is perfectly fine and I have now done that. Changes are addressed. |
907a760
to
e56a3cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
e56a3cd
to
7548274
Compare
This pull request adds an equivalent of the
tiledb_array_schema_load
function which opens an array schema from a URI.Prior work with array schema creation put the Domain in as a field in the
ArraySchema
, so most of the implementation here is actually about that.ArraySchema::open
callsDomain::load
which fetches into Rust the information needed from the C API."Prior work" was the focus on the
quickstart_dense
example which creates, rather than reads, an array. In that example we had to make sure that the Domain lifetime was at least as long as the Schema lifetime, and then Domain has a vector of Dimension for the same reason.Since the tests for this naturally want to ask about the dimensions, there are also changes to the
Domain::dimension
function, specifically to return a reference to an existing one rather than create a new one. This uses interior mutability to lazily manifest the dimensions from the C API.