Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Ceramic metadata field to use $comment field #104

Merged
merged 5 commits into from
Sep 7, 2021

Conversation

PaulLeCam
Copy link

  • Uses $comment instead of $ceramic field in JSON schemas
  • Replaces mentions of document/DocID by stream/StreamID
  • Uses tags metadata for deterministic slice access for append-collection

@PaulLeCam PaulLeCam requested review from stbrody and oed June 14, 2021 15:12
Copy link
Contributor

@stbrody stbrody left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main comments are around thinking about how we use and think about the metadata

type: 'tile',
schema: '<Note schema docID>',
},
$comment: 'ceramic:tile:<Note schema streamID>',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A part of me wonders if we'd be better off using the streamtype code rather than name (so 0 for tile, 1 for caip10link, etc). I'm worried that once we have a large number of stream types that all represent json documents just with different log structures, coming up with meaningful human readable names for them all is going to get tricky...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree there might be a better namespace than tile but I'm not sure stream types are so relevant here, because the purpose is to interact with other JSON documents anyways?
At this level I don't think it matters much if the JSON doc is backed by a tile, CRDTs or any other stream type, but rather than it can be resolved by the Ceramic client, no?

Copy link
Contributor

@stbrody stbrody Jun 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah right, good point. In that case, is there any reason to include the streamtype name at all? Could it just be ceramic:<streamID>? The StreamID already has the streamtype embedded within it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tile here was meant as a namespace, as we have ceramic:appendCollection:<slice schema ID> and ceramic:collectionSlice that have different meaning.
Basically ceramic:tile should just mean "this string points to a Ceramic JSON document" and ceramic:tile:<schema ID> means a reference to a JSON doc having a specific schema.
Maybe we can just use ceramic:doc or ceramic:ref instead if it is more explicit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's gained by this namespacing? Is it just for human readability? Or does the user or code many any decisions based on the namespace?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I replaced tile by doc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like that we invent a new format of encoding streamids here.

I propose that we use the standard url for streamids with an additional query param:

  • ceramic://<streamid>?json-schema=doc
  • ceramic://<streamid>?json-schema=appendCollection

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These specs are about metadata in the schemas themselves, not about the contents.
In ceramic:doc:<id> the id is the stream ID of the schema the referenced doc should use, not the contents doc.

These strings don't even necessarily represent stream IDs, for example ceramic:appendCollection is just a way for the tools to detect a schema represents an AppendCollection.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that, but I don't see why we need to introduce another representation. As far as I can tell we always include the StreamID. Any problem with using the formatting I suggested above so that we don't introduce a new representation?

Copy link
Author

@PaulLeCam PaulLeCam Jul 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are different semantics that are specific to the type rather than stream IDs, for example ceramic:collectionSlice doesn't include a stream ID, ceramic:appendCollection:<slice schema ID> only allows a single stream ID, and ceramic:doc can use multiple stream IDs (ceramic:doc:<schema ID 1>|<schema ID 2> for example).

A collection slice should contain a maximum of 256 items, so the index of an item in the slice can be represented in a single byte.

#### Slice tag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I wonder if it makes sense to use tags in this way, or if we should wait for ceramicnetwork/js-ceramic#1342 and use a more semantically meaningful object structure in the metadata for this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we should use whatever makes most sense for Ceramic, I wasn't aware there would be a new field in the metadata for this.
Might be good to clarify the relations and differences between family, tags and this new field i guess, as I thought family was the field meant for indexing (at least in the context of IDX).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the idea is for both family and tags to be used by indexers

Copy link
Contributor

@stbrody stbrody Jun 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my vote would be to do this with the new metadata object from ceramicnetwork/js-ceramic#1342 so the field names could have more semantic meaning, but I don't feel super strongly and could be convinced it makes sense to do this with tags

Copy link
Member

@oed oed Jul 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

family is the primary key used for indexing. I agree with @stbrody's point unless we actually want the indexer to more easily find it using the tags.

Co-authored-by: Spencer T Brody <spencer.t.brody+github@gmail.com>
@stbrody
Copy link
Contributor

stbrody commented Jun 15, 2021

I've laid out my thoughts as best as I can, but there are still 2 open comment threads which need some final decisions made. I just made a last comment on each thread summarizing my current thinking. Now I think we need @oed to weigh in.

@PaulLeCam
Copy link
Author

@oed @stbrody do you think we could merge this as-is for now please? We can always update the specs based on the implementation changes.

@oed
Copy link
Member

oed commented Jul 1, 2021

@PaulLeCam made a proposal for how to use our existing url encoding for streamids

Copy link
Member

@oed oed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not convinced inventing a new namespace for our json-schema usage is needed. Provided some more comments in this review.

I think we should represent CIP-82 using ceramic://<streamid> for single schemas and ceramic://<streamid-1>,ceramic://<streamid-2> for multiple schemas. If we need additional metadata like that this schema represents an appendCollection we can add a query: ceramic://<streamid>?json-schema=appendCollection

}
```

## Rationale

This CIP uses the `$ceramic` namespace defined in [CIP-88](https://github.com/ceramicnetwork/CIP/blob/main/CIPs/CIP-88/CIP-88.md).
This CIP uses the `$comment` field as defined in [CIP-88](https://github.com/ceramicnetwork/CIP/blob/main/CIPs/CIP-88/CIP-88.md).

This spec allows to either define a single schema (using a string) or multiple ones (array of strings).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array of strings seems incorrect here.


## Motivation

Storing a list of items in a single Ceramic document can make the document grow large in size, possibly exceeding Ceramic's internal limits.
Storing a list of items in a single Ceramic stream can make the stream grow large in size, possibly exceeding Ceramic's internal limits.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true for Tiles today, but it not really true for future stream implementations. I would add wording about that this is a specific problem for Tiles.

- `sliceIndex`: index of the slice in the collection, between `0` and the collection's `slicesCount` minus `1`
- `contents`: array with a `maxItems` value matching the `AppendCollection` `sliceMaxItems` property and defining the items schemas, that must include `{ type: 'null' }` in order to support removals

```js
{
$schema: 'http://json-schema.org/draft-07/schema#',
$ceramic: { type: 'collectionSlice' },
$comment: 'ceramic:collectionSlice',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this property needed here? I don't see any rationale for it. The fact that the schema is referenced from the append collection should be enough context for us to know that it's a collection slice.

@oed
Copy link
Member

oed commented Aug 11, 2021

What's the status of this effort @PaulLeCam? I think an update is needed for #87

@PaulLeCam
Copy link
Author

IIRC it should be good to merge, the main change that could be required is replacing the use of tags for AppendCollection but anyways the AppendCollection CIP will likely need other updates as we decide on an implementation.

Copy link
Member

@oed oed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@PaulLeCam PaulLeCam merged commit bb5ea0b into main Sep 7, 2021
@PaulLeCam PaulLeCam deleted the update-schema-meta-cips branch September 7, 2021 09:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants