Replies: 7 comments
-
did you make any progress on this @zswaff? we're wanting to do the same.
-
@jonathon-love yes, we have a solution. I'll post about it here in a few hours.
-
So the solution we have at Dart is that we forked ypy and added the functionality there that is required to get/set the right types in the lexical-produced In short, if you replace your dependency for

from humps import camelize, decamelize
from y_py import YDoc, YXmlElement, YXmlText

NO_REMAP_YXML_KEYS = {
    "children",
    "text",
}
NO_CONTENT_NODES = {
    "linebreak",
}
DECORATOR_NODES = {
    "horizontal-rule",
}

def _encode_yxml_key(k: str) -> str:
    if k in NO_REMAP_YXML_KEYS:
        return k
    if k == "direction":
        return "__dir"
    return f"__{camelize(k)}"

def _decode_yxml_key(k: str) -> str:
    if k in NO_REMAP_YXML_KEYS:
        return k
    if k == "__dir":
        return "direction"
    return decamelize(k[2:])

def from_lexical_to_ydoc(obj: dict, doc: YDoc) -> None:
    yroot: YXmlElement = doc.get_xml_element("root")
    with doc.begin_transaction() as txn:
        q: list[tuple[YXmlElement | YXmlText, dict]] = [(yroot, obj["root"])]
        while q:
            curr_yobj, curr_obj = q.pop(0)
            if curr_obj["type"] == "listitem":
                curr_obj["indent"] = 0
            for key, value in curr_obj.items():
                if key == "children" and isinstance(value, list):
                    for child in value:
                        if "text" in child and isinstance(curr_yobj, YXmlText):
                            text = child["text"]
                            del child["text"]
                            curr_yobj.push_attributes(txn, {_encode_yxml_key(k): v for k, v in child.items()})
                            curr_yobj.push(txn, text)
                            continue
                        child_kind = child["type"]
                        if child_kind == "linebreak" and isinstance(curr_yobj, YXmlText):
                            curr_yobj.push_attributes(txn, {_encode_yxml_key(k): v for k, v in child.items()})
                            continue
                        if child_kind in DECORATOR_NODES:
                            ychild = curr_yobj.push_xml_element(txn, child_kind)
                        else:
                            ychild = curr_yobj.push_xml_text(txn)
                        q.append((ychild, child))
                else:
                    curr_yobj.set_attribute(txn, _encode_yxml_key(key), value)

def from_ydoc_to_lexical(doc: YDoc) -> dict:
    yroot = doc.get_xml_fragment("root")
    root = yroot.to_dict()
    q: list = [root]
    while q:
        obj = q.pop()
        if isinstance(obj, dict):
            keys = list(obj.keys())
            for key in keys:
                val = obj.pop(key)
                if isinstance(val, float) and val.is_integer():
                    val = int(val)
                q.append(val)
                obj[_decode_yxml_key(key)] = val
        if isinstance(obj, list):
            q += obj
    return {"root": root}

Let me know if you have any problems, or suggestions for how we can improve.
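For readers who want to try the key mapping and the number clean-up without installing the fork, here is a minimal sketch that runs on plain dicts. It makes several assumptions: `camelize` and `decamelize` below are simplified stand-ins for the `humps` functions (simple snake_case/camelCase keys only), `normalize` is a hypothetical recursive variant of the in-place queue walk in `from_ydoc_to_lexical`, and the `raw` dict is an invented sample of what a `to_dict()` result might look like, not real output from the fork.

```python
import re

NO_REMAP_YXML_KEYS = {"children", "text"}


def camelize(s: str) -> str:
    # Stand-in for humps.camelize (assumption): snake_case -> camelCase.
    return re.sub(r"_([a-z0-9])", lambda m: m.group(1).upper(), s)


def decamelize(s: str) -> str:
    # Stand-in for humps.decamelize (assumption): camelCase -> snake_case.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", lambda m: "_" + m.group(1).lower(), s)


def encode_key(k: str) -> str:
    # Same rules as _encode_yxml_key: structural keys pass through,
    # "direction" is shortened to "__dir", everything else gets a "__" prefix.
    if k in NO_REMAP_YXML_KEYS:
        return k
    if k == "direction":
        return "__dir"
    return f"__{camelize(k)}"


def decode_key(k: str) -> str:
    # Inverse of encode_key, mirroring _decode_yxml_key.
    if k in NO_REMAP_YXML_KEYS:
        return k
    if k == "__dir":
        return "direction"
    return decamelize(k[2:])


def normalize(tree):
    # Hypothetical recursive variant: decode every dict key and turn
    # whole-number floats into ints, returning a new structure.
    if isinstance(tree, dict):
        return {decode_key(k): normalize(v) for k, v in tree.items()}
    if isinstance(tree, list):
        return [normalize(v) for v in tree]
    if isinstance(tree, float) and tree.is_integer():
        return int(tree)
    return tree


# Invented sample input for illustration only.
raw = {
    "__type": "list",
    "__listType": "number",
    "__start": 1.0,
    "__dir": "ltr",
    "children": [{"__type": "text", "__format": 0.0, "text": "a"}],
}
print(normalize(raw))
```

The attribute names in the sample (`__type`, `__listType`, `__start`, `__dir`, `__format`) are the ones visible in the encoded update in the original post; the values are made up.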
-
hahaha, this is fantastic! we defs owe you a beer!
-
hi @zswaff, i've been adapting
i expect you guys have adapted y-py to your needs, and aren't really interested in maintaining, or merging additional features ... but i thought i'd ask, just in case you would be interested in reviewing/merging some stuff? with thanks
-
Hey @jonathon-love, nice to hear from you. Yeah, interesting situation. One way or another I still very much dream of combining these repos and not maintaining these forks long term. How much work do you anticipate doing based on this fork? Are the changes to ypy or y-crdt? In the short term, happy to help how we can. If you open PR(s) against the fork(s) we will do our best to take a look when we can, and we can play it by ear a bit. Feel free to open stuff.
-
hey, i don't anticipate us doing too much work on the fork, but i've made a handful of mods to ypy. righto, i'll open a PR and we'll see what you guys think ... thanks! cheers
-
Background
Hey folks! I'm adding some context to my interest in #117 for the benefit of @stefanw and anyone else.
At Dart we use lexical for text editing. Lexical comes with YJS bindings, so we're working on collaboration with YJS. Our backend is in Python and we need to manipulate the YDocs on the backend, so our goal is to be able to parse and create YDocs following the lexical format. The lexical code is famously a bit dense and not so well-documented yet, but the operating code on their end is here.
Particulars
Example
My toy example is to create a lexical doc with content corresponding to approximately
This same content in the lexical JSON format is included, minimized, here
Lexical data
and those same data can be encoded as a YJS update with the string
b'\x01a\x9c\xb5\xe4\xcf\x0e\x00(\x01\x04root\x05__dir\x01w\x03ltr\x07\x01\x04root\x06(\x00\x9c\xb5\xe4\xcf\x0e\x01\x06__type\x01w\tparagraph(\x00\x9c\xb5\xe4\xcf\x0e\x01\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e\x01\x08__indent\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e\x01\x05__dir\x01w\x03ltr\x07\x00\x9c\xb5\xe4\xcf\x0e\x01\x01(\x00\x9c\xb5\xe4\xcf\x0e\x06\x06__type\x01w\x04text(\x00\x9c\xb5\xe4\xcf\x0e\x06\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e\x06\x07__style\x01w\x00(\x00\x9c\xb5\xe4\xcf\x0e\x06\x06__mode\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e\x06\x08__detail\x01}\x00\x84\x9c\xb5\xe4\xcf\x0e\x06\x01a\x87\x9c\xb5\xe4\xcf\x0e\x01\x06(\x00\x9c\xb5\xe4\xcf\x0e\r\x06__type\x01w\x04list(\x00\x9c\xb5\xe4\xcf\x0e\r\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e\r\x08__indent\x01}\x00!\x00\x9c\xb5\xe4\xcf\x0e\r\x05__dir\x01(\x00\x9c\xb5\xe4\xcf\x0e\r\n__listType\x01w\x06number(\x00\x9c\xb5\xe4\xcf\x0e\r\x05__tag\x01w\x02ol(\x00\x9c\xb5\xe4\xcf\x0e\r\x07__start\x01}\x01\x07\x00\x9c\xb5\xe4\xcf\x0e\r\x06(\x00\x9c\xb5\xe4\xcf\x0e\x15\x06__type\x01w\x08listitem(\x00\x9c\xb5\xe4\xcf\x0e\x15\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e\x15\x08__indent\x01}\x00!\x00\x9c\xb5\xe4\xcf\x0e\x15\x05__dir\x01(\x00\x9c\xb5\xe4\xcf\x0e\x15\x07__value\x01}\x01\x01\x00\x9c\xb5\xe4\xcf\x0e\x15\x01\x00\x05\x81\x9c\xb5\xe4\xcf\x0e\x1b\x01\x84\x9c\xb5\xe4\xcf\x0e\x0c\x01 
\x87\x9c\xb5\xe4\xcf\x0e"\x01(\x00\x9c\xb5\xe4\xcf\x0e#\x06__type\x01w\x04text(\x00\x9c\xb5\xe4\xcf\x0e#\x08__format\x01}\x01(\x00\x9c\xb5\xe4\xcf\x0e#\x07__style\x01w\x00(\x00\x9c\xb5\xe4\xcf\x0e#\x06__mode\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e#\x08__detail\x01}\x00\x84\x9c\xb5\xe4\xcf\x0e#\x01b\xa1\x9c\xb5\xe4\xcf\x0e\x11\x01\xa1\x9c\xb5\xe4\xcf\x0e\x19\x01\xa8\x9c\xb5\xe4\xcf\x0e*\x01w\x03ltr\xa8\x9c\xb5\xe4\xcf\x0e+\x01w\x03ltr\x87\x9c\xb5\xe4\xcf\x0e!\x01(\x00\x9c\xb5\xe4\xcf\x0e.\x06__type\x01w\x04text(\x00\x9c\xb5\xe4\xcf\x0e.\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e.\x07__style\x01w\x00(\x00\x9c\xb5\xe4\xcf\x0e.\x06__mode\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e.\x08__detail\x01}\x00\x84\x9c\xb5\xe4\xcf\x0e.\x01c\x81\x9c\xb5\xe4\xcf\x0e\x15\x01\x00\x05\x87\x9c\xb5\xe4\xcf\x0e5\x06(\x00\x9c\xb5\xe4\xcf\x0e;\x06__type\x01w\x08listitem(\x00\x9c\xb5\xe4\xcf\x0e;\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0e;\x08__indent\x01}\x00!\x00\x9c\xb5\xe4\xcf\x0e;\x05__dir\x01(\x00\x9c\xb5\xe4\xcf\x0e;\x07__value\x01}\x02\x07\x00\x9c\xb5\xe4\xcf\x0e;\x06(\x00\x9c\xb5\xe4\xcf\x0eA\x06__type\x01w\x04list(\x00\x9c\xb5\xe4\xcf\x0eA\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0eA\x08__indent\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0eA\x05__dir\x01w\x03ltr(\x00\x9c\xb5\xe4\xcf\x0eA\n__listType\x01w\x06number(\x00\x9c\xb5\xe4\xcf\x0eA\x05__tag\x01w\x02ol(\x00\x9c\xb5\xe4\xcf\x0eA\x07__start\x01}\x01\x07\x00\x9c\xb5\xe4\xcf\x0eA\x06(\x00\x9c\xb5\xe4\xcf\x0eI\x06__type\x01w\x08listitem(\x00\x9c\xb5\xe4\xcf\x0eI\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0eI\x08__indent\x01}\x00!\x00\x9c\xb5\xe4\xcf\x0eI\x05__dir\x01(\x00\x9c\xb5\xe4\xcf\x0eI\x07__value\x01}\x01\xa8\x9c\xb5\xe4\xcf\x0eM\x01w\x03ltr\x07\x00\x9c\xb5\xe4\xcf\x0eI\x01(\x00\x9c\xb5\xe4\xcf\x0eP\x06__type\x01w\x04text(\x00\x9c\xb5\xe4\xcf\x0eP\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0eP\x07__style\x01w\x00(\x00\x9c\xb5\xe4\xcf\x0eP\x06__mode\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0eP\x08__detail\x01}\x00\x84\x9c\xb5\xe4\xcf\x0eP\x01d\x81\x9c
\xb5\xe4\xcf\x0eI\x01\x00\x05\x81\x9c\xb5\xe4\xcf\x0e;\x01\x00\x05\xa8\x9c\xb5\xe4\xcf\x0e?\x01~\x87\x9c\xb5\xe4\xcf\x0e\r\x06(\x00\x9c\xb5\xe4\xcf\x0ed\x06__type\x01w\tparagraph(\x00\x9c\xb5\xe4\xcf\x0ed\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0ed\x08__indent\x01}\x00!\x00\x9c\xb5\xe4\xcf\x0ed\x05__dir\x01\xa8\x9c\xb5\xe4\xcf\x0eh\x01w\x03ltr\x07\x00\x9c\xb5\xe4\xcf\x0ed\x01(\x00\x9c\xb5\xe4\xcf\x0ej\x06__type\x01w\x04text(\x00\x9c\xb5\xe4\xcf\x0ej\x08__format\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0ej\x07__style\x01w\x00(\x00\x9c\xb5\xe4\xcf\x0ej\x06__mode\x01}\x00(\x00\x9c\xb5\xe4\xcf\x0ej\x08__detail\x01}\x00\x84\x9c\xb5\xe4\xcf\x0ej\x01e\x01\x9c\xb5\xe4\xcf\x0e\t\x11\x01\x19\x01\x1b\x07*\x025\x06?\x01M\x01W\x0ch\x01'
Parsing the example with TypeScript
To parse the update with TypeScript, the code below will recreate the expected lexical data. This code is actually still a bit awkward because the child nodes can be any of Y.XmlFragment | Y.Map<Y.Item> | Y.Text | string. I am not really sure at the moment whether that is lexical weirdness or because YJS is giving me a Map when it should really be an XmlFragment.
Anyway, the generic code to recreate a lexical tree that I have is
Parsing the example with Python
Currently, there is no way that I know of to parse the provided data with Python--that is where my interest in #117 stems from.
The test that I would love to be able to pass, though, would be that this code produces the same result as the TypeScript version does (i.e. the lexical data in the collapsed section above).
although, again, this doesn't work right now.