-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unicode 16.0 #60
Unicode 16.0 #60
Conversation
@Carbon225 There is pyunormalize for unicode 16.0 |
I wrote this code: for cp in tqdm(range(0, 0x10FFFF)):
c = chr(cp)
e = ens_process(f'bitcoin.bitcin.bi̇tcin.bitсin{c}').error
try:
assert e.general_info == 'Contains visually confusing characters from multiple scripts (Latin plus other scripts)'
assert (
e.sequence_info
== 'This character is disallowed because it is visually confusing with another character from the Latin script'
)
print(f'{cp} - OK')
break
except AssertionError:
pass It should find a new example for this test. I will make this test xfail, most likely all codepoints are now included in ens-normalize-js. |
I changed this in the readme:
I did not want people thinking we are using some old reference implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Carbon225 @djstrong Looks good!
I shared 3 comments with feedback. Suggest we action the comment I shared on the README.md
file in this PR. For the other comments, appreciate your advice and good to action those separately in a different PR.
README.md
Outdated
@@ -10,7 +10,7 @@ | |||
* This library is being maintained by the team at [NameHash Labs](https://namehashlabs.org) as part of the greater [NameGuard](https://nameguard.io) solution to help protect the ENS community. | |||
* Passes **100%** of the [official validation tests](https://github.com/adraffy/ens-normalize.js/tree/main/validate) (validated automatically with pytest on Linux, MacOS, and Windows, see below for details). | |||
* Passes an [additional suite of further tests](/tools/updater/update-ens.js#L54) for compatibility with the official [Javascript reference implementation](https://github.com/adraffy/ens-normalize.js) and code testing coverage. | |||
* Based on [JavaScript implementation version 1.9.0](https://github.com/adraffy/ens-normalize.js/tree/4873fbe6393e970e186ab57860cc59cbbb1fa162). | |||
* Original implementation based on [JavaScript implementation version 1.9.0](https://github.com/adraffy/ens-normalize.js/tree/4873fbe6393e970e186ab57860cc59cbbb1fa162). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Original implementation based on [JavaScript implementation version 1.9.0](https://github.com/adraffy/ens-normalize.js/tree/4873fbe6393e970e186ab57860cc59cbbb1fa162). | |
* Supports updates to the ENS Normalize reference implementation for Unicode 16.0. |
Spent some time thinking about this line. I think it's best we just fully remove the old idea / bullet point here. I agree it made since when we first released this library, but we already are making reference to Raffy's JS implementation up above and no benefit for anyone if we go into this detail here.
Alternatively, I think it's now useful to make a note here to identify which Unicode version we've updated this library to support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lightwalker-eth I like the idea of mentioning unicode support. But what do you mean by "supports updates"?
Is it supposed to mean "Up-to-date with ENS Normalize reference implementation as of Unicode version 16.0"?
I'm happy to go with your version if you think it sounds ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Carbon225 Good catch. My suggestion could have been more clear. I like the way you proposed writing it:
"Up-to-date with ENS Normalize reference implementation as of Unicode version 16.0."
Signed-off-by: kwrobel.eth <djstrong@gmail.com>
Validation tests pass.
This test needs to be updated (we need to find a character that is not assigned to a script, 771 is now Hebrew):
There is no pyunormalize for unicode 16.0.
UnicodeWarning: Unicode version mismatch: pyunormalize is using 15.1.0, but the ENS Normalization spec is for 16.0.0 (2024-09-10T20:47:54.200Z)