Conformance tests with JavaScript reference implementation #60
Comments
I tried this in the past, but it turned out to be very difficult to generate exactly what the JS implementation produces. You really need to reimplement all the quirks of that implementation, some of which don't seem to make much sense. There are also places where we intentionally generate different MathML. Instead, I've made a manual approximation of it in https://github.com/asciidoctor/asciimath/blob/master/spec/parser_spec.rb
I've been looking into AsciiMath recently. Specifically, I was looking for a parser for .NET, so I threw together a basic port of the AsciiDoctor code. I also tried running it against the asciimath.js unit tests from here: https://github.com/asciimath/asciimathml/blob/master/test/unittests.js

About 50% of the tests pass and 50% fail. Trouble is, like you point out, it's quite a lot of effort to work out which of the tests are expected/desirable and which aren't 😅

I've done some basic tests, and I think you can split them into a few broad categories. Not suggesting these need to be addressed, just documenting here for posterity 🙂

Grouping differences
There are many cases where AsciiDoctor adds extra
However there's also
whereas the reference code renders as
Whether that's strictly a bug or a feature is likely more nuanced 😄

Issues with negative numbers
One good example is
whereas the reference code renders as
I did wonder whether the intention was for
asciimath/lib/asciimath/parser.rb, lines 91 to 92 in efb2f57

Cases where the reference is lenient to spelling mistakes
There are some cases where the reference code "knows" additional symbols as synonyms for others.
These are the most clear-cut "will not fix" candidates.

Differences in symbol rendering
Some symbols are rendered differently. Some of these differences seem reasonable, but for others I wonder which is more correct 🤔
The difference in

Differences in "invalid" rendering
There are some differences in how certain "incorrect" syntaxes are rendered. For example, the output is <math><mn>2</mn></math> whereas the reference produces: <math><msup><mn>2</mn><mo></mo></msup></math>
This doesn't seem like a big deal, seeing as the syntax is "invalid" anyway.
In general, the issue is that there isn't really a formal grammar for AsciiMath. For this implementation I relied on the partial BNF grammar. If you follow those rules, then afaict this parser parses
and that parses as
As for the other issues, most of them are clear bugs or improvements. It would be useful to split them up into distinct issues so the changes are a bit easier to track.
I am thinking of adding some conformance checks to ensure that converting a given formula with the AsciiMath gem gives the same result as converting it with AsciiMath's original JavaScript implementation. Unless the difference has been introduced on purpose, of course…
The idea is to prepare a list of example AsciiMath formulas, the longer the better, then convert every single one of them with both implementations and compare the results. The whole process could be written as follows:
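A minimal Ruby sketch of such a comparison, under some assumptions: both converters are passed in as callables, so nothing here depends on a particular gem or JavaScript runner. The names `normalize_mathml`, `conformance_mismatches` and `known_differences` are hypothetical and are not part of either implementation.

```ruby
# Hypothetical harness: compares two AsciiMath-to-MathML converters.
# ruby_converter and js_converter are any objects responding to #call
# that take a formula string and return a MathML string.

# Collapse whitespace between tags so pure formatting differences are ignored.
def normalize_mathml(mathml)
  mathml.gsub(/>\s+</, '><').strip
end

# Returns one hash per formula whose outputs differ after normalization.
# known_differences lists formulas where the divergence is intentional,
# so they are skipped instead of being reported.
def conformance_mismatches(formulas, ruby_converter, js_converter, known_differences: [])
  checked = formulas.reject { |formula| known_differences.include?(formula) }
  checked.map do |formula|
    actual = normalize_mathml(ruby_converter.call(formula))
    expected = normalize_mathml(js_converter.call(formula))
    next if actual == expected

    { formula: formula, expected: expected, actual: actual }
  end.compact
end
```

In a real spec, the Ruby side would presumably wrap the gem and the JavaScript side would call the reference script through whichever runner is available. A spec could then assert that the returned list is empty and print the mismatching formulas otherwise.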
This kind of test requires a JavaScript runner in the development environment (nodejs or mini_racer).
This is easy to implement and I can help with that. Hopefully, these tests will help to detect bugs like #58 early.