mediawiki reader: improve strong/emph conformance #10766

Draft: wants to merge 1 commit into main

silby (Contributor) commented Apr 7, 2025

cf. #10761 and #3044.

I made some progress with this today without completely blowing up the existing strong and emph parsers, but weird edge cases remain. For example, consider ''foo''''bar''. Pandoc today gives you Emph [ Str "foo" , Str "bar" ], which has an obvious appeal. My work in progress gives Emph [ Str "foo''" ] , Str "bar''", which is odder but defensible given other requirements for emphasized quote marks. The actual correct answer, according to MediaWiki, is Emph [ Str "foo'" , Strong [ Str "bar" ] ], i.e. an emphasized foo' followed by a strong bar nested inside the emphasis, which is basically a koan.

Parsoid has a lot of code just for processing quotes, presumably aiming to maintain bug-for-bug compatibility with whatever MediaWiki's first parser did. So what a string of single-quotes means varies depending on what comes after it in the line, in a more context-sensitive way than I expected.
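
One piece of that quote handling is local and easy to illustrate. As MediaWiki's legacy quote pass is commonly described, a run of four apostrophes is read as one literal apostrophe followed by bold markup, and a run of more than five as literal apostrophes followed by bold-italic markup. The Haskell below is a minimal sketch of that run-splitting step only. The names QuoteTok, splitRun, and tokenizeQuotes are hypothetical and come from neither pandoc nor Parsoid, and the later balancing pass, which is where the context sensitivity lives, is deliberately left out.

```haskell
module Main where

-- | Tokens produced by the quote pass (hypothetical names, for illustration).
data QuoteTok
  = Text String   -- literal text, including literal apostrophes
  | Italic        -- two apostrophes
  | Bold          -- three apostrophes
  | BoldItalic    -- five apostrophes
  deriving (Eq, Show)

-- | Interpret a run of n consecutive apostrophes.
-- Assumption: four means one literal apostrophe plus bold, and
-- more than five means (n - 5) literal apostrophes plus bold-italic.
splitRun :: Int -> [QuoteTok]
splitRun n
  | n <= 0    = []
  | n == 1    = [Text "'"]
  | n == 2    = [Italic]
  | n == 3    = [Bold]
  | n == 4    = [Text "'", Bold]
  | n == 5    = [BoldItalic]
  | otherwise = [Text (replicate (n - 5) '\''), BoldItalic]

-- | Split a line into literal text and quote markup tokens.
tokenizeQuotes :: String -> [QuoteTok]
tokenizeQuotes [] = []
tokenizeQuotes s@(c : _)
  | c == '\'' =
      let (qs, rest) = span (== '\'') s
      in splitRun (length qs) ++ tokenizeQuotes rest
  | otherwise =
      let (txt, rest) = break (== '\'') s
      in Text txt : tokenizeQuotes rest

main :: IO ()
main = mapM_ print (tokenizeQuotes "''foo''''bar''")
```

For ''foo''''bar'' this prints Italic, Text "foo", Text "'", Bold, Text "bar", Italic, one per line. Read left to right, the emphasis opens, a literal apostrophe follows foo, the strong span opens before bar, and the final two quotes close the emphasis, which is consistent with the Emph [ Str "foo'" , Strong [ Str "bar" ] ] result described above.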

Would it be better to merge code that makes us more conformant with MediaWiki for some cases and "wrong in a different way" for others, or to try to reach perfection?

jgm (Owner) commented Apr 8, 2025

In general I'm not too concerned with divergences in edge cases. Nobody is ever going to write ''foo''''bar'' and intend to get emph "foo'" + strong "bar". Your original case ''foo''', by contrast, seems like something that would come up naturally.

jgm (Owner) commented Apr 8, 2025

Is Parsoid the parser MediaWiki uses? Or is that something else?
