cabal-docspec: Fails to detect examples with non-standard unicode characters present #131

Open
amesgen opened this issue Jun 13, 2023 · 6 comments


amesgen commented Jun 13, 2023

E.g. this example is not detected:

-- >>> length "猫"
-- 1

This is due to haskell-lexer not supporting this:

 Λ L.lexerPass0 "-- >>> length \"猫\"\n"
[(Commentstart,(Pos {char = 0, line = 1, column = 1},"--")),(ErrorToken,(Pos {char = 2, line = 1, column = 3}," >>> length \"")),(TheRest,(Pos {char = 15, line = 1, column = 16},"\29483\"\n"))]

This could be fixed in haskell-lexer (but maybe this actually is accurate behavior for Haskell 98?), but as stubbornPass0 already works around another haskell-lexer deficiency, it might be acceptable to do so again in this case? See #132 for that.
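To illustrate the general idea of sidestepping the lexer for comment lines, here is a minimal sketch. It is not the code from #132 and not part of cabal-docspec: `findExamples` is a hypothetical helper that scans the raw source text line by line for `-- >>>` prefixes, so non-ASCII characters in the expression never reach haskell-lexer. It assumes examples only appear in single-line `--` comments and would also match such a prefix inside a string literal or a block comment.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, stripPrefix)
import Data.Maybe (mapMaybe)

-- | Hypothetical helper (not part of cabal-docspec): return the 1-based line
-- number and the expression text of every line of the form @-- >>> expr@.
findExamples :: String -> [(Int, String)]
findExamples src = mapMaybe example (zip [1 ..] (lines src))
  where
    example (n, l) = do
      -- Skip indentation, then require the line-comment marker.
      afterDashes <- stripPrefix "--" (dropWhile isSpace l)
      -- Skip spaces after the marker, then require the example prompt.
      expr <- stripPrefix ">>>" (dropWhile isSpace afterDashes)
      pure (n, trim expr)
    -- Remove whitespace on both ends of the expression.
    trim = dropWhileEnd isSpace . dropWhile isSpace
```

Applied to the two-line example above, `findExamples` would return only the first line, with the expression `length "猫"` intact; the expected-output line `-- 1` is skipped because it carries no `>>>` prompt.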

Minimal reproducible example

https://github.com/amesgen/stuff/tree/cabal-docspec-unicode

 $ git clone https://github.com/amesgen/stuff --single-branch --branch cabal-docspec-unicode
 $ cd stuff && cabal build
 $ cabal-docspec
Total:         0; Tried:    0; Skipped:    0; Success:    0; Errors:    0; Failures    0
Examples:      0; Tried:    0; Skipped:    0; Success:    0; Errors:    0; Failures    0
@amesgen amesgen changed the title Fails to detect examples with non-standard unicode characters present cabal-docspec: Fails to detect examples with non-standard unicode characters present Jun 13, 2023
@LeventErkok

@amesgen Just ran into this very same issue.

Are there any workarounds, or fixes planned?


amesgen commented Jan 17, 2025

The workaround in #132 (just rebased) still works fine for me, but it is rather ad-hoc.

@LeventErkok

Thanks @amesgen

This doesn't work for me, unfortunately. My code looks like:

data SStroke

-- | The sheffer stroke operator
(︱) :: SStroke -> SStroke -> SStroke
(︱) = uninterpret ""
infixl 7 ︱

-- | Negation in terms of ǀ
ﬧ :: SStroke -> SStroke
ﬧ x = x ︱ x

It appears the definition for ﬧ is throwing things off.

Do you think this is a separate issue?


amesgen commented Jan 17, 2025

Yeah, that is in the code (and not just in a comment), so the workaround in #132 indeed doesn't apply. It's still the same general issue with haskell-lexer though. Maybe it can be worked around in a similar way to #132, but at this point it is probably more appealing to properly fix haskell-lexer.

@LeventErkok

@amesgen I filed an issue against haskell-lexer; looks like Iavor is maintaining it and hopefully he can get a fix put in.

Do you have a patch you can submit against it? Or commenting on the ticket regarding the details might help Iavor get to the issue quicker. Thanks!

@LeventErkok

@amesgen

Looks like Iavor is making progress. But there's another issue: shouldn't doctest complain when lexing fails? Right now it appears to simply ignore the files (or portions of them?) that it cannot lex.

Should we file another issue so that these failures cause an error? Otherwise, a user might think things are getting tested when they are not run at all because some lexing/parsing issue kicked in.
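A rough sketch of what "erroring out" could look like, assuming the tool has the token list available before it extracts examples. Everything here is hypothetical: `checkLexed` is not an existing cabal-docspec function, and the `Token` and `Pos` types are local stand-ins that only copy the constructor and field names visible in the lexer output quoted earlier in this thread (`Other` is a placeholder for all remaining token kinds). A real check would use the types from haskell-lexer instead.

```haskell
-- Stand-in types mirroring names seen in the quoted lexer output;
-- they are assumptions for illustration, not the library's definitions.
data Token = Commentstart | ErrorToken | TheRest | Other
  deriving (Eq, Show)

data Pos = Pos { char :: Int, line :: Int, column :: Int }
  deriving (Eq, Show)

type PosToken = (Token, (Pos, String))

-- | Hypothetical check: Left with a message locating the first token the
-- lexer could not handle, Right with the unchanged tokens otherwise.
checkLexed :: FilePath -> [PosToken] -> Either String [PosToken]
checkLexed path toks =
  case [pt | pt@(t, _) <- toks, t `elem` [ErrorToken, TheRest]] of
    [] -> Right toks
    (_, (p, txt)) : _ ->
      Left
        ( path ++ ":" ++ show (line p) ++ ":" ++ show (column p)
            ++ ": lexing failed near " ++ show (take 20 txt)
        )
```

With a check of this shape, a caller could turn a `Left` into a non-zero exit code, or at least a warning, so that a file with zero detected examples is distinguishable from a file that could not be lexed.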
