-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve parsing (almost) empty files #156
Improve parsing (almost) empty files #156
Conversation
So the thing that you observed is easy to fix, but while fixing this I found another issue that is not easy to fix😅 Files that have only 1 character, e.g. a line feed ( Not sure what a good way to fix this is. Maybe check the file's content and fall back to plain text if the length is 1 byte and the character is printable... |
I implemented manual checking now:
|
Tested and works beautifully now 👍 Implementation looks quite painful and what an unexpected problem once again 😄 Do you think it's good & worth it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd approve based on testing this
Yeah haha. I do think it's good & worth it. It was definitely a bug, no way to spin that around haha, so it was necessary to fix. The implementation should be very robust. 0-byte files will always be empty and 1-byte files with a printable character can't be anything "useful" that serves some purpose that breaks when we interpret them as text files. Performance-wise it's also fine: For the majority of cases (files bigger than 1 byte) it's the same as before (the child process and getting file stats are both outside processes executed in parallel) and for 1-byte files the complexity of reading 1 byte from a file is added which is negligible. |
Ok, sounds good! |
Close #155