Assuming files are always UTF-8 encoded text #12

Open
apdevelop opened this issue Feb 16, 2023 · 2 comments

Comments

apdevelop commented Feb 16, 2023

The getFile() function of the Owin.Compression module reads file contents with StreamReader.ReadToEndAsync() followed by System.Text.Encoding.UTF8.GetBytes:
https://github.com/Thorium/Owin.Compression/blob/master/src/Owin.Compression/CompressionModule.fs#L156

How will this handle text files in other encodings, or binary files (for example, images and fonts)? The file contents will be corrupted in the response. This applies when using .MapCompressionModule(...) and, consequently, ResponseMode.File.
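
A minimal sketch of the failure mode, written in Python only for illustration (it is not the module's F# code). It assumes the decoder substitutes U+FFFD for invalid byte sequences, which is how the default .NET UTF-8 decoding behaves as far as I know. The helper name `utf8_round_trip` and the sample inputs are hypothetical.

```python
def utf8_round_trip(raw: bytes) -> bytes:
    """Decode bytes as UTF-8 (invalid sequences become U+FFFD), then re-encode.

    This mimics a "read as text, then convert the text back to UTF-8 bytes" pipeline.
    """
    text = raw.decode("utf-8", errors="replace")
    return text.encode("utf-8")


# Hypothetical sample inputs: plain ASCII, Latin-1 text, and the 8-byte PNG signature.
ascii_text = b"body { color: red; }"
latin1_text = "caf\u00e9".encode("latin-1")
png_header = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

for name, raw in [("ascii", ascii_text), ("latin-1", latin1_text), ("png", png_header)]:
    result = utf8_round_trip(raw)
    status = "unchanged" if result == raw else "corrupted"
    print(f"{name}: {len(raw)} bytes in, {len(result)} bytes out, {status}")
```

Only the ASCII input survives the round trip. Each invalid byte is replaced by the three-byte UTF-8 encoding of U+FFFD, so the other outputs differ from their inputs in both content and length.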

Owner

Thorium commented Feb 16, 2023

Good question. I've not had issues with it, but then again I'm skipping most binary files by file extension, because binary images, binary fonts, etc. are already compressed and don't benefit much from zip.

And I guess the potential issue applies only to cases where the file is read directly from disk, not to cases where this is used in a pipeline with selfhost or another file server?

Is this an actual issue, and if it is, do you have any ideas for improvement?

Author

apdevelop commented Feb 17, 2023

> And I guess the potential issue applies only cases where the file is directly read from disk

Yes, it applies to the .MapCompressionModule() -> getFile() call chain. I agree that already-compressed binary files are not meant to be compressed again, but DefaultCompressionSettings.AllowedExtensionAndMimeTypes, for example, contains the .ttf and .eot extensions, which are binary files, if I remember correctly.

It can be improved by reading the file in binary mode as a byte array in one go, for example with FileStream.ReadAsync (taking into account some performance issues).
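
A rough sketch of the idea, again in Python purely for illustration: the file is read as raw bytes and never passes through a text decoder, so whatever is on disk is what reaches the compressor. The names `read_file_bytes` and `demo` and the chunk size are hypothetical, and a real fix in the module would use the .NET stream APIs instead.

```python
import os
import tempfile


def read_file_bytes(path: str, chunk_size: int = 64 * 1024) -> bytes:
    """Read a file in binary mode, chunk by chunk, with no text decoding."""
    chunks = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty result means end of file
                break
            chunks.append(chunk)
    return b"".join(chunks)


def demo() -> bool:
    """Write every possible byte value to a temp file and read it back."""
    payload = bytes(range(256)) * 4
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        return read_file_bytes(path) == payload
    finally:
        os.remove(path)


print(demo())
```

Reading in chunks rather than in a single call keeps the per-read buffer bounded, which may help with the performance concern for large files, although the whole content still ends up in memory here.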
