Currently I have a project which requires some file analysis. For unit testing we use PHPUnit combined with vfsStream, which works out pretty well most of the time.
However, I encountered a problem today when trying to determine the file type or the encoding of a file in the file system.
Here is some insight on what I tried so far:
First I ran into trouble using mb_detect_encoding: it failed to give correct answers on a real file system, which started off the investigation, as some processing further down the stream broke because of it.
I tried using "brute force" on a Linux system: exec with the 'file' command. While this works well on real file systems, of course it does not work on vfs, causing almost all unit tests for that module to fail.
Resorted to using the fileinfo functions provided by the extension. However, while this one copes well with vfs://my/fileto/test.iso8859.txt, it reports us-ascii although the file is something different (e.g. iso-8859 or utf-8).
Currently I do not have any further ideas, and I think this is not the expected behavior of vfsStream.
Doing a couple of very simple tests of this, it appears to work. Granted, these are very small sample strings, and it might be possible that longer strings create an issue due to lack of support for multibyte strings. However, since you mentioned mb_detect_encoding() failed, it almost sounds like the file content may not be entirely in the encoding you think it is.
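One way to check whether the content really is in the encoding you expect, as a minimal sketch: test the raw bytes against each candidate encoding with mb_check_encoding(). The helper name validEncodings and the candidate list are assumptions made for illustration; they are not part of vfsStream or of the original test setup.

```php
<?php
// Hypothetical helper (not part of vfsStream or mbstring): returns every
// candidate encoding under which the raw bytes form a valid string.
function validEncodings(string $bytes, array $candidates): array
{
    $valid = [];
    foreach ($candidates as $encoding) {
        // mb_check_encoding() validates the byte sequence; it does not guess.
        if (mb_check_encoding($bytes, $encoding)) {
            $valid[] = $encoding;
        }
    }
    return $valid;
}
```

Because vfsStream is a stream wrapper, something like `validEncodings(file_get_contents('vfs://my/fileto/test.iso8859.txt'), ['ASCII', 'UTF-8', 'ISO-8859-1'])` should behave the same as on a real file. Note that content consisting only of 7-bit characters is valid in all three candidates, which would be consistent with a detector reporting us-ascii for it.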
OK, I rechecked and I think I misunderstood the internals of finfo and most likely also the mb_detect stuff: the test files we use mostly rely on the BOM at the beginning of the file (which is the reason for the problems further down the stream). We got those files as "examples" and included them in our tests (positive / negative). But we also have a test that just tries to use a wrongly encoded file (converted to UTF-32, which is not valid). I will do some further checking to find the details and a way to reproduce the errors. As I am not allowed to use the example files in public, I will have to set up some real synthetic ones, which is not too bad for testing.
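Since the test files rely on the BOM, a small check on the leading bytes may be enough for them. The following is only a sketch under that assumption; detectBom is a hypothetical helper name, and it recognizes only the standard UTF-8, UTF-16, and UTF-32 byte-order marks.

```php
<?php
// Hypothetical helper: maps a leading byte-order mark to an encoding name,
// or returns null when the data does not start with a known BOM.
function detectBom(string $bytes): ?string
{
    // The UTF-32 marks are checked before the UTF-16 ones, because the
    // UTF-32LE BOM begins with the same two bytes as the UTF-16LE BOM.
    $boms = [
        'UTF-32BE' => "\x00\x00\xFE\xFF",
        'UTF-32LE' => "\xFF\xFE\x00\x00",
        'UTF-8'    => "\xEF\xBB\xBF",
        'UTF-16BE' => "\xFE\xFF",
        'UTF-16LE' => "\xFF\xFE",
    ];
    foreach ($boms as $encoding => $bom) {
        // strncmp() is binary-safe, so NUL bytes in the BOM are compared too.
        if (strncmp($bytes, $bom, strlen($bom)) === 0) {
            return $encoding;
        }
    }
    return null;
}
```

Synthetic test files can then be built by prepending the relevant BOM to a short string and writing the result into the virtual file system, which avoids publishing the original example files.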