Tagging errors (not in original Hunpos) #29
I used the old v1.0 binary until it stopped working on macOS, so I rebuilt it from source. I then noticed that in "looked up the date", "date" is now tagged as a verb. I compared the new output against the cached output I had kept from the previous version:
diff --git a/2.txt b/1.txt
index 573cf80..5c3aa7e 100644
--- a/2.txt
+++ b/1.txt
@@ -1,19 +1,20 @@
-( NNS
-remember VBP
+( VBZ
+remember VB
= SYM
-I PRP
+I NNP
looked VBD
-up RP
-the DT
-date NN
-in IN
+up IN
+the VBP
+date VB
+in RP
the DT
logs NNS
-and CC
-checked VBD
+and NNP
+checked VBN
which WDT
-comic NN
+comic JJ
I PRP
-referred VBD
-to TO
+referred VBN
+to JJ
) VB
+
"the date" -- "VBP VB" -- is that correct? Maybe I'm supposed to take some updated model file from somewhere? |
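For anyone who wants to repeat this comparison on their own text, here is a minimal Python sketch that lists only the tokens whose tags changed between two tagger outputs. It assumes one `token tag` pair per line, as in the diff above; the function names `parse_tagged` and `tag_disagreements` are made up for this example and are not part of hunpos. The sample strings are a slice of the diff above.

```python
from typing import List, Tuple


def parse_tagged(text: str) -> List[Tuple[str, str]]:
    """Parse 'token tag' lines; lines without exactly two fields are skipped."""
    pairs = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            pairs.append((parts[0], parts[1]))
    return pairs


def tag_disagreements(old: str, new: str) -> List[Tuple[str, str, str]]:
    """Return (token, old_tag, new_tag) for every token tagged differently."""
    old_pairs = parse_tagged(old)
    new_pairs = parse_tagged(new)
    # Both runs must have tagged the same token sequence to be comparable.
    if [tok for tok, _ in old_pairs] != [tok for tok, _ in new_pairs]:
        raise ValueError("token sequences differ; outputs are not comparable")
    return [
        (tok, old_tag, new_tag)
        for (tok, old_tag), (_, new_tag) in zip(old_pairs, new_pairs)
        if old_tag != new_tag
    ]


# Slice of the two outputs shown in the diff above (old binary vs. rebuilt one).
old_output = "looked VBD\nup RP\nthe DT\ndate NN\nin IN\n"
new_output = "looked VBD\nup IN\nthe VBP\ndate VB\nin RP\n"

for token, old_tag, new_tag in tag_disagreements(old_output, new_output):
    print(f"{token}: {old_tag} -> {new_tag}")
```

Unlike a plain `diff`, this keeps each token on one line with both tags, which makes it easier to see which tags get confused.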
Hi all! Do you know the exact source code corresponding to the old binary? |
I have no idea about the source code of the old binary; I just downloaded it from https://code.google.com/archive/p/hunpos/downloads (md5 of en_wsj.model: f666dc61f7cbf3cc69366010a4e1f29f). Maybe the upload date has some relation to the code version. The new one compiled without any issue following the instructions, after |
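To confirm that two people are testing with the same model file, the checksum quoted above can be checked with a small Python sketch like the one below. The helper names `md5_of_file` and `model_matches` are illustrative only, and the path `en_wsj.model` in the usage note is an assumption about where the file was saved.

```python
import hashlib

# md5 of en_wsj.model as quoted in the comment above.
EXPECTED_MD5 = "f666dc61f7cbf3cc69366010a4e1f29f"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def model_matches(path: str, expected: str = EXPECTED_MD5) -> bool:
    """True if the file at `path` has the expected md5 checksum."""
    return md5_of_file(path) == expected.lower()
```

Usage would be something like `model_matches("en_wsj.model")`; a `False` result means the model file differs from the one used in this thread.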
I am able to reproduce the issue with that model. |
Indeed, compiling the current source code with an older OCaml version "solved" the issue (it works up to 3.12.1 and breaks starting from 4.00.0). I guess that #21 was only partially addressed and further investigation is needed. I do not know when I will have time to investigate. In the meantime you can either retrain the model or compile with OCaml 3.12.1. Obviously, anyone willing to investigate and solve the issue is welcome. :-) |
Oh, cool. I just wonder how to install a specific version of OCaml on macOS. I just used |
Yes, it is called opam. From https://opam.ocaml.org/doc/Install.html I can see:
Once you have opam installed you can install specific versions following instructions at https://ocaml.org/docs/install.html. On a clean setup it should be something like:
|
Hmm, v4 is installed by default; the latest v3 is 3.12.1, and:
|
You are right, I tested 3.12.1 and not 3.12.2. I changed my comments above to reflect this. As for the error you are experiencing, maybe you can open an issue with either opam or OCaml. I guess it should be possible to compile by setting some C compiler flags so that it does not fail due to this issue. Probably removing |
This version of hunpos behaves differently than the original compiled binary:
Original version (downloaded from https://code.google.com/archive/p/hunpos/downloads):
The original version gives the correct output: "och" is the most common Swedish conjunction (KN) and not a foreign word (UO). The language model is available from here: https://github.com/spraakbanken/sparv-models/raw/master/hunpos/suc3_suc-tags_default-setting_utf8.model (beware, the model is 14MB)
I compiled it on both macOS Catalina and Devuan Linux, and it behaves the same on both platforms (i.e., it gives the wrong POS tag for "och").
Note: there are problems with at least the following common Swedish conjunctions:
UO instead of KN
HA instead of KN
PL instead of SN (subjunction)