mistakes in voicing assimilation #1

Open
jakubbortlik opened this issue Apr 12, 2021 · 3 comments

Comments

@jakubbortlik

Hi Lukáš,
just in case some non-Czech people read this I'm writing in English.
I came across your script and found some bugs. I don't know how to create a pull request here, so I forked your repo and made some changes in the branch "voicing_fix": https://github.com/jakubbortlik/phonetic-transcription/tree/voicing_fix. You might want to include the fix in your own code.
Apart from the mistakes in Czech, I also found some issues in the Polish transcription. There are some mistakes in the voicing assimilation of "rz": just write a test for "przemek", which should be "p ʂ ɛ m ɛ k", not "b ʐ ɛ m ɛ k". Also, some place assimilation is not correct, e.g., "napęd" is not "n a p ɛ ŋ t" but rather "n a p ɛ n̪ t", i.e., the "n" is not velarized but dentalized. In fact, the whole n_nasals list should be split into at least two different lists according to place of articulation. Check out our overview of Polish pronunciation at http://polfon.upol.cz/ :).
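
To make the suggestion about splitting the nasals concrete, here is a minimal Python sketch of choosing the nasal by the place of articulation of the following consonant. All names here (`NASAL_BY_PLACE`, `nasal_before`, `expand_nasal_vowel`) and the contents of the classes are hypothetical and are not taken from phon_polish.py; the classes are deliberately incomplete and only illustrate the idea.

```python
DENTAL_N = "n\u032a"  # "n" plus the combining bridge below, i.e. a dental nasal

# Hypothetical place classes: each nasal maps to the consonants that trigger it.
# Illustrative subset only; not the lists used in phon_polish.py.
NASAL_BY_PLACE = {
    "m": {"p", "b"},        # labial stops
    DENTAL_N: {"t", "d"},   # dental stops
    "ŋ": {"k", "\u0261"},   # velar stops (\u0261 is the IPA "g")
}


def nasal_before(next_phone):
    """Return the nasal that agrees in place with next_phone, or None."""
    for nasal, triggers in NASAL_BY_PLACE.items():
        if next_phone in triggers:
            return nasal
    return None


def expand_nasal_vowel(letter, next_phone):
    """Expand "ę" or "ą" into a list of phones, given the following phone.

    Before a listed stop: oral vowel plus a nasal of the same place.
    Otherwise (assumption for this sketch): a single nasalized vowel.
    """
    oral = {"ę": "ɛ", "ą": "ɔ"}[letter]
    nasal = nasal_before(next_phone)
    if nasal is None:
        return [oral + "\u0303"]  # combining tilde marks nasalization
    return [oral, nasal]
```

With these assumptions, `expand_nasal_vowel("ę", "t")` yields the dental nasal needed for "napęd", while a following "k" still yields the velar nasal.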

@lukyjanek
Owner

Hi Jakub,

Thank you very much for your comments! I apologize for the late response; I had not checked the project for a long time.

I agree with you and accept the corrections you proposed. I fixed the bugs in the Czech transcription of "ch" and "x" and the assimilation of "rz" in Polish.

Your project on Polish pronunciation is great! Thank you for this reference. :-)

Unfortunately, the correction of place assimilation of Polish "ą" and "ę" remains open. I've studied the project you referred to, but to me the situation looks like catching exceptions to exceptions to exceptions... The solution currently implemented in phon_polish.py is taken from "Mówię po polsku" by Zofia Tarajło-Lipowska (2008). If you have a better ("more correct/precise") implementation, feel free to contact me. I am ready to work on improvements to these scripts.

@pkruczynski

Final devoicing is correctly applied to the word Bóg (God) on its own:
echo -e "Bóg" | python3 phon_polish.py
b u k 👍

In the context of a whole sentence, the end of the word Bóg is missed, and with it the final devoicing:
echo -e "Bóg jest źródłem prawdziwej miłości, wiary i życia" | python3 phon_polish.py
b u ɡ 👎 j ɛ z d ʑ r u d w ɛ m p r a v d͡ʑ i v ɛ j m i w ɔ ɕ t͡ɕ | v i a r ɨ ʔ i ʐ ɨ t͡ɕ a

I found a real bug ( ͡° ͜ʖ ͡°)
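
The expected behaviour can be sketched in Python as a per-word rule, so that devoicing no longer depends on the word being last in the input. This is only an illustration under stated assumptions: the names `VOICED_TO_VOICELESS` and `apply_final_devoicing` are hypothetical, the table is a small subset, and the input format (a list of words, each a list of phones) is not necessarily what phon_polish.py uses internally. It also assumes that a word-final obstruent stays voiced only when the next word starts with a voiced obstruent.

```python
# Voiced obstruents and their voiceless counterparts (illustrative subset).
VOICED_TO_VOICELESS = {
    "b": "p", "d": "t", "\u0261": "k",  # \u0261 is the IPA "g"
    "v": "f", "z": "s", "ʐ": "ʂ", "ʑ": "ɕ",
}


def apply_final_devoicing(words):
    """Devoice each word-final obstruent unless a voiced obstruent follows.

    `words` is a list of words; each word is a list of phone strings.
    A new list is returned and the input is left unchanged.
    """
    result = []
    for i, word in enumerate(words):
        word = list(word)  # copy so the caller's data is not modified
        has_next = i + 1 < len(words) and len(words[i + 1]) > 0
        next_initial = words[i + 1][0] if has_next else None
        if (
            word
            and word[-1] in VOICED_TO_VOICELESS
            and next_initial not in VOICED_TO_VOICELESS
        ):
            word[-1] = VOICED_TO_VOICELESS[word[-1]]
        result.append(word)
    return result
```

Under these assumptions, "Bóg" comes out as "b u k" both on its own and before "jest", because "j" is not a voiced obstruent.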

@adam-sierakowski

Hello!

I am new to GitHub and I don't know if this is the right place to post this comment.

Just wanted to say that even though those issues are still around, your code was very useful to me. Even though I had to manually correct all the output, it saved me a ton of work. It was much faster than doing the entire transcription from scratch, and for data protection reasons I couldn't use any of the online tools (but even if I had, I would have had to proofread their output too).

I would recommend your code to anyone in my situation who is trained in Polish phonology and has the skill and time to proofread the output transcriptions.

4 participants