Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing character 槑 (#584) #570

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

eit
Copy link

@eit eit commented Jan 12, 2025

Summary by Bito

Added support for Chinese character 楑 (mei2) to the character database by updating BPMFBase.txt with BPMF phonetic notation and phrase.occ with occurrence data. Enhancement aligns with Ministry of Education's Dictionary of Chinese Character Variants standards.

Unit tests added: False

Estimated effort to review (1-5, lower is better): 1

@xatier
Copy link
Contributor

xatier commented Jan 15, 2025

You would need to add this character into Source/Data/phrase.occ. Also, you must put the newly added character to the end of the list, i.e., after 攗 ㄇㄟˊ mei2 ao6 big5.

Ref:
https://github.com/openvanilla/McBopomofo/wiki/%E8%A9%9E%E5%BA%AB%E9%96%8B%E7%99%BC%E8%AA%AA%E6%98%8E
4aac3a2

@eit
Copy link
Author

eit commented Jan 19, 2025

I just finished updating phrase.occ and BPMFMappings.txt. Please check it out.

@zonble zonble self-assigned this Jan 20, 2025
@xatier
Copy link
Contributor

xatier commented Jan 20, 2025

You also need to update Source/Data/BPMFBase.txt, to put the newly added character to the bottom of the list. See my previous comment.

@eit eit changed the title add missing phonetics for 槑 add missing charactor and phonetics for 槑 Jan 22, 2025
@eit eit changed the title add missing charactor and phonetics for 槑 add missing character and phonetics for 槑 Jan 22, 2025
@eit
Copy link
Author

eit commented Jan 22, 2025

I have added the character to the BPMFBase.txt in my first commit. Should I move that to the bottom of the list?

@ChiahongHong
Copy link
Contributor

ChiahongHong commented Jan 22, 2025

Yes. Let me explain a bit. As mentioned above, please move the row after , because McBopomofo has a 傳統注音 mode, similar to ㄅ半 (舊注音). The candidate characters are fixed in the order of BPMFBase.txt, and users who are accustomed to this mode usually memorize the selection numbers rather than looking at the candidates in real-time. If you change the order, it will affect others' expectations of the characters that appear.

攗 ㄇㄟˊ mei2 ao6 big5
槑 ㄇㄟˊ mei2 ao6 big5

Also, I’m curious about what 吉甘槑喆 is. After searching, I found out it’s the name of a restaurant. I’m just curious why you would want to include it in the phrase list?

+ 吉甘槑喆 ㄐㄧˊ ㄍㄢ ㄇㄟˊ ㄓㄜˊ 

@eit
Copy link
Author

eit commented Jan 30, 2025

About the order in BPMFBase.txt => got it.
About 吉甘槑喆, I also just searched 槑 in the search engine and found this term. I thought it was a specific term like 鼎泰豐, so I added it to the list of phrases. What do you think?

@ChiahongHong
Copy link
Contributor

I don't have any particular preference, but I'm wondering if this is a well-known restaurant like 鼎泰豐 🤣

Of course, the final decision should respect the core maintainers' review.

@lukhnos
Copy link
Contributor

lukhnos commented Jan 31, 2025

增加缺字條目的順序有問題,這個 ChiahongHong 已經提出,就不贅述。

我的意見:

  1. 雖然已經發了 PR,還是請你開個 issue,說明缺字,以及這個字讀音的來源(請提供教育部字辭典的 URL)。
  2. 如果只是為了讓缺字有個詞條,而詞條並不是詞典收錄條目(例如商家名),請刪除該條目。我們一直以來對收錄缺字的態度比較寬鬆。缺詞的話有討論空間,但這個 PR 目前想加的條目跟「鼎泰豐」在中文裡的意義與地位真的不能相提並論。

@eit
Copy link
Author

eit commented Feb 1, 2025

Understand. I will finish that later.

@eit eit reopened this Feb 3, 2025
@eit eit changed the title add missing character and phonetics for 槑 Add missing character 槑 (#584) Feb 3, 2025
Copy link

bito-code-review bot commented Feb 3, 2025

Code Review Agent Run #36d69b

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: 410fe2f..410fe2f
    • Source/Data/BPMFBase.txt
    • Source/Data/phrase.occ
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Copy link

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - Character Database Enhancement

BPMFBase.txt - Added missing character 槑 with BPMF phonetic notation

phrase.occ - Added occurrence entry for character 槑

Copy link
Contributor

@xatier xatier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants